luke.bace (66)
#1
Please post all errata here. Thanks.
andaluri (4)
#2
Re: Errors and Corrections
On Page 49, Section 3.2.2, I think there is a typo in the text shown below:

"The size field defines the size of the sub-buffer. The following code creates a sub-buffer containing 40 ints from a buffer object containing 100 ints. The start of the sub-buffer data is the 50th float in the main buffer.

cl_buffer_region region;
region.size = 40*sizeof(float);
region.origin = 50*sizeof(float);
"

I assume the "ints" (shown in bold) should have been "floats", shouldn't they? The sample code does use the size of a float, so it's just the text that could be wrong. Comments?
matthew.scarpino (9)
#3
Re: Errors and Corrections
Hello,

You're quite right. The 'ints' should be 'floats.' I'll get that corrected immediately.

Thank you!
Matt
andaluri (4)
#4
Re: Errors and Corrections
The Linux source code listings were packaged with Windows line endings, at least in the first upload of the source code. I just want to share this with fellow readers: you may need to use dos2unix to convert the line endings for *nix.
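
In case dos2unix isn't at hand, here is a minimal sketch of the same conversion in C. It is not how dos2unix itself is implemented; the function name crlf_to_lf is my own, and it only handles a NUL-terminated buffer already in memory.

```c
#include <assert.h>
#include <stddef.h>

/* Drop every '\r' that is immediately followed by '\n', in place.
   A lone '\r' is left untouched. Returns the new string length. */
size_t crlf_to_lf(char *text) {
    assert(text != NULL);
    size_t read = 0, write = 0;
    while (text[read] != '\0') {
        /* text[read + 1] is safe to inspect: at worst it is the terminator */
        if (text[read] == '\r' && text[read + 1] == '\n') {
            read++;  /* skip the carriage return, keep the newline */
            continue;
        }
        text[write++] = text[read++];
    }
    text[write] = '\0';
    return write;
}
```

Calling crlf_to_lf on a buffer holding "int x;\r\nint y;\r\n" should leave "int x;\nint y;\n" behind.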
matthew.scarpino (9)
#5
Re: Errors and Corrections
Hello Andaluri,

Thank you for mentioning that. I'll make sure that every future GNU source release has UNIX-style line endings.

Thanks again,
Matt
andaluri (4)
#6
Re: Errors and Corrections
On Page 174, the text reads "For example, if a main buffer contains 200 floats and you want to create a sub-buffer containing floats 70-99, you could use the following function:"

Shouldn't the text in bold above, i.e. "70-99", read "70-89", since the code right below the text creates a 20-float region?
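
To double-check the arithmetic, here is a small sketch that converts an inclusive range of float indices into a byte origin and size, and back. The struct region_sketch is a local stand-in that mirrors the origin and size fields of cl_buffer_region so the code builds without the OpenCL headers; the helper names are my own.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-in for cl_buffer_region: both fields are in bytes. */
typedef struct {
    size_t origin;
    size_t size;
} region_sketch;

/* Region covering floats first..last (inclusive) of the main buffer. */
region_sketch float_region(size_t first, size_t last) {
    assert(first <= last);
    region_sketch r;
    r.origin = first * sizeof(float);
    r.size = (last - first + 1) * sizeof(float);
    return r;
}

/* Index of the last float covered by a region. */
size_t last_float_index(region_sketch r) {
    assert(r.size >= sizeof(float));
    return (r.origin + r.size) / sizeof(float) - 1;
}
```

With an origin of 70 floats and a size of 20 floats, last_float_index gives 89, whereas a 70-99 range would need a size of 30 floats.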
matthew.scarpino (9)
#7
Re: Errors and Corrections
Hello again andaluri,

You're absolutely right. "70-89" it is. Thank you for paying such close attention!

Best regards,
Matt
mato (44)
#8
Re: Errors and Corrections
Thanks! That correction will be in the next MEAP release.

Maria Townsley
Developmental Editor
Manning Publications Co.
hazelnusse (2)
#9
Re: Errors and Corrections
On page 21, in the listing for platform_ext_test.c, it seems like:

err = clGetPlatformIDs(1, NULL, &num_platforms);

will always result in num_platforms being either 0 or 1, so the subsequent for loop is a bit redundant. Shouldn't the first argument be the maximum number of platforms you want to get, so something larger than 1?

This is also in the GNU source code.
hazelnusse (2)
#10
Re: Errors and Corrections
Page 65 has:

work_items_per_kernel = 4;

but the source code has:

work_units_per_kernel = 4; /* 4 work-units per kernel */

The source code should probably be changed to match the text to eliminate confusion.
jb4652 (9)
#11
Re: Errors and Corrections
On page 51, Section 3.3.2
"The data type of the fourth argument is cl_image_info."
I guess "fourth" should be "second".

Another code issue (ch3 buffer check):
Running it with both the Intel SDK and the Nvidia SDK, I get a CL_INVALID_VALUE error code after calling clCreateSubBuffer().
If I change the call from
sub_buffer = clCreateSubBuffer(main_buffer, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR, CL_BUFFER_CREATE_TYPE_REGION, &region, &err);
to
sub_buffer = clCreateSubBuffer(main_buffer, CL_MEM_READ_ONLY, CL_BUFFER_CREATE_TYPE_REGION, &region, &err);
it works with the Nvidia SDK. The Intel SDK still returns a CL_DEVICE_MEM_BASE_ADDR_ALIGN error code, but that is beyond the topic of this section.
The OpenCL 1.1 specification states the following on page 61:
CL_INVALID_VALUE if buffer was created with CL_MEM_WRITE_ONLY and flags specifies CL_MEM_READ_WRITE or CL_MEM_READ_ONLY, or if buffer was created with CL_MEM_READ_ONLY and flags specifies CL_MEM_READ_WRITE or CL_MEM_WRITE_ONLY, or if flags specifies CL_MEM_USE_HOST_PTR or CL_MEM_ALLOC_HOST_PTR or CL_MEM_COPY_HOST_PTR.

Thanks.
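
For reference, the rule quoted from the specification can be written as a small predicate. This is only a sketch of that one CL_INVALID_VALUE condition, not of everything clCreateSubBuffer checks. The SK_ constants are local stand-ins that I assume match the cl_mem_flags bit values in CL/cl.h, and sub_buffer_flags_ok is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-ins for the cl_mem_flags bits (assumed to match CL/cl.h). */
enum {
    SK_MEM_READ_WRITE     = 1 << 0,
    SK_MEM_WRITE_ONLY     = 1 << 1,
    SK_MEM_READ_ONLY      = 1 << 2,
    SK_MEM_USE_HOST_PTR   = 1 << 3,
    SK_MEM_ALLOC_HOST_PTR = 1 << 4,
    SK_MEM_COPY_HOST_PTR  = 1 << 5,
    SK_ALL_BITS           = (1 << 6) - 1
};

/* Returns false where the quoted rule says CL_INVALID_VALUE. */
bool sub_buffer_flags_ok(unsigned parent_flags, unsigned sub_flags) {
    /* the sketch only models the six bits listed above */
    assert(((parent_flags | sub_flags) & ~(unsigned)SK_ALL_BITS) == 0);

    /* no host-pointer flag is allowed on a sub-buffer */
    if (sub_flags & (SK_MEM_USE_HOST_PTR | SK_MEM_ALLOC_HOST_PTR |
                     SK_MEM_COPY_HOST_PTR))
        return false;
    /* a write-only parent cannot yield a readable sub-buffer */
    if ((parent_flags & SK_MEM_WRITE_ONLY) &&
        (sub_flags & (SK_MEM_READ_WRITE | SK_MEM_READ_ONLY)))
        return false;
    /* a read-only parent cannot yield a writable sub-buffer */
    if ((parent_flags & SK_MEM_READ_ONLY) &&
        (sub_flags & (SK_MEM_READ_WRITE | SK_MEM_WRITE_ONLY)))
        return false;
    return true;
}
```

Under this check, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR is rejected for any parent, while plain CL_MEM_READ_ONLY passes for a read-only or read-write parent.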
jb4652 (9)
#12
Re: Errors and Corrections
Windows version code, ch4_double_test.c
Lines 133-134:
clGetDeviceInfo(device, CL_DEVICE_EXTENSIONS, sizeof(ext_data), NULL, &ext_size);
should be
clGetDeviceInfo(device, CL_DEVICE_EXTENSIONS, 0, NULL, &ext_size);

This matches the call used in Listing 2.1; otherwise ext_size will be incorrect.
jb4652 (9)
#13
Re: Errors and Corrections
Page 84, Fig 4.5
uint4 vec = (vec4)(0x00010203, 0x04050607, 0x08090A0B, 0x0C0D0E0F);
should be consistent with Listing 4.4:
uint4 vec = (global uint4)(0x00010203, 0x04050607, 0x08090A0B, 0x0C0D0E0F);
and (vec4) is itself a syntax error.

The blue bar of Listing 4.4 on top of page 85 should be on page 84, above:
__kernel void vector_bytes(__global uchar16 *test) {
jb4652 (9)
#14
Re: Errors and Corrections
Code: Ch10_reduction: reduction.c

vector_sum_buffer = clCreateBuffer(context, CL_MEM_READ_WRITE | CL_MEM_COPY_HOST_PTR, num_groups * sizeof(float), vector_sum, &err);
should be
vector_sum_buffer = clCreateBuffer(context, CL_MEM_READ_WRITE | CL_MEM_COPY_HOST_PTR, num_groups / 4 * sizeof(float), vector_sum, &err);

However, the result is not affected.
phaethon (1)
#15
Re: Errors and Corrections
On page 218 (the pyOpenCL section), a line reads:

kernel.enqueue_nd_range_kernel(queue, kernel, (20,), (4,))

And should actually be:

cl.enqueue_nd_range_kernel(queue, kernel, (20,), (4,))
Worblehat (1)
#16
Re: Errors and Corrections
In Chapter 2's device_ext_test.c, I needed to change two things to make it work on my system:

In line 30, the num_entries parameter of clGetDeviceIDs must be 0. Otherwise I get a CL_INVALID_VALUE error and num_devices is 0. This does not happen with all platforms: my GeForce GPU is detected correctly, but when the platform is the AMD APP SDK with an Intel Core CPU, the error is returned.

I also get a segfault in lines 51/52. It should be sizeof(addr_data) instead of sizeof(ext_data).
jknapka (1)
#17
Re: Errors and Corrections
In the Safari Library edition, Listing 2.4 (Creating a program from a text file), the call to clCreateProgramWithSource() passes incorrect values for the third and fourth parameters (program_buffer and program_size). Both of those arguments should be preceded by the address operator "&", since the respective parameter types are "array of const char*" and "array of size_t". Unfortunately, because the third argument is cast to (const char**), only the type mismatch in the fourth argument provokes a warning from GCC, and the example segfaults at run time even if one fixes that warning. (I'm using the NVIDIA Linux CUDA 5.5 implementation of OpenCL on Ubuntu 13.10.)
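
To illustrate the point without needing an OpenCL installation, here is a minimal sketch. total_source_length is a hypothetical stand-in that shares only the (count, strings, lengths) parameter shape of clCreateProgramWithSource; it is not the real API, and single_source_length is likewise my own helper. It assumes the OpenCL convention that a NULL lengths array or a zero entry means the string is NUL-terminated.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in: sums the lengths of `count` source strings. */
size_t total_source_length(unsigned count, const char **strings,
                           const size_t *lengths) {
    assert(strings != NULL);
    size_t total = 0;
    for (unsigned i = 0; i < count; i++) {
        /* NULL lengths or a zero entry: treat the string as NUL-terminated */
        if (lengths != NULL && lengths[i] != 0)
            total += lengths[i];
        else
            total += strlen(strings[i]);
    }
    return total;
}

/* A single source string is passed as a one-element array,
   so take the address of each variable. */
size_t single_source_length(const char *program_buffer, size_t program_size) {
    return total_source_length(1, &program_buffer, &program_size);
}
```

Passing program_buffer and program_size directly would hand the callee a pointer to characters where it expects a pointer to pointers, and an integer where it expects an address.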
366182 (1)
#18
Global Mem Fence in Listing 11.1
I have difficulty understanding the global mem fence in Listing 11.1 on page 243, and I have come to believe that it is too strong. I bring this up here because the text below the figure again emphasizes the importance of understanding memory barriers exactly.
The barrier itself is clearly required: the code before the barrier has to finish executing on all work items before the work item with local id = 0 can accumulate the results.
However,
(a) the work item with id = 0 accumulates the results in global memory using atomic_add, which should provide memory synchronisation anyway, and
(b) no work item writes to global memory before the barrier.

So, why is the mem-fence on global memory required?
It would suggest some sort of synchronisation requirement between work-groups, but in view of the atomic_adds I don't see this either. (Also cf. section 7.4.3 on pages 163-164.)

I would very much appreciate an explanation (which should also be added to the book).