The downloadable file here is a tutorial on writing OpenCL programs using C.
The compiler used is clang and the OpenCL environment is pocl. The pocl environment as downloaded from portablecl.org targets CPU cores rather than GPUs. CPU usage is considered with respect to Intel-based Mac Pro computers.
The tutorial starts by processing multiple columns of data with a one-dimensional OpenCL program. It then considers how pocl maps kernel work-items and work-groups to CPU cores, together with the limitation that follows from this mapping. Modification of a kernel to exploit hardware elements is shown. Two-dimensional OpenCL kernels are then explored, and the information made available to the kernel by OpenCL functions is developed. Those developments are applied first to standard matrix multiplication and then to matrix multiplication using tiles.
Listings of the kernels and the host programs are given. The kernels are written as strings embedded in the host program code.
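As an illustration of the string format, the following is a minimal sketch and not a listing from the tutorial. The kernel name scale, the variable scale_kernel_src and the helper kernel_src_length are hypothetical names chosen for this example. It shows only how a kernel might be held as a C string in the host source, and it needs no OpenCL headers to compile.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical kernel, not taken from the tutorial: it multiplies each
   element of a one-dimensional buffer by a constant factor.
   Adjacent string literals are concatenated by the C compiler, so the
   kernel can be laid out one source line per literal. */
static const char *scale_kernel_src =
    "__kernel void scale(__global float *data, const float factor)\n"
    "{\n"
    "    size_t i = get_global_id(0); /* one work-item per element */\n"
    "    data[i] = data[i] * factor;\n"
    "}\n";

/* Length in bytes of the kernel source, excluding the terminating NUL. */
static size_t kernel_src_length(const char *src)
{
    return strlen(src);
}
```

In a host program, a string of this kind and its length would typically be passed to clCreateProgramWithSource and the resulting program built with clBuildProgram. Holding the kernel as a string keeps the kernel and the host code in a single source file, at the cost of writing the newline and quote characters by hand.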
Revised document -- August 2023