OpenCL reduction: how does this code work?


wl3
January 30th, 2012, 05:56 PM
I have been trying to write a reduction kernel to sum the contents of a very large array. I asked this question on Stack Overflow, but I still don't fully understand parts of the answer. For starters, what is meant by Grizzly's last suggestion? In the example below, a step reduction, what is meant by stride? Do I call this with a global size smaller than the number of items in the array, so that it reduces the array to a new array with as many items as the global work size?


__kernel void reduction_step(__global const unsigned long* A, __global unsigned long* C, uint size) {
    unsigned long sum = 0;
    for (int i = start; i < size; i += stride)
        sum += A[i];
    C[get_global_id(0)] = sum;
}
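
One way to read the kernel above, sketched as plain C that emulates the work-items on the CPU. The snippet as posted does not define start or stride, so this assumes start is get_global_id(0) and stride is get_global_size(0); the names gid, gsize and reduction_step_item are made up for the illustration. Under that assumption, each work-item sums every gsize-th element beginning at its own index, and a launch with gsize work-items turns an array of size items into gsize partial sums.

```c
#include <stddef.h>
#include <stdint.h>

/* CPU emulation of ONE work-item of the kernel.
 * gid   stands in for get_global_id(0)
 * gsize stands in for get_global_size(0)
 * Assumption (not in the posted snippet): start = gid, stride = gsize. */
static uint64_t reduction_step_item(const uint64_t *A, size_t size,
                                    size_t gid, size_t gsize)
{
    uint64_t sum = 0;
    /* Work-item gid visits A[gid], A[gid + gsize], A[gid + 2*gsize], ... */
    for (size_t i = gid; i < size; i += gsize)
        sum += A[i];
    return sum;
}

/* CPU emulation of one kernel launch with gsize work-items:
 * each work-item writes exactly one partial sum into C,
 * so C must have room for gsize elements. */
static void reduction_step(const uint64_t *A, uint64_t *C,
                           size_t size, size_t gsize)
{
    for (size_t gid = 0; gid < gsize; ++gid)
        C[gid] = reduction_step_item(A, size, gid, gsize);
}
```

With this reading, calling the step on 10 items with a global size of 4 leaves 4 partial sums in C, and a second step over those 4 values with a global size of 1 leaves the total in a single element. On a real device the same effect would come from enqueueing the kernel repeatedly with a shrinking global work size.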


http://stackoverflow.com/questions/8864786/opencl-reduction-examples-and-retaining-memory-objects-converting-cuda-code