January 30th, 2012, 06:56 PM
#1
OpenCL reduction: how does this code work?
I have been trying to write a reduction kernel to sum the contents of a very large array. I asked this question on Stack Overflow, but I still don't fully understand parts of the answer. For starters, what is meant by Grizzly's last suggestion? In the example below, a step reduction, what is meant by stride? Do I call this with a global size smaller than the number of items in the array, so that it reduces the array to a new array with as many items as the global work size?
Code:
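__kernel void reduction_step(__global const unsigned long* A, __global unsigned long* C, uint size) {
    // Assumption: start and stride are not defined in the snippet as posted;
    // they are taken here to be the work-item's global ID and the global work size.
    uint start = (uint)get_global_id(0);
    uint stride = (uint)get_global_size(0);
    unsigned long sum = 0;
    // Each work-item sums every stride-th element, beginning at its own index.
    for (uint i = start; i < size; i += stride)
        sum += A[i];
    C[get_global_id(0)] = sum;
}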
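To check my own reading of the kernel above, here is a rough host-side sketch in plain C that emulates one launch. It rests on an assumption that is not in the posted snippet: that start is get_global_id(0) and stride is get_global_size(0). The names reduction_step_emulated and reduce_two_pass are made up for illustration, and the outer loop only stands in for the work-items, so this is not how a device would actually schedule them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Emulates one launch of reduction_step on the host.
   ASSUMPTION: start = get_global_id(0), stride = get_global_size(0). */
void reduction_step_emulated(const uint64_t *A, uint64_t *C,
                             size_t size, size_t global_size)
{
    assert(global_size > 0);
    /* Each outer iteration plays the role of one work-item. */
    for (size_t gid = 0; gid < global_size; ++gid) {
        uint64_t sum = 0;
        /* Work-item gid reads A[gid], A[gid + G], A[gid + 2G], ... */
        for (size_t i = gid; i < size; i += global_size)
            sum += A[i];
        C[gid] = sum; /* one partial sum per work-item */
    }
}

/* Hypothetical two-pass driver: the first pass leaves global_size partial
   sums in scratch, the second pass folds them with a single work-item. */
uint64_t reduce_two_pass(const uint64_t *A, size_t size,
                         uint64_t *scratch, size_t global_size)
{
    uint64_t total = 0;
    reduction_step_emulated(A, scratch, size, global_size);
    reduction_step_emulated(scratch, &total, global_size, 1);
    return total;
}
```

If that reading is right, calling it with A = {1, ..., 10} and a global size of 4 would leave C = {15, 18, 10, 12}, that is, an output array with one entry per work-item, and a second step over those four values gives 55.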
http://stackoverflow.com/questions/8...ting-cuda-code