-
March 11th, 2011, 06:10 PM
#1
performance: memory alignment vs. bit manipulation
We have a couple of brute-force algorithms that access memory heavily. The data the algorithms work on is held in three 2D arrays. Not all algorithms use all three arrays; most use only one of them. The data is never written, only read. x64 is not a consideration.
The arrays are at least a couple of megabytes. Everything irrelevant is omitted here except the types, so please do not tell me "use std containers"; concentrate on the issue at hand =).
The question is about performance: the faster the algorithms finish, the better.
Code:
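typedef unsigned char  u8;
typedef unsigned short u16;
typedef unsigned long  u32; // 32 bits wide on the 32-bit targets considered here

// HEIGHT and WIDTH are placeholders; the real dimensions are omitted in this post

// what is faster: three separate arrays
u16 sourceData1[HEIGHT][WIDTH];
u8  sourceData2[HEIGHT][WIDTH];
u8  sourceData3[HEIGHT][WIDTH];

// or all 3 arrays put together into one:
// bits 0-15 = data1, bits 16-23 = data2, bits 24-31 = data3
u32 sourceData[HEIGHT][WIDTH];

inline u16 getData1(int x, int y)
{
    return sourceData[y][x] & 0xFFFF;
}

inline u8 getData2(int x, int y)
{
    return (sourceData[y][x] >> 16) & 0xFF;
}

inline u8 getData3(int x, int y)
{
    return sourceData[y][x] >> 24;
}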
The argument for why the latter could be faster, even though an algorithm probably uses only one part of the data, is that it's aligned to 32 bits: the aligned memory access could save more time than the bit manipulation costs. Does anyone have experience with this, or a useful link? Anything helps.
Also: sourceData3 is actually a boolean. If I compress sourceData3 into a flag array, where each u32 contains 32 booleans, the array would be 8 times smaller in bytes, but bit manipulation is required to read it. Would this be a penalty or a benefit?
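A minimal sketch of the flag-array idea, for reference. Everything named here (FLAG_WIDTH, FLAG_HEIGHT, WORDS_PER_ROW, flagData, getFlag, setFlag) is hypothetical and not from the actual code base, and the dimensions are assumed. Each row stores 32 booleans per 32-bit word, so a read costs one shift and one mask on top of the load:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed dimensions, for illustration only.
const std::size_t FLAG_WIDTH    = 1024;
const std::size_t FLAG_HEIGHT   = 1024;
const std::size_t WORDS_PER_ROW = (FLAG_WIDTH + 31) / 32;  // round up to whole words

// 32 booleans per word; static storage is zero-initialized.
static std::uint32_t flagData[FLAG_HEIGHT][WORDS_PER_ROW];

// Read one flag: pick the word (x / 32), then the bit inside it (x % 32).
inline bool getFlag(std::size_t x, std::size_t y)
{
    assert(x < FLAG_WIDTH && y < FLAG_HEIGHT);  // compiled out when NDEBUG is defined
    return ((flagData[y][x >> 5] >> (x & 31)) & 1u) != 0;
}

// Only needed while building the table; the algorithms themselves only read.
inline void setFlag(std::size_t x, std::size_t y, bool value)
{
    assert(x < FLAG_WIDTH && y < FLAG_HEIGHT);
    const std::uint32_t mask = std::uint32_t(1) << (x & 31);
    if (value)
        flagData[y][x >> 5] |= mask;
    else
        flagData[y][x >> 5] &= ~mask;
}
```

With these assumed dimensions the packed table takes 128 KiB instead of the 1 MiB a u8-per-flag array would need. Whether that saving outweighs the extra shift and mask is exactly what would have to be measured.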
Thank you.
Last edited by Teabix; March 11th, 2011 at 06:25 PM.
- If you know what you want then you do not know yourself good enough.
-
March 11th, 2011, 10:13 PM
#2
Re: performance: memory alignment vs. bit manipulation
Originally Posted by Teabix
If I compress sourceData3 into a flag array, where each u32 contains 32 booleans, the array would be 8 times smaller in bytes, but bit manipulation is required to read it. Would this be a penalty or a benefit?
Why not just write a small app and test this?
Regards,
Paul McKenzie
Last edited by Paul McKenzie; March 11th, 2011 at 10:35 PM.
-
March 12th, 2011, 03:52 AM
#3
Re: performance: memory alignment vs. bit manipulation
Why not just write a small app and test this?
I can only test it on one machine. I thought there might be some article or hearsay along the lines of "it is generally presumed that, on Intel and AMD, all generations, A is faster than B".
If I posted a small test program here, would you guys be willing to compile and run it on your machines?
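Such a test could look roughly like the sketch below. It is only an assumption of what a fair comparison might be, not the real algorithms: the names (BENCH_W, BENCH_H, separateData2, packedData, fillTestData, sumSeparate, sumPacked, timeMs, runBenchmark) and the 2048 x 2048 size are made up, and a plain sequential sum stands in for the actual access pattern. It times reading the same 8-bit field from a separate u8 array and from the combined 32-bit array:

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <cstdio>

// Assumed size: 2048 x 2048 elements (4 MiB as u8, 16 MiB as 32-bit words).
const std::size_t BENCH_W = 2048;
const std::size_t BENCH_H = 2048;

static std::uint8_t  separateData2[BENCH_H][BENCH_W];  // layout A: one array per field
static std::uint32_t packedData[BENCH_H][BENCH_W];     // layout B: field in bits 16-23

// Fill both layouts with the same pseudo-random field values.
void fillTestData()
{
    std::uint32_t state = 12345u;
    for (std::size_t y = 0; y < BENCH_H; ++y)
        for (std::size_t x = 0; x < BENCH_W; ++x) {
            state = state * 1664525u + 1013904223u;  // simple linear congruential generator
            const std::uint8_t v = static_cast<std::uint8_t>(state >> 24);
            separateData2[y][x] = v;
            packedData[y][x] = (state & 0xFF00FFFFu) | (std::uint32_t(v) << 16);
        }
}

// Layout A: plain byte reads.
std::uint64_t sumSeparate()
{
    std::uint64_t sum = 0;
    for (std::size_t y = 0; y < BENCH_H; ++y)
        for (std::size_t x = 0; x < BENCH_W; ++x)
            sum += separateData2[y][x];
    return sum;
}

// Layout B: 32-bit reads plus shift and mask.
std::uint64_t sumPacked()
{
    std::uint64_t sum = 0;
    for (std::size_t y = 0; y < BENCH_H; ++y)
        for (std::size_t x = 0; x < BENCH_W; ++x)
            sum += (packedData[y][x] >> 16) & 0xFF;
    return sum;
}

// Run fn once, store its result, and return the elapsed wall time in milliseconds.
template <typename Fn>
double timeMs(Fn fn, std::uint64_t& result)
{
    const auto start = std::chrono::steady_clock::now();
    result = fn();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Prints one timing per layout; the checksums keep the loops from being optimized away.
void runBenchmark()
{
    fillTestData();
    std::uint64_t a = 0, b = 0;
    const double msA = timeMs(sumSeparate, a);
    const double msB = timeMs(sumPacked, b);
    assert(a == b);  // both layouts must yield the same data
    std::printf("separate u8: %.3f ms (checksum %llu)\n", msA, (unsigned long long)a);
    std::printf("packed u32 : %.3f ms (checksum %llu)\n", msB, (unsigned long long)b);
}
```

Calling runBenchmark() from main in an optimized build, several times per machine, would give comparable numbers. A single pass like this is noisy, and the result only carries over to the real algorithms as far as their access pattern resembles a sequential scan.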
-
March 12th, 2011, 12:02 PM
#4
Re: performance: memory alignment vs. bit manipulation
Originally Posted by Teabix
If I posted a small test program here, would you guys be willing to compile and run it on your machines?
Sure, no problem.
Regards,
Paul McKenzie
-
March 12th, 2011, 04:11 PM
#5
Re: performance: memory alignment vs. bit manipulation
I agree; a compilable source that obviously does no harm is most likely to be run by a lot of members.