March 11th, 2011, 06:10 PM
#1
performance: memory alignment vs. bit manipulation
We have a couple of brute-force algorithms that access memory very heavily. The data they work on is held in three 2D arrays. Not all algorithms use all three arrays; most use only one of them. The data is only ever read, never written. x64 is not a consideration.
The arrays are at least a couple of megabytes. Everything irrelevant is omitted here except the types, so please do not tell me "use std containers"; concentrate on the issue at hand =).
The question is about performance: the faster the algorithms finish, the better.
Code:
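typedef unsigned char u8;
typedef unsigned short u16;
typedef unsigned long u32; // 32 bits on our 32-bit target
// Placeholder dimensions; the real sizes are omitted here.
const int WIDTH = 1024;
const int HEIGHT = 1024;
// What is faster: three separate arrays...
u16 sourceData1[HEIGHT][WIDTH];
u8 sourceData2[HEIGHT][WIDTH];
u8 sourceData3[HEIGHT][WIDTH];
// ...or all three arrays put together into one?
u32 sourceData[HEIGHT][WIDTH];
// Bits 0-15 hold the former sourceData1 value.
inline u16 getData1(int x, int y)
{
    return sourceData[y][x] & 0xFFFF;
}
// Bits 16-23 hold the former sourceData2 value.
inline u8 getData2(int x, int y)
{
    return (sourceData[y][x] >> 16) & 0xFF;
}
// Bits 24-31 hold the former sourceData3 value.
inline u8 getData3(int x, int y)
{
    return (sourceData[y][x] >> 24) & 0xFF;
}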
The argument for why the latter could be faster, even though an algorithm probably uses only one part of the data, is that every element is aligned to 32 bits, so the aligned memory access might save more time than the bit manipulation costs. Does anyone have experience with this, or a useful link? Anything helps.
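One way to settle this is to measure both layouts on the real target. The following is only a minimal sketch under assumptions: the flat `std::vector` buffers stand in for the real 2D arrays, and `pack`, `sumSeparate`, `sumPacked` and `timeMicros` are hypothetical helper names, not part of the actual code. It runs the same read-only pass (summing the 16-bit field) over a separate `u16` buffer and over the packed 32-bit buffer.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <vector>

// Pack three fields into one 32-bit word, using the same bit positions
// as getData1/getData2/getData3 (bits 0-15, 16-23, 24-31).
inline std::uint32_t pack(std::uint16_t d1, std::uint8_t d2, std::uint8_t d3) {
    return std::uint32_t(d1) | (std::uint32_t(d2) << 16) | (std::uint32_t(d3) << 24);
}

// Read-only pass over the separate layout: touches 2 bytes per element.
std::uint64_t sumSeparate(const std::vector<std::uint16_t>& data1) {
    std::uint64_t sum = 0;
    for (std::uint16_t v : data1) sum += v;
    return sum;
}

// Same pass over the packed layout: touches 4 bytes per element
// and masks out the low 16 bits.
std::uint64_t sumPacked(const std::vector<std::uint32_t>& packed) {
    std::uint64_t sum = 0;
    for (std::uint32_t v : packed) sum += v & 0xFFFFu;
    return sum;
}

// Run a callable once and return the elapsed wall-clock time in microseconds.
template <typename F>
long long timeMicros(F&& f) {
    auto start = std::chrono::steady_clock::now();
    f();
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}
```

To use it, fill both buffers with the same values, wrap each sum in `timeMicros`, and store the result in a `volatile` variable so the compiler cannot discard the loop. Note that the packed pass reads twice as many bytes as the `u16` pass for the same number of elements, so the measurement should use buffers of realistic size (several megabytes) and be repeated a few times.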
Also: sourceData3 actually holds booleans. If I compress sourceData3 into a flag array, where each u32 holds 32 booleans, the array would be 8 times smaller in bytes, but reading a value would require bit manipulation. Would this be a penalty or a benefit?
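To make the flag-array idea concrete, here is a rough sketch of what I mean, assuming a flat row-major buffer instead of the real 2D array. The names `packFlags`, `getFlag` and `getData3Packed` are made up for illustration. Each read costs one shift, one mask and an index computation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Pack one-byte booleans into 32-bit words, 32 flags per word.
// Flag i lives in word i / 32 at bit position i % 32.
std::vector<std::uint32_t> packFlags(const std::vector<std::uint8_t>& flags) {
    std::vector<std::uint32_t> words((flags.size() + 31) / 32, 0u);
    for (std::size_t i = 0; i < flags.size(); ++i) {
        if (flags[i]) words[i >> 5] |= std::uint32_t(1) << (i & 31);
    }
    return words;
}

// Read flag i: select the word, shift the wanted bit down, mask it.
inline bool getFlag(const std::vector<std::uint32_t>& words, std::size_t i) {
    return ((words[i >> 5] >> (i & 31)) & 1u) != 0;
}

// 2D access on top of the flat flag array (row-major, width flags per row).
inline bool getData3Packed(const std::vector<std::uint32_t>& words,
                           std::size_t width, std::size_t x, std::size_t y) {
    return getFlag(words, y * width + x);
}
```

With this layout a row of 1024 flags takes 128 bytes instead of 1024, so far more of the array fits in cache at once. Whether that outweighs the extra shift and mask per read depends on the access pattern and would need to be measured.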
Thank you.
Last edited by Teabix; March 11th, 2011 at 06:25 PM.
- If you know what you want then you do not know yourself good enough.