|
-
December 14th, 2009, 11:49 AM
#1
Multi Dimensional Array speed optimization
I'm coding for an application that has a function like this:
Code:
decimal totalValue;
for(int i=0; i<xArrayLength; i++)
{
for(int j=0; j<yArrayLength;j++)
{
totalValue += currentValue[i,j] //currentValue is modular
}
}
return totalValue;
xArrayLength and yArrayLength aren't large, but the function can get called millions of times. I've noticed a significant amount of time in my program is spent fetching values from two dimensional arrays like this.
My question: Does swapping the looping order of i and j change the speed of fetching values from the array? Would it affect the speed if the arraylengths were bigger? Is there anything else I can do to speed up the constant access to multi dimensional arrays in scenarios like this?
-
December 14th, 2009, 11:59 AM
#2
Re: Multi Dimensional Array speed optimization
First thing that comes to my mind is multi-threading. This will allow your application to continue running in the foreground while your processor intensive operations complete on a different thread in the background.
-
December 14th, 2009, 01:03 PM
#3
Re: Multi Dimensional Array speed optimization
Jagged array access is faster than multidimensional array access in C#. If lookup speed is really the bottleneck it is something to try.
-
December 14th, 2009, 01:06 PM
#4
Re: Multi Dimensional Array speed optimization
if the second array has a constant length it would be better to use just one array like for example a 4x4 matrix stored as 1234123412341234 instead of two arrays.
win7 x86, VS 2008 & 2010, C++/CLI, C#, .NET 3.5 & 4.0, VB.NET, VBA... WPF is comming
remeber to give feedback  you think my response deserves recognition? perhaps you may want to click the Rate this post link/button and add to my reputation
private lessons are not an option so please don't ask for help in private, I won't replay
if you use Opera and you'd like to have the tab-button functionality for the texteditor take a look at my Opera Tab-UserScirpt; and if you know how to stop firefox from jumping to the next control when you hit tab let me know
-
December 14th, 2009, 02:39 PM
#5
Re: Multi Dimensional Array speed optimization
Thanks for suggestions.
Both arrays are constant length. Someone else is already looking into trying to multi-thread the application, but that might not be possible. It's an enormous application that doesn't exactly have one bottleneck, but according to my profiler a very large chunk of time is spent in various places fetching from multi dimensional arrays.
To test the jagged array and single array suggestions I ran this code through a profiler.
Code:
decimal[,] lTestArray = new decimal[20, 20];
decimal[][] lTestArray2 = new decimal[20][];
decimal[] lTestArray3 = new decimal[20 * 20];
decimal lTemp,lTemp2,lTemp3,lTemp4;
for (int i = 0; i < 20; i++)
{
lTestArray2[i] = new decimal[20];
for(int j=0;j<20;j++) //just assigning arbitrary values
{
lTestArray[i,j]=j*5.3m*i;
lTestArray2[i][j] = j*5.3m*i;
lTestArray3[i * 20 + j] = j * 5.3m * i;
}
}
for (int i = 0; i < 500000; i++) // run enough times to make profiler results more accurate
for (int j = 0; j < 20; j++)
for (int k = 0; k < 20; k++)
{
lTemp = lTestArray[j, k];
lTemp2 = lTestArray2[j][k];
lTemp3 = lTestArray3[j * 20 + k];
lTemp4 = lTestArray3[17];
}
Results (%time spent on each line):
lTemp = lTestArray[j, k]; - 14.5%
lTemp2 = lTestArray2[j][k]; - 16.9%
lTemp3 = lTestArray3[j * 20 + k]; - 21.8%
lTemp4 = lTestArray3[17]; - 12.1%
Rest of time was spent on the for loops/launching test. I'm a little surprised jagged arrays were slower than the multi dimensional array. I googled them and other articles seemed to confirm they're faster. The single dimensional array was (slightly) faster, but only if I don't have to calculate which index to use. I guess in this example I could have a single for loop, but in other places in the application I need to access values holding 1 array index constant and grabbing all the values in the other.
Is this test legitimate? It seems to imply I'm better off keeping the arrays the way they are which I didn't expect. I ran a separate test which showed zero speed difference if I switch the index order for multi dimensional arrays. I thought maybe that would cause values closer in memory to be fetched sequentially which would speed things up.
-
December 14th, 2009, 02:49 PM
#6
Re: Multi Dimensional Array speed optimization
have you run this tests in debug or realase mode?
i did a quick&dirty test (in realase mode)
Code:
System.Diagnostics.Stopwatch sw2 = new System.Diagnostics.Stopwatch();
sw2.Start();
for (int i = 0; i < 500000; i++) // run enough times to make profiler results more accurate
{
for (int j = 0; j < 20; j++)
{
for (int k = 0; k < 20; k++)
{
//lTemp = lTestArray[j, k];
//lTemp2 = lTestArray2[j][k];
//lTemp3 = lTestArray3[j * 20 + k];
}
}
}
sw2.Stop();
Debug.WriteLine(sw2.ElapsedMilliseconds);
Console.WriteLine(sw2.ElapsedMilliseconds);
and i run two tests (i'm lazy)
the times in miliseconds were:
for lTemp = lTestArray[j, k]; 2964 and 3103
for lTemp2 = lTestArray2[j][k]; 2893 and 2810
for lTemp3 = lTestArray3[j * 20 + k]; 1537 and 1455
i uncommented only one of those for each test run
Last edited by memeloo; December 14th, 2009 at 03:23 PM.
win7 x86, VS 2008 & 2010, C++/CLI, C#, .NET 3.5 & 4.0, VB.NET, VBA... WPF is comming
remeber to give feedback  you think my response deserves recognition? perhaps you may want to click the Rate this post link/button and add to my reputation
private lessons are not an option so please don't ask for help in private, I won't replay
if you use Opera and you'd like to have the tab-button functionality for the texteditor take a look at my Opera Tab-UserScirpt; and if you know how to stop firefox from jumping to the next control when you hit tab let me know
-
December 14th, 2009, 03:19 PM
#7
Re: Multi Dimensional Array speed optimization
If that's the exact code you put through your profiler then I wouldn't trust those results. Per line profilers are notoriously inaccurate as the overhead of capturing the profiling data can be 10's of time more than the computation the line actually performed. What you'd really need to do is some manual micro-profiling such as running each of those 4 varients one by one and time how long each takes manually:
Code:
decimal[,] lTestArray = new decimal[20, 20];
decimal[][] lTestArray2 = new decimal[20][];
decimal[] lTestArray3 = new decimal[20 * 20];
decimal lTemp,lTemp2,lTemp3,lTemp4;
for (int i = 0; i < 20; i++)
{
lTestArray2[i] = new decimal[20];
for(int j=0;j<20;j++) //just assigning arbitrary values
{
lTestArray[i,j]=j*5.3m*i;
lTestArray2[i][j] = j*5.3m*i;
lTestArray3[i * 20 + j] = j * 5.3m * i;
}
}
DateTime start = DateTime.Now;
for (int i = 0; i < 500000; i++) // run enough times to make profiler results more accurate
for (int j = 0; j < 20; j++)
for (int k = 0; k < 20; k++)
lTemp = lTestArray[j, k];
TimeSpan totalTime = DateTime.Now - start;
Console.WriteLine ("Multi-Dim took {0}ms", totalTime.TotalMilliseconds);
Of course, the old rule still holds true: "If you can't speed it up - just don't do it". How often does the data in the array change? Would it be more performant to keep a cached value which you keep updated whenever the array changes? Would that complicate your design too much? Speed is great until you start compromising maintainability. Who cares if you can get it 10% faster if it makes it impossible to maintain.
EDIT: As per last post, make sure you're running in Release mode - not debug mode.
www.monotorrent.com For all your .NET bittorrent needs
NOTE: My code snippets are just snippets. They demonstrate an idea which can be adapted by you to solve your problem. They are not 100% complete and fully functional solutions equipped with error handling.
-
December 15th, 2009, 08:15 AM
#8
Re: Multi Dimensional Array speed optimization
Let me hook in here because I have a very similar question.
For some image processing algorithms it is comfortable to have something like
Code:
byte[,] img2 = new byte[480,640]
.
For calculations which simply iterate through all elements, as in the example of JMulligan,
it is simpler, and perhaps faster, to have
Code:
byte[] img1 = new byte[480*640]
.
Now, to avoid working with 2 copies of the same data, and considering that both variables are references,
isn't there a simple (and probably dirty, i.e. unsafe) way to direct both references to the same instance?
-
December 15th, 2009, 01:34 PM
#9
Re: Multi Dimensional Array speed optimization
 Originally Posted by hreba
Let me hook in here because I have a very similar question.
For some image processing algorithms it is comfortable to have something like
Code:
byte[,] img2 = new byte[480,640]
.
For calculations which simply iterate through all elements, as in the example of JMulligan,
it is simpler, and perhaps faster, to have
Code:
byte[] img1 = new byte[480*640]
.
Now, to avoid working with 2 copies of the same data, and considering that both variables are references,
isn't there a simple (and probably dirty, i.e. unsafe) way to direct both references to the same instance?
Not that I know of; multi-dimensional arrays and jagged arrays are different beasts altogether. I always use unsafe byte* when doing any serious image manipulations, so I have never tried what you have proposed.
-
December 16th, 2009, 02:26 PM
#10
Re: Multi Dimensional Array speed optimization
I ran a few tests in release mode both with the profiler, and the stop watch and they do seem to show that using a single array will be around 25-50% faster, so I'll try that in my actual program.
I'm wondering what the best way to test something like this truly is though. It's seems no matter what you'll have upkeep that screws up the results a little bit.
-If you use a profiler line-by line it slows everything down which corrupts results.
-If you use memeloo's stopwatch method than you're also calculating the speed of the for loops. Also I find results can be significantly different every time you run it.
-I also tried making functions where all they do is assign a value from one of the arrays and then running a profiler on method level. I don't know if the time spent going in and out of the functions corrupts things though.
Either way I found out what I wanted to know about arrays so thank you guys.
-
December 16th, 2009, 05:04 PM
#11
Re: Multi Dimensional Array speed optimization
There are a lot of reason that kind of double array loop can be slow, but there isn't enough information here to really point anything out. However, some ideas of possible problems.
1. Dealing with the CPU data cache. Caches tend fetch rows of data. If you need some information at memory address 0x1000, the cache may fetch all data from 0x1000 to 0x10FF.
What this means is that if the next piece of data you want is at 0x1001, the data is already in the cache and the CPU does not need to fetch the data from main memory because it is slow.
How this affects you: Imagine you have a 255 x 255 array of bytes. You have a double loop, X and Y, Y being the inner loop. If you iterate such MyArray[X][Y] then the next address for each loop is the next byte in the cache, IE 0x1001, 0x1002, etc...
However, if you iterate MyArray[Y][X], each loop requires the data cache to fetch from main memory. As the next address needed after 0x1000 is 0x1100, which is not in the cache. The address after that is 0x1200, etc...
You seem to do it in the optimal order, just keep that in mind.
2. Be aware of Data sizes. Assuming you are using a 32 bit app, if you are fetching bytes, there is a lot of data conversion going on. The CPU fetches 32 bits at once, even though you only need 1 byte. Behind the scenes, the system fetches 4 bytes, and then converts it to a single bytes. You may also be adding data to an 4 byte int, which causes another data conversion.
Similar problems happen with large data types that are not aligned with the byte size of the CPU.
You would notice a big speed gain if you stored your byte values as 4 byte ints. This would take up a lot more memory, but you would gain that back in terms of speed.
3. Caching your answers. If you know you are never going to repeat a computation then Caching will not help you. However, if you know there are certain sets of arrays that are computed often, you can cache the answer. No need to add them up more then once.
Furthermore, if there are sections of your data set that are subsets of larger data sets, you may benefit from using a smart caching system prevent re-computing the smaller sub-sets over and over.
4. Threading: This will only help you *IF* you limit the number of threads to the number of CPU cores available. It can actually make things much worse to "parallelize" a computation over more threads then there are CPU cores. It also will not help you unless you can make sure each thread is run on a different core.
Why? Remember point 1? If different threads are on different 'spots' in the array that do not fit into the data cache, each time a thread runs, its data needs to be fetched from main memory. Different cores have their own cache, but if the threads run on the same core, there is just a single cache involved.
Lets take the example of 10 threads on one core vs 1 thread on one core. Lets say that 1000 additions can be done in an applications time slice. A single thread adds 1000 elements and is done. 10 threads, in theory add 100 elements each. However they do not because when thread #1 is done, and before thread #2 comes up, the data has to be fetched from main memory into the cache. With 10 threads you have 10 data cache fetches. With 1 thread, you just have 1 data cache fetch.
Overall, this kind of optimization should not be a concern to a programmer, but if you run into the territory where the time to compute some answer becomes a concern, then you should be aware of these things.
-
December 16th, 2009, 05:09 PM
#12
Re: Multi Dimensional Array speed optimization
For dome reason I can't hand you any rep DeepT (button does nothing), but nice explanation of locality of data and the rest.
Posting Permissions
- You may not post new threads
- You may not post replies
- You may not post attachments
- You may not edit your posts
-
Forum Rules
|
Click Here to Expand Forum to Full Width
|