-
November 21st, 2005, 05:37 PM
#1
How to optimize this function ?
Hi, ppl.
I want to optimize the wave function in the main code. I tried SSE instructions but the code become slower. I'm compiling using gcc -O3 -march=i686 -mcpu=i686 -malign-double -funroll-loops -fomit-frame-pointer -o float_conv float_conv.c -lm
Can anybody help me?
Thanx.
#include <stdio.h>
#include <math.h>
#include <unistd.h>
#include <sys/time.h>
float points[] = {0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1};
float y[] = {3, 3.38382, 3.14269, 2.47603, 1.59766, 0.698843, -0.0755374, -0.637482, -0.954788, -1.04158, -0.944139};
float xx[4] = {1.02938, 7.39821, 5.98217, 3.8127};
float vt;
int main(void){
struct timeval ts,te;
int i,j,k = 500000;
int nx=11;
float somaN = 0;
float val;
printf("Calculation...\n");
gettimeofday(&ts,NULL);
for (j = 0; j < k; j++)
{
for (i=0;i<=nx;i++)
{
//WAVE FUNCTION
val = exp(-xx[2]*points[i]/2.0)*(xx[0]*cos(xx[3]*points[i])+ xx[1]*sin(xx[3]*points[i]));
somaN = somaN+(y[i]-val)*(y[i]-val);
}
val = sqrt(somaN)/(nx+1);
}
gettimeofday(&te,NULL);
printf("Time: %f\n",(double)(te.tv_sec-ts.tv_sec)+0.000001*(double)(te.tv_usec-ts.tv_usec));
return 0;
}
-
November 21st, 2005, 07:31 PM
#2
Re: How to optimize this function ?
Have you tried working with double instead of float? I'm not exactly sure that exp, sin, cos and the like have a float overload, so there may be lots of unneccessary conversions.
By the way, you have a bug with your bounds on i. The y[] array has 11 values, yet you loop from 0 to 11 inclusive. y[11] doesn't exist, so you're accessing one element past the end of the array. The loop should be
for (i = 0; i < nx; i++)
Get this small utility to do basic syntax highlighting in vBulletin forums (like Codeguru) easily.
Supports C++ and VB out of the box, but can be configured for other languages.
-
November 22nd, 2005, 04:21 AM
#3
Re: How to optimize this function ?
Run it with a profiler, and find out where the time is going. Then attack the code that's taking most time. Any other approach is flailing in the dark.
Correct is better than fast. Simple is better than complex. Clear is better than cute. Safe is better than insecure.
-- Sutter and Alexandrescu, C++ Coding Standards
Programs must be written for people to read, and only incidentally for machines to execute.
-- Harold Abelson and Gerald Jay Sussman
The cheapest, fastest and most reliable components of a computer system are those that aren't there.
-- Gordon Bell
-
November 22nd, 2005, 07:17 AM
#4
Re: How to optimize this function ?
Well, Graham in this particular case, there is not much point running it in a profiler. What else is it going to say than that the line doing the calculations is taking the most time?
val = exp(-xx[2]*points[i]/2.0)*(xx[0]*cos(xx[3]*points[i])+ xx[1]*sin(xx[3]*points[i]));
Get this small utility to do basic syntax highlighting in vBulletin forums (like Codeguru) easily.
Supports C++ and VB out of the box, but can be configured for other languages.
-
November 22nd, 2005, 07:31 AM
#5
Re: How to optimize this function ?
If you can sacrifice some accuracy, you could use a lookup table for the cosine and sine values, then use something simple like linear interpolation between the points. This will be much faster than actually numerically approximating the functions.
-
November 22nd, 2005, 08:09 AM
#6
Re: How to optimize this function ?
The following line is computed k * nx times:
val = exp(-xx[2]*points[ i ]/2.0)*(xx[0]*cos(xx[3]*points[ i ])+ xx[1]*sin(xx[3]*points[ i ]));
k is very big. But since that line does not depend on k, only on i which is small, then it would be better to place that line outside of the loop on k. This can be done if you have a table of val[] for all the values of i, and in the inner loop, you'll use the precomputed val[ i ] instead of val.
-
November 22nd, 2005, 09:54 AM
#7
Re: How to optimize this function ?
You can change the exp part calculating once
double res = exp(-xx[2]*0.1/2.0);
so exp(-xx[2]*points[i]/2.0) = intpow(res, i);
Since pow is for double exponent (very slow for integral exponent) you define your own pow function(recursive or whatever)
and MAYBE (i havent check it out) for sin and cos since
cos(xx[3]*points[i]) = cos(xx[3]*0.1*i) = cos(A*i) you can use the :
sin(nA) = sinA*{ (2cosA)^(n-1) - ( n - 2 | 1) * (2cosA)^(n-3) + ( n - 3 | 2 ) *(2cosA) ^(n-5) - ...}
and
cos(nA) = 1/2*{(2cosA)^n - n/1 * (2cosA)^(n-2) + n/2 * ( n - 3 | 1) * (2cosA)^(n-4) - n/3 * ( n - 4 | 2 ) * (2cosA)^(n-6) + ...}
Last edited by bilm_ks; November 22nd, 2005 at 10:42 AM.
-
November 22nd, 2005, 10:14 AM
#8
Re: How to optimize this function ?
You do xx[3]*points[i] twice on this calculation, try creating a variable and save this 2nd calculation. You can also put these 2 fors variables in a register(for (register inti=0 ; ...) , if the compiler can put them in a register this will speed up these for loop. Also, Bob Davis gave a good hint, instead of use an array of angles if you can use an array of sin and cos of these angles (if they will not change) you will save the sin and cos calculations. You will not lose acurancy if you get them from a good calculator, the windows calculator for exemple.
-
November 22nd, 2005, 10:48 AM
#9
Re: How to optimize this function ?
Rabelo the "hard" part in this code is the exp, cos and sin functions, not adds and muls
-
November 22nd, 2005, 12:17 PM
#10
Re: How to optimize this function ?
Hi Yves.
Originally Posted by Yves M
Have you tried working with double instead of float? I'm not exactly sure that exp, sin, cos and the like have a float overload, so there may be lots of unneccessary conversions.
I tried this, with no success.
Originally Posted by Yves M
By the way, you have a bug with your bounds on i. The y[] array has 11 values, yet you loop from 0 to 11 inclusive. y[11] doesn't exist, so you're accessing one element past the end of the array. The loop should be
for (i = 0; i < nx; i++)
You're right. I put here a slightly modified version of my code. But the correct version remains slow.
Thanx.
-
November 22nd, 2005, 12:19 PM
#11
Re: How to optimize this function ?
Hello Bob.
Unfortunatelly, I can't use a table.
Originally Posted by Bob Davis
If you can sacrifice some accuracy, you could use a lookup table for the cosine and sine values, then use something simple like linear interpolation between the points. This will be much faster than actually numerically approximating the functions.
-
November 22nd, 2005, 12:22 PM
#12
Re: How to optimize this function ?
Hi olivthill.
The line must be inside the loop. This code is only to test the evaluation function of a genetic algorithm. Using a profiler I detected that this function is the most called in my program. So, I want to make it faster.
Originally Posted by olivthill
The following line is computed k * nx times:
k is very big. But since that line does not depend on k, only on i which is small, then it would be better to place that line outside of the loop on k. This can be done if you have a table of val[] for all the values of i, and in the inner loop, you'll use the precomputed val[ i ] instead of val.
-
November 22nd, 2005, 12:25 PM
#13
Re: How to optimize this function ?
Sorry, bilm_ks.
I forgot to say that xx and y values change a lot of times during the process. The code I put here is only a small part of my program.
Originally Posted by bilm_ks
You can change the exp part calculating once
double res = exp(-xx[2]*0.1/2.0);
so exp(-xx[2]*points[i]/2.0) = intpow(res, i);
Since pow is for double exponent (very slow for integral exponent) you define your own pow function(recursive or whatever)
and MAYBE (i havent check it out) for sin and cos since
cos(xx[3]*points[i]) = cos(xx[3]*0.1*i) = cos(A*i) you can use the :
sin(nA) = sinA*{ (2cosA)^(n-1) - ( n - 2 | 1) * (2cosA)^(n-3) + ( n - 3 | 2 ) *(2cosA) ^(n-5) - ...}
and
cos(nA) = 1/2*{(2cosA)^n - n/1 * (2cosA)^(n-2) + n/2 * ( n - 3 | 1) * (2cosA)^(n-4) - n/3 * ( n - 4 | 2 ) * (2cosA)^(n-6) + ...}
-
November 22nd, 2005, 12:28 PM
#14
Re: How to optimize this function ?
I think you're right. That's why I tried SSE intrinsic to do the job, but the codes I found only does add, sub, mul and div, not trigs.
Originally Posted by bilm_ks
Rabelo the "hard" part in this code is the exp, cos and sin functions, not adds and muls
-
November 23rd, 2005, 01:50 PM
#15
Re: How to optimize this function ?
Nobody else has another clue ?
Posting Permissions
- You may not post new threads
- You may not post replies
- You may not post attachments
- You may not edit your posts
-
Forum Rules
|
Click Here to Expand Forum to Full Width
|