-
May 10th, 2012, 12:52 PM
#1
Optimising Code
I'm new to C++, and as an exercise I created a program which calculates pi to n decimal places.
It works, but I want to make it go faster, at the minute it calculates 10,000 digits in around 13 seconds.
I analysed the code, and the two lines taking up the most CPU time are:
Code:
remainder[i] = sum[i] % B[i];
carried[i - 1] = ((sum[i] - remainder[i]) / B[i]) * i;
Each array (remainder, sum, B and carried) contains approximately 15000 elements (more if calculating to more decimal places), each element is a single unsigned integer.
Is there a more efficient way of coding those two lines, or is it just my CPU which is limiting speed?
Thank You,
-Sam
-
May 10th, 2012, 01:05 PM
#2
Re: Optimising Code
Originally Posted by SamstaUK
I'm new to C++, and as an exercise I created a program which calculates pi to n decimal places.
It works, but I want to make it go faster, at the minute it calculates 10,000 digits in around 13 seconds.
You didn't mention what compiler you used, whether you're running an optimized build, etc.
Is there a more efficient way of coding those two lines,
We need to look at more than those two lines. For example, you redundantly have sum[i] and B[i] specified. Is this the only places where these arrays are used? If not, you could have just assigned sum[i] and B[i] to variables and just use those variables instead of repeatedly using sum[i] and B[i].
Also, look at the generated assembly code for those two lines. That will give you an idea of whether there is a way to optimize those lines further.
Regards,
Paul McKenzie
-
May 10th, 2012, 02:26 PM
#3
Re: Optimising Code
Originally Posted by SamstaUK
I analysed the code, and the two lines taking up the most CPU time are:
Code:
remainder[i] = sum[i] % B[i];
carried[i - 1] = ((sum[i] - remainder[i]) / B[i]) * i;
Code:
((sum[i] - remainder[i]) / B[i])
is the same as
You can eliminate that subtraction and division by using div() function that calculates both quotient and remainder at the same time:
Code:
div_t ans = div(sum[i], B[i]);
remainder[i] = ans.rem;
carried[i - 1] = ans.quot * i;
But I agree with Paul, you should post your benchmark to get more optimization suggestions.
Vlad - MS MVP [2007 - 2012] - www.FeinSoftware.com
Convenience and productivity tools for Microsoft Visual Studio:
FeinWindows - replacement windows manager for Visual Studio, and more...
-
May 11th, 2012, 03:20 PM
#4
Re: Optimising Code
Okay I ran the program 5 times, here is the average time for calculating pi to 10000 decimal places:
Run 1: 13.744
Run 2: 13.806
Run 3: 13.526
Run 4: 13.838
Run 5: 13.791
Avg: 13.741 seconds
Here is the code, a word of warning though, it will make your eyes bleed, if you have the temptation to reach through your monitor and *****-slap me through the other end I understand :P
#include <iostream>
#include <vector>
#include <math.h>
#include <time.h>
using namespace std;
void main(){
vector<int> B;
vector<int> remainder;
vector<int> m10;
vector<int> sum;
vector<int> carried;
vector<int> predigits;
double digits;
double size;
int predigit;
int counter = 3;
int i;
int x;
B.push_back(0);
cout << "Number of digits of pi to calculate: ";
cin >> digits;
clock_t start = clock();
size = digits / log10(double(2));
for(i = 0; i <= int(size); i++){
remainder.push_back(2);
B.push_back(counter);
carried.push_back(0);
m10.push_back(0);
sum.push_back(0);
counter += 2;
}
for(x = 1; x <= (digits + 1); x++){
for(i = int(size); i >= 1; i--){
m10[i] = remainder[i] * 10;
sum[i] = m10[i] + carried[i];
remainder[i] = sum[i] % B[i];
carried[i - 1] = ((sum[i] - remainder[i]) / B[i]) * i;
}
m10[0] = remainder[0] * 10;
sum[0] = m10[0] + carried[0];
remainder[0] = sum[0] % 10;
predigit = int(sum[0] / 10);
predigits.push_back(predigit);
if(predigit < 9){
for(i = 0; i < (predigits.size() - 1); i++){
cout << predigits[i];
}
predigits.clear();
predigits.push_back(predigit);
}
if(predigit == 10){
for(i = 0; i < (predigits.size() - 1); i++){
if(predigits[i] == 9){
cout << 0;
}
else
{
predigits[i] = predigits[i] + 1;
cout << predigits[i];
}
}
predigits.clear();
predigits.push_back(0);
}
}
cout << "\nTime elapsed: " << ((clock() - start) / double(CLOCKS_PER_SEC)) << "\n";
system("pause");
return;
}
-
May 11th, 2012, 05:31 PM
#5
Re: Optimising Code
Originally Posted by SamstaUK
Okay I ran the program 5 times, here is the average time for calculating pi to 10000 decimal places:
1) Please format your code before posting.
2) You didn't mention the compiler or the compiler version you're using.
3) Your code has errors -- a
a) main() returns an int, not void.
b) Given a), you can't return void from main(). Your return statement is a void return, and that is not allowed if main() must return an int.
4) Your code does not calculate just the time to compute pi to 10000 places.
To point 4) why are you timing output statements such as cout? Get rid of it. That by itself adds seconds to the amount of time. After getting rid of the cout statements in the middle of the calculation, the time is cut down to a little over 9 seconds. (This is with Visual Studio 2010, optimized build).
9.159 seconds.
Regards,
Paul McKenzie
-
May 11th, 2012, 06:18 PM
#6
Re: Optimising Code
Originally Posted by Paul McKenzie
1) Please format your code before posting.
2) You didn't mention the compiler or the compiler version you're using.
3) Your code has errors -- a
a) main() returns an int, not void.
b) Given a), you can't return void from main(). Your return statement is a void return, and that is not allowed if main() must return an int.
4) Your code does not calculate just the time to compute pi to 10000 places.
To point 4) why are you timing output statements such as cout? Get rid of it. That by itself adds seconds to the amount of time. After getting rid of the cout statements in the middle of the calculation, the time is cut down to a little over 9 seconds. (This is with Visual Studio 2010, optimized build).
9.159 seconds.
Regards,
Paul McKenzie
1 - It was formatted correctly when I pasted it from notepad++, however after posting all the tabs have been removed
2 - I'm using visual studio 2010
3/4 - I'll fix those and post back with the updated code
-
May 11th, 2012, 06:50 PM
#7
Re: Optimising Code
I got it down to 8.3 seconds, thank you It's still faster on my laptop than a server with 2 quad core processors for some reason though!
How complicated would making this program use multiple threads to calculate be? Should I look into it or be content with 8.3 seconds?
Here is the updated code:
Code:
#include <iostream>
#include <vector>
#include <math.h>
#include <time.h>
using namespace std;
int main(){
vector<int> B;
vector<int> remainder;
vector<int> m10;
vector<int> sum;
vector<int> carried;
vector<int> predigits;
vector<int> pi;
double digits;
double size;
int predigit;
int counter = 3;
int i;
int x;
B.push_back(0);
cout << "Number of digits of pi to calculate: ";
cin >> digits;
clock_t start = clock();
size = digits / log10(double(2));
for(i = 0; i <= int(size); i++){
remainder.push_back(2);
B.push_back(counter);
carried.push_back(0);
m10.push_back(0);
sum.push_back(0);
counter += 2;
}
for(x = 1; x <= (digits + 1); x++){
for(i = int(size); i >= 1; i--){
m10[i] = remainder[i] * 10;
sum[i] = m10[i] + carried[i];
div_t ans = div(sum[i], B[i]);
remainder[i] = ans.rem;
carried[i - 1] = ans.quot * i;
}
m10[0] = remainder[0] * 10;
sum[0] = m10[0] + carried[0];
remainder[0] = sum[0] % 10;
predigit = int(sum[0] / 10);
predigits.push_back(predigit);
if(predigit < 9){
for(i = 0; i < (predigits.size() - 1); i++){
pi.push_back(predigits[i]);
}
predigits.clear();
predigits.push_back(predigit);
}
if(predigit == 10){
for(i = 0; i < (predigits.size() - 1); i++){
if(predigits[i] == 9){
pi.push_back(0);
}
else
{
predigits[i] = predigits[i] + 1;
pi.push_back(predigits[i]);
}
}
predigits.clear();
predigits.push_back(0);
}
}
cout << "\nTime elapsed: " << ((clock() - start) / double(CLOCKS_PER_SEC)) << "\n";
system("pause");
for(i = 0; i < pi.size(); i++){
printf("%u", pi[i]);
}
system("pause");
return 0;
}
-
May 11th, 2012, 07:43 PM
#8
Re: Optimising Code
You can save a couple more seconds by using div() in one more place, and by pre-allocating all your vectors since you know their sizes.
Vlad - MS MVP [2007 - 2012] - www.FeinSoftware.com
Convenience and productivity tools for Microsoft Visual Studio:
FeinWindows - replacement windows manager for Visual Studio, and more...
Posting Permissions
- You may not post new threads
- You may not post replies
- You may not post attachments
- You may not edit your posts
-
Forum Rules
|
Click Here to Expand Forum to Full Width
|