|
-
April 29th, 2009, 07:21 AM
#1
Inaccuracy of QueryPerformanceCounter in "small" functions
Hi,
I need to check the performance of a code snippet, and for this reason, I need to measure the execution time or cycles.
I learned that the function QueryPerformanceCounter retrieves the current value of the high-resolution performance counter. I used it before executing a loop, and afterward. Subtracting the latter value from the former should give me the time (or cycles) spent to execute the loop.
The results were quiet surprising. Below you can find the code followed by a typical output.
When I'm executing small loops, such as 10 or 100 times, the execution time of a loop and another loop that is only 10 times bigger, are not proportional at all!
However when the loop increases to 10,000 or 100,000, the execution time begins to appear proportional and increases by 10, just as the amount the loop size increases.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <math.h>
#include <time.h>
#include <windows.h>
#include <winbase.h>
int main()
{
LARGE_INTEGER ticksPerSecond;
LARGE_INTEGER start_ticks, end_ticks, cputime;
double delay_xor;
int j;
long l1,l2,l3;
l1=0xABCDEF12; //random 32 bits values in hexadecimals
l2=0x12345678;
l3=0xAB12CD34;
j = sizeof(double); //8 bytes
j = sizeof(long); //4 bytes
j = sizeof(LARGE_INTEGER); //8 bytes
QueryPerformanceCounter(&start_ticks);
for(j=0;j<1;j++) //loop once
l3 = l1^l2+j; QueryPerformanceCounter(&end_ticks);
cputime.QuadPart = end_ticks.QuadPart- start_ticks.QuadPart;
delay_xor = (double)cputime.QuadPart;
printf("\n First Loop: %.2f",delay_xor);
QueryPerformanceCounter(&start_ticks);
for(j=0;j<10;j++) //loop 10 times
l3 = l1^l2+j; QueryPerformanceCounter(&end_ticks);
cputime.QuadPart = end_ticks.QuadPart- start_ticks.QuadPart;
delay_xor = (double)cputime.QuadPart;
printf("\nSecond Loop: %.2f",delay_xor);
QueryPerformanceCounter(&start_ticks);
for(j=0;j<100;j++) //loop 100 times
l3 = l1^l2+j; QueryPerformanceCounter(&end_ticks);
cputime.QuadPart = end_ticks.QuadPart- start_ticks.QuadPart;
delay_xor = (double)cputime.QuadPart;
printf("\n Third Loop: %.2f",delay_xor);
QueryPerformanceCounter(&start_ticks);
for(j=0;j<1000;j++) //loop 1000 times
l3 = l1^l2+j; QueryPerformanceCounter(&end_ticks);
cputime.QuadPart = end_ticks.QuadPart- start_ticks.QuadPart;
delay_xor = (double)cputime.QuadPart;
printf("\nFourth Loop: %.2f\n",delay_xor);
QueryPerformanceCounter(&start_ticks);
for(j=0;j<10000;j++) //loop 10000 times
l3 = l1^l2+j; QueryPerformanceCounter(&end_ticks);
cputime.QuadPart = end_ticks.QuadPart- start_ticks.QuadPart;
delay_xor = (double)cputime.QuadPart;
printf("\n Fifth Loop: %.2f",delay_xor);
QueryPerformanceCounter(&start_ticks);
for(j=0;j<100000;j++) //loop 100000 times
l3 = l1^l2+j; QueryPerformanceCounter(&end_ticks);
cputime.QuadPart = end_ticks.QuadPart- start_ticks.QuadPart;
delay_xor = (double)cputime.QuadPart;
printf("\n Sixth Loop: %.2f",delay_xor);
QueryPerformanceCounter(&start_ticks);
for(j=0;j<1000000;j++) //loop 100000 times
l3 = l1^l2+j; QueryPerformanceCounter(&end_ticks);
cputime.QuadPart = end_ticks.QuadPart- start_ticks.QuadPart;
delay_xor = (double)cputime.QuadPart;
printf("\nSeventh Loop: %.2f\n\n",delay_xor);
return 0;
Please note the output:
First Loop: 18.00
Second Loop: 21.00
Third Loop: 23.00
Fourth Loop: 62.00
Fifth Loop: 452.00
Sixth Loop: 4331.00
Seventh Loop: 44476.00
Press any key to continue . . .
Is there any explanation of the non-linearity of the outcome in small loops?
Is there any other accurate alternative to measure the execution time of small loops?
Thanks in advance.
Last edited by Elias.pp; April 29th, 2009 at 07:30 AM.
Reason: Typo
Posting Permissions
- You may not post new threads
- You may not post replies
- You may not post attachments
- You may not edit your posts
-
Forum Rules
|
Click Here to Expand Forum to Full Width
|