|
-
May 1st, 2015, 03:43 AM
#1
Code runs 2x faster on windows than on linux
I have a piece of code that runs 2x faster on windows than on linux. Here are the times I measured:
g++ -Ofast -march=native -m64
29.1123
g++ -Ofast -march=native
29.0497
clang++ -Ofast -march=native
28.9192
visual studio 2013 Debug 32b
13.8802
visual studio 2013 Release 32b
12.5569
It really seems to be too huge a difference.
Here is the code:
#include <iostream>
#include <map>
#include <chrono>
static std::size_t Count = 1000;
static std::size_t MaxNum = 50000000;
bool IsPrime(std::size_t num)
{
for (std::size_t i = 2; i < num; i++)
{
if (num % i == 0)
return false;
}
return true;
}
int main()
{
auto start = std::chrono::steady_clock::now();
std::map<std::size_t, bool> value;
for (std::size_t i = 0; i < Count; i++)
{
value[i] = IsPrime(i);
value[MaxNum - i] = IsPrime(MaxNum - i);
}
std::chrono: uration<double> serialTime = std::chrono::steady_clock::now() - start;
std::cout << "Serial time = " << serialTime.count() << std::endl;
system("pause");
return 0;
}
All of this was measured on the same machine with windows 8 vs linux 3.19.5(gcc 4.9.2, clang 3.5.0). Both linux and windows are 64bit.
What could be the reason for this? Some scheduler issues?
EDIT: It was caused by building 32b binaries on windows as opposed to 64b binaries on linux, here are 64b numbers for windows:
Visual studio 2013 Debug 64b
29.1985
Visual studio 2013 Release 64b
29.7469
-
May 1st, 2015, 08:16 AM
#2
Re: Code runs 2x faster on windows than on linux
 Originally Posted by Anthonyemuobo
visual studio 2013 Release 32b
12.5569
...
Visual studio 2013 Release 64b
29.7469
Now somebody has to explain why 64-bit version takes that much longer.
Vlad - MS MVP [2007 - 2012] - www.FeinSoftware.com
Convenience and productivity tools for Microsoft Visual Studio:
FeinWindows - replacement windows manager for Visual Studio, and more...
-
May 1st, 2015, 09:47 AM
#3
Re: Code runs 2x faster on windows than on linux
Let me take a shot at it myself.
I have eliminated the map and looping, here is the shortest sample I got:
Code:
bool IsPrime(std::size_t num)
{
for (std::size_t i = 2; i < num; i++)
{
if (num % i == 0)
return false;
}
return true;
}
int main()
{
auto start = std::chrono::steady_clock::now();
bool b = IsPrime(49999991);
std::chrono::duration<double> serialTime = std::chrono::steady_clock::now() - start;
std::cout << "Serial time = " << serialTime.count() << std::endl;
system("pause");
return b;
}
It takes 0.352..0.398 sec in 32-bit and 0.825 sec in 64-bit.
I know there are twice as many bits to work with, but I didn't know it takes twice the time!
This example takes 0.365..0.401 sec in 64-bit if the parameter type to IsPrime() is changed from std::size_t to unsigned int.
Vlad - MS MVP [2007 - 2012] - www.FeinSoftware.com
Convenience and productivity tools for Microsoft Visual Studio:
FeinWindows - replacement windows manager for Visual Studio, and more...
-
May 1st, 2015, 11:40 AM
#4
Re: Code runs 2x faster on windows than on linux
Using Vladimir#s code from post #3 with MSVS 2013 release optimised I get
32 bit - 0.19
64 bit - 0.42
Replacing size_t with unsigned int gives
32 bit - 0.19
64 bit - 0.19
However replacing size_t with unsigned long long gives
32 bit - 0.38
64 bit - 0.43
All advice is offered in good faith only. All my code is tested (unless stated explicitly otherwise) with the latest version of Microsoft Visual Studio (using the supported features of the latest standard) and is offered as examples only - not as production quality. I cannot offer advice regarding any other c/c++ compiler/IDE or incompatibilities with VS. You are ultimately responsible for the effects of your programs and the integrity of the machines they run on. Anything I post, code snippets, advice, etc is licensed as Public Domain https://creativecommons.org/publicdomain/zero/1.0/ and can be used without reference or acknowledgement. Also note that I only provide advice and guidance via the forums - and not via private messages!
C++23 Compiler: Microsoft VS2022 (17.6.5)
-
May 1st, 2015, 04:04 PM
#5
Re: Code runs 2x faster on windows than on linux
@2kaud, can you post the generated asm ? maybe, a smaller size allows the compiler to unroll the loop and perform two loop cycles in a single op ...
-
May 1st, 2015, 04:34 PM
#6
Re: Code runs 2x faster on windows than on linux
Is it OK if I will answer?
There is only one loop, and there is no unrolling.
32-bit
Code:
bool b = IsPrime(49999991);
00911047 mov ecx,2
0091104C lea esp,[esp]
00911050 xor edx,edx
00911052 mov eax,2FAF077h
00911057 div eax,ecx
00911059 test edx,edx
0091105B je main+3Ah (091106Ah)
0091105D inc ecx
0091105E cmp ecx,2FAF077h
00911064 jb main+20h (0911050h)
00911066 mov bl,1
00911068 jmp main+3Ch (091106Ch)
0091106A xor bl,bl
64-bit
Code:
bool b = IsPrime(49999991);
000000013FAF1050 xor edx,edx
000000013FAF1052 mov eax,2FAF077h
000000013FAF1057 div rax,rcx
000000013FAF105A test rdx,rdx
000000013FAF105D je main+40h (013FAF1070h)
000000013FAF105F inc rcx
000000013FAF1062 cmp rcx,2FAF077h
000000013FAF1069 jb main+20h (013FAF1050h)
000000013FAF106B mov dil,1
000000013FAF106E jmp main+43h (013FAF1073h)
000000013FAF1070 xor dil,dil
Vlad - MS MVP [2007 - 2012] - www.FeinSoftware.com
Convenience and productivity tools for Microsoft Visual Studio:
FeinWindows - replacement windows manager for Visual Studio, and more...
-
May 2nd, 2015, 05:09 AM
#7
Re: Code runs 2x faster on windows than on linux
ok, no compiler magic involved; googling around, it appears that integer division is a special beast with data dependent latency ( as far as I got it, the cpu optimizes division for smaller sizes ) with reported, worst case latencies in 32 vs 64 bits unsigneds div's ranging from 2x to 3x, depending on cpu ... nice
-
May 4th, 2015, 07:13 AM
#8
Re: Code runs 2x faster on windows than on linux
1) 64bit code by nature will be somewhat slower than 32bit because it has more instruction prefixes which means the CPU pipelines stall.
2) 64bit code tends to be bigger because of that, so more instruction bytes need to be fetched
3) pointers are 64bit, native types are 64bit, which means your L1 cache will be flooded faster.
4) 64bit multiplication and division are slower than 32bit
When you write 64bit software, don't assume you'll get faster code, typically you won't. You basically start in a hole compared to equivalent 32Bit code, and it isn't until you can make use of 64bit specific features such as having access to more than 2Gb ram, and more efficient handling of 64bit math (if you NEED the math to be 64bit) that you have a means to get out of the hole and exceed 32Bit performance.
Specific here
size_t is 32Bit on win32 and 64bit on Win64.
so the 64bit math suffers from a larger datasize and a slower division.
change the size_t to __int64 and Win64 will blitz ahead because in Win32 the calculations now need to be simulated in software using 32Bit math.
or
change size_t to __int32 and the gap will reduce for Win64 because it now doesn't need to perform a more complex 64bit division.
But 'technology aside'. This is a horribly inefficient way to do primal-testing. It's "ok enough" if you only need to test a single (or a few) numbers, but if you want to do this for lots of numbers, it'll be very slow indeed, especially for large values.
Optimisations:
- you don't need to test for all numbers up to i<num, you only need to test for i<sqrt(num) because any number larger that that will already have been discovered by it's matching smaller counterpart. (only do the sqrt(num) once. Or flipped around, since sqrt() tends to be inefficient. test for i*i<n
- build a list of previous primes (this'll take memory to speed things up). a test for "divisible by 9" will already have been tested by division by 3., the same i true for any other multiple of 3 (and multiples of 5, 7, 11, ...).
Look up "Sieve of Eratosthenes" for one possible method for doing this. The question here is.. "can you afford the memory" ?
Last edited by OReubens; May 4th, 2015 at 07:21 AM.
Tags for this Thread
Posting Permissions
- You may not post new threads
- You may not post replies
- You may not post attachments
- You may not edit your posts
-
Forum Rules
|
Click Here to Expand Forum to Full Width
|