-
June 1st, 2006, 03:35 PM
#1
cpu cycles
I was just wondering which takes more CPU cycles: accessing a variable through a class pointer, or accessing a 'scoped' (local) variable...
Code:
#include <cstdio>

struct XD
{
    int VAL;
    XD(const int CPY/*, XD* this*/)
    {
        /*this->*/VAL = CPY;
        // From now on, either CPY or VAL could be used.
        std::printf("%i", /*this->*/VAL); // does this use more CPU cycles
        std::printf("%i", CPY);           // than this?
    }
};
-
June 1st, 2006, 03:40 PM
#2
Re: cpu cycles
I'm asking because, logically, CPY will be in a register while VAL needs to be dereferenced?
-
June 1st, 2006, 04:15 PM
#3
Re: cpu cycles
Well, in your example it probably doesn't matter whether you use
Code:
printf("%i", VAL); // does this use more cpu cycles
or
Code:
printf("%i", CPY); // than this?
because the compiler knows that this->VAL equals CPY and will try to use that information the best way it can (it might use CPY instead of VAL even if your code says VAL).
- petter
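A minimal sketch for anyone who wants to check this on their own compiler. It is a hypothetical variant of the XD struct from the first post: `format_member` and `format_local` are names made up here, and `snprintf` into a buffer stands in for the original `printf` calls so the two results can be compared directly.

```cpp
#include <cstdio>
#include <string>

// Hypothetical variant of XD, reworked so that the two access paths
// (member through this, and plain by-value parameter) sit side by side.
struct XD {
    int VAL;

    explicit XD(int CPY) : VAL(CPY) {}

    // Reads the member, i.e. goes through the implicit this pointer.
    std::string format_member() const {
        char buf[32];
        std::snprintf(buf, sizeof buf, "%i", this->VAL);
        return buf;
    }

    // Reads a by-value parameter, i.e. a plain local.
    static std::string format_local(int CPY) {
        char buf[32];
        std::snprintf(buf, sizeof buf, "%i", CPY);
        return buf;
    }
};
```

Both functions produce the same text for the same value. To see whether the generated instructions differ, build with optimizations turned on (for example -O2 on GCC or Clang, /O2 on Visual C++) and compare the assembly listings; what you get will depend on the compiler and its settings.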
-
June 1st, 2006, 04:19 PM
#4
Re: cpu cycles
As Petter has indicated, the compiler will be just as efficient using the assigned pointer.
Please recall that the call to printf will absolutely dominate the run-time of your code sequences no matter how the print information is defined.
Sincerely, Chris.
You're gonna go blind staring into that box all day.
-
June 1st, 2006, 04:31 PM
#5
Re: cpu cycles
 Originally Posted by wildfrog
Well, in your example it probably doesn't matter whether you use
Code:
printf("%i", VAL); // does this use more cpu cycles
or
Code:
printf("%i", CPY); // than this?
because the compiler knows that this->VAL equals CPY and will try to use that information the best way it can (it might use CPY instead of VAL even if your code says VAL).
- petter
What if the compiler doesn't know?
-
June 1st, 2006, 05:05 PM
#6
Re: cpu cycles
What if the compiler doesn't know?
If you mean two different variables of the same type, the local variable is going to take somewhat fewer cycles in some cases.
How much fewer depends on other factors and implementation details:
what kind of variable it is, whether the compiler decides to keep the local variable in a CPU register, and how the compiler implements the C++ class layout.
Here is the code generated by the VC compiler.
It generates the following code for the local variable:
Code:
00411AEA mov eax,dword ptr [CPY]
00411AED push eax
and it generates the code below for the class member access:
Code:
00411AFB mov eax,dword ptr [this]
00411AFE mov ecx,dword ptr [eax]
00411B00 push ecx
Note that it does one extra move to get the first member.
An aggressively optimizing compiler might give you the same cycle count for both.
Vinod
-
June 1st, 2006, 05:16 PM
#7
Re: cpu cycles
So I still think it's wiser to use the local...
-
June 1st, 2006, 05:28 PM
#8
Re: cpu cycles
Nope. It's wiser to compile in RELEASE mode (i.e. full optimization). One, the compiler (Dev Studio 7) inlines the constructor, so there is no 'call' statement. Two, both calls to printf are (almost) exactly the same:
Code:
XD someXD(42);
00403981 push 2Ah
00403983 push 4111F4h
00403988 call printf (406246h)
0040398D push 2Ah
0040398F push 4111F8h
00403994 call printf (406246h)
Viggy
-
June 1st, 2006, 06:20 PM
#9
Re: cpu cycles
 Originally Posted by MrViggy
Nope. It's wiser to compile in RELEASE mode (i.e. full optimization). One, the compiler (Dev Studio 7) inlines the constructor, so there is no 'call' statement. Two, both calls to printf are (almost) exactly the same:
Code:
XD someXD(42);
00403981 push 2Ah
00403983 push 4111F4h
00403988 call printf (406246h)
0040398D push 2Ah
0040398F push 4111F8h
00403994 call printf (406246h)
Viggy
But you cannot rely on the compiler too much, and there is no downside to using CPY instead of VAL; you never know how the compiler will optimise.
-
June 1st, 2006, 07:00 PM
#10
Re: cpu cycles
To be honest, Mitsukai, you're asking a rather pointless question.
There's only a tiny, insignificant difference in performance... if any at all. The performance difference is so small that it's not even worth thinking about unless you've already used a profiler and it's shown that a lot of time has been spent doing that kind of thing.
Since this kind of code happens a lot, a lot of time has been spent making optimisers deal with it well. This makes it all very unpredictable: your optimiser might decide it's all so trivial that it'll just inline the lot and then optimise that... in which case it will take different amounts of time depending on context.
Don't sweat the small stuff. Use algorithms and data structures which make sense for the job at hand, and if it's too slow use a profiler to find where the program is spending its time.
If /that/ is your biggest bottleneck, I'd say that you had a program as optimised as it's ever likely to get!
Ian W
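If no profiler is at hand, a crude stopwatch is enough to get a first idea of where the time goes. The sketch below is only that: `time_us` is a hypothetical helper name, it measures wall-clock time for a single run, and a real profiler will tell you far more.

```cpp
#include <chrono>

// Hypothetical helper: runs a callable once and returns the elapsed
// wall-clock time in microseconds. A stopwatch, not a profiler.
template <typename F>
long long time_us(F&& work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start)
        .count();
}
```

Wrap the constructor call in one lambda and the printf calls in another, pass each to `time_us`, and compare the two numbers. Single measurements are noisy, so repeat them, and build with optimizations on before trusting anything.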
-
June 1st, 2006, 07:10 PM
#11
Re: cpu cycles
 Originally Posted by Noddon
To be honest, Mitsukai, you're asking a rather pointless question.
I don't see how it's pointless. A lot of software engineers probably think like you, and are in fact unnecessarily shortening my CPU's lifetime right now...
 Originally Posted by Noddon
There's only a tiny, insignificant difference in performance... if any at all. The performance difference is so small that it's not even worth thinking about unless you've already used a profiler and it's shown that a lot of time has been spent doing that kind of thing.
The issue is that for a single program it might not make any difference in speed, but it keeps the CPU busier, and everything slows down as other applications get busier too. Maybe it will unnecessarily occupy CPU cycles while you're doing a huge data transfer, which will then take longer...
It's also about knowing what happens behind the scenes... Though I do not see any harm in using the local variable, which is stored in a register directly, instead of using the class member and possibly wasting a CPU cycle or less because the compiler possibly didn't optimize it.
 Originally Posted by Noddon
Since this kind of code happens a lot, a lot of time has been spent making optimisers deal with it well. This makes it all very unpredictable: your optimiser might decide it's all so trivial that it'll just inline the lot and then optimise that... in which case it will take different amounts of time depending on context.
You made my exact point... it's unpredictable whether the compiler will optimize it or not. Why not just forget about it and prefer the local?
 Originally Posted by Noddon
Don't sweat the small stuff. Use algorithms and data structures which make sense for the job at hand, and if it's too slow use a profiler to find where the program is spending its time.
Currently I'm in no big business and have all the time in the world to "waste" learning how compilers work and what could possibly optimise your code.
 Originally Posted by Noddon
If /that/ is your biggest bottleneck, I'd say that you had a program as optimised as it's ever likely to get!
If all programmers did... my PC might work and boot faster...
-
June 1st, 2006, 07:52 PM
#12
Re: cpu cycles
 Originally Posted by Mitsukai
I don't see how it's pointless. A lot of software engineers probably think like you, and are in fact unnecessarily shortening my CPU's lifetime right now...
Software engineers worthy of the name know when to optimise.
Optimisers are very good at micro-optimisation, picking the right instructions to get the best local performance. Software engineers let the optimisers do their job and instead perform the optimisations the optimiser cannot do: picking good algorithms and data structures. Only when they know they can do a better job than the optimiser do software engineers worry about micro-optimisation.
You made my exact point... it's unpredictable whether the compiler will optimize it or not. Why not just forget about it and prefer the local?
It's unpredictable as to the exact output the optimiser will produce for some arbitrary input. What it will do though is /optimise/! The reason why it will produce different results in different contexts is because it's trying to pick the most optimal output.
In other words, it produces different output that is as "good" as it knows how to produce. Something as trivially different as using an automatic variable rather than a member variable will just be lost when the optimiser makes it as fast as possible within the context it is used.
If all programmers did... my PC might work and boot faster...
It might if they used optimisers. If I ran a profiler on your example, I'd get results something like the following:
1 call to XD(), Total time spent in XD(): 18 cycles (0.01%)
2 calls to printf(), Total time spent in printf(): 180,000 cycles(99.99%)
Reducing the time that printf() takes would be an optimisation. Shaving a cycle or two from XD() is rather pointless.
Trying to shave a cycle or two from XD() when the optimiser, a piece of code very very good at optimising, has already done its work is also rather pointless: you're unlikely to do better than the optimiser.
However, using "cout" instead of "printf()" may be a huge optimisation (comparatively). By using "cout" and the overloaded "<<" operator, the compiler can simply call the appropriate function to display an integer. On the other hand, "printf()" has to parse the format string to know it's going to be outputting an integer. You can save a whole load of runtime processing!
Micro-optimisations like picking the instructions to use for XD() are what optimisers are good at. Let them do it. Only when micro-optimisation is not enough is it time to consider changing the code.
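To make the cout versus printf() point concrete, here is a small sketch of the two formatting paths. The function names are hypothetical, and `snprintf` and `ostringstream` stand in for printf and cout so the results can be compared as strings. It shows only where the type decision is made; which one is actually faster depends on the library implementation, so measure before assuming.

```cpp
#include <cstdio>
#include <sstream>
#include <string>

// printf family: the "%i" in the format string is interpreted at run time.
std::string format_with_printf(int value) {
    char buf[32];
    std::snprintf(buf, sizeof buf, "%i", value);
    return buf;
}

// iostream: overload resolution selects operator<<(int) at compile time,
// so no format string has to be parsed while the program runs.
std::string format_with_stream(int value) {
    std::ostringstream out;
    out << value;
    return out.str();
}
```

Both functions return the same text for the same integer, so swapping one for the other changes only how the work is done, not the output.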
Last edited by Noddon; June 1st, 2006 at 08:08 PM.
-
June 1st, 2006, 08:49 PM
#13
Re: cpu cycles
 Originally Posted by Mitsukai
I don't see how it's pointless. A lot of software engineers probably think like you, and are in fact unnecessarily shortening my CPU's lifetime right now...
No, what those software engineers are doing wrong is using inefficient algorithms, and not that they are or are not using a register here and there.
Honestly, if you were to work on a project with other programmers, they are not looking for these types of micro-optimizations that seem to concern you. What they are looking for are more efficient ways of allocating memory, sorting, organizing data, etc. That is where the speed gains come into play.
The issue is that for a single program it might not make any difference in speed, but it keeps the CPU busier, and everything slows down as other applications get busier too. Maybe it will unnecessarily occupy CPU cycles while you're doing a huge data transfer, which will then take longer...
Have you ever used a profiler? If you have, you will see that what you are worried about takes no time. What takes time, again, is the algorithm that you've used and other things such as calling new 100,000 times when you can get by calling it once or twice, passing or returning big objects by value, etc. These are the things that bring a CPU to its knees.
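As a rough illustration of the "big objects by value" cost, the sketch below counts copies. `Big`, `sum_by_value` and `sum_by_ref` are hypothetical names invented for this example; the counter exists only to make the copy visible.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical payload type that counts how often it is copied.
struct Big {
    static int copies;
    std::vector<int> data;

    explicit Big(std::size_t n) : data(n, 1) {}
    Big(const Big& other) : data(other.data) { ++copies; }
};
int Big::copies = 0;

// Pass by value: every call with an existing object copies the whole vector.
long sum_by_value(Big b) {
    long total = 0;
    for (int x : b.data) total += x;
    return total;
}

// Pass by const reference: no copy is made.
long sum_by_ref(const Big& b) {
    long total = 0;
    for (int x : b.data) total += x;
    return total;
}
```

Calling `sum_by_ref` on an existing `Big` leaves `Big::copies` unchanged, while each `sum_by_value` call on that object bumps it by one and duplicates the whole vector. That duplicated allocation and copy is the kind of cost that shows up in a profile, unlike a single extra mov.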
And you mention "what if the compiler doesn't optimize"? OK, I'll ask the opposite question -- what if the compiler did optimize?
I'll ask another -- what if the compiler didn't optimize, you wrote code, and then on another compiler, the code you wrote didn't allow the compiler to make optimizations, effectively making the code inefficient? So are you going to rewrite the code again? And for what gain (or possible loss)? A few nanoseconds?
Compiler optimization is something that is a basic part of compiler technology. If you've ever taken a course on compilers, and if it is a two semester course, the second semester is usually dedicated to compiler optimizations -- that shows you how important it is. From what you've posted, you seem to think that compiler optimization is a "throwaway" piece that compiler writers don't care about, so they do a sloppy or in extreme cases, no job on optimization. You couldn't be more wrong.
Regards,
Paul McKenzie
-
June 2nd, 2006, 06:38 AM
#14
Re: cpu cycles
First, you should NEVER EVER evaluate the speed of code in Visual C++ without optimizations.
The default code is FAR WORSE than the code produced by Turbo C++ 1.0.
Without optimizations, benchmarks have no meaning... You must at least turn on the most basic optimizations. For instance, simply use the worst compiler on your platform, and compile with optimizations.
Otherwise very often, you'll see things like:
code A is 1.5 times faster than code B without optimizations.
code A is 5 times slower than code B with optimizations, which is itself 3 times faster than code B without optimizations.
 Originally Posted by Mitsukai
It's also about knowing what happens behind the scenes... Though I do not see any harm in using the local variable, which is stored in a register directly, instead of using the class member and possibly wasting a CPU cycle or less because the compiler possibly didn't optimize it.
You forget one thing.
Access to automatic variables may be SLOWER than access to member variables.
For instance, if you use the __thiscall calling convention or any __fastcall calling convention, there will not be any additional level of indirection.
And, for instance, if you use automatic variables, your compiler may run short of registers and need to set up a stack frame.
Write things that make sense.
If you want to optimize a critical routine, you should not use such stupid tricks... You should write it in assembly code... The benefit can be really noticeable (from 0% to more than 50% sometimes, with all intermediate speed gains).
A general rule is that artificial code which says about the same thing as the natural code is less likely to produce efficient output, because compilers have been created to optimize "common" code.
You must also profile your code.
And, instead of using printf, you should use cout (with Borland C++ 5.0 it is MUCH faster)... And if it is critical, you may try to code your own display routine... and see if it is faster.
Anyway, you'll see that micro-optimizations are completely stupid most of the time, because even when profiling, if the micro-optimization is too small, then because of memory usage or other considerations the 0.001% performance gain may turn into a 0.01% performance loss with another data set or on another computer.
"inherit to be reused by code that uses the base class, not to reuse base class code", Sutter and Alexandrescu, C++ Coding Standards.
Club of lovers of the C++ typecasts cute syntax: Only recorded member.
Out of memory happens! Handle it properly!
Say no to g_new()!
-
June 2nd, 2006, 06:52 AM
#15
Re: cpu cycles
Mitsukai:
Note also that the extra level of indirection of member access is only required ONCE in the function.
And that this level of indirection is not even required if the function uses a not-too-bad calling convention (thiscall, fastcall, or simply, an inline function).
Furthermore if you pass data as parameter, you'll have extra code on the callee side.
And since this extra code increases the code size (and slows down the program) proportionally to the number of calls, it will be LESS efficient than using a member variable.
Furthermore, if you don't pass the parameter as an argument, but extract it in the function, like:
Code:
void MemberFunc() {
int i=member;
// do work on i
}
I claim that, for a decent compiler, it won't make any difference.
But for a bad optimizer (or even for a good compiler, if the function is large enough that i can't be kept in a register), it will be SLOWER,
because i will not be put into a register, which will require two extra dependent mov operations:
Code:
; assume that ecx contains this
mov eax,[ecx+offset]
mov [ebp-4],eax
And if it is the first non-register automatic variable, it will require at least a stack frame (at least two more instructions).
Using the member directly would be better, because the this pointer would be used a lot (and kept in a register), and all accesses to all member variables would go through this register.
Guideline: especially on machines which have few registers, avoid having aliases of variables.
And, of course, this method seems obviously stupid when there are a lot of members:
Code:
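struct S { // S is a placeholder name; the original struct was unnamed
    int a, b, c, d, e, f, g;
    int func();
};
int S::func() {
    // You reinvented pass-by-value with extra overhead?
    int la = a, lb = b, lc = c, ld = d, le = e, lf = f, lg = g;
    return la + lb + lc + ld + le + lf + lg; // stand-in for the real work
}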
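For anyone who wants to compare the two styles on their own compiler, here is a minimal sketch. `Counter`, `sum_direct` and `sum_alias` are hypothetical names, and the loop is just a stand-in for "do work on i".

```cpp
// Hypothetical type for comparing direct member access with a local alias.
struct Counter {
    int member;

    // Works on the member directly, through the this pointer.
    int sum_direct(int n) const {
        int total = 0;
        for (int k = 0; k < n; ++k) total += member;
        return total;
    }

    // Copies the member into a local alias first, then works on the alias.
    int sum_alias(int n) const {
        const int i = member;
        int total = 0;
        for (int k = 0; k < n; ++k) total += i;
        return total;
    }
};
```

The two functions always return the same value. Whether the alias costs anything, helps, or disappears entirely is up to the optimizer, so build with optimizations on and compare the generated assembly before drawing conclusions.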