Compiler auto correct uninitialized variable? (gcc does and VC++ doesn't)
I had a flawed function like this:
Code:
fn(){
char c;
if (runFirstTime){
#ifdef VC
c='\\';
#else
c='/';
#endif
}
... // c is used in the rest of the function to construct some pathnames
}
The problem is that the value of c is not defined the 2nd time the function is called (and on every later call). It somehow worked fine under Cygwin when compiled with gcc, so I didn't spot the flaw until it ran incorrectly under Windows when compiled with VC++ 2010. Then I found the error and changed the code to something like
So now it works correctly under Windows. Then I re-compiled the new code with gcc and to my surprise gcc produced exactly the same binary! How can this be? Does the gcc compiler see my flaw and fix it for me somehow? If so I am truly amazed.
Re: Compiler auto correct uninitialized variable? (gcc does and VC++ doesn't)
I don't know what could cause that behaviour, but MSVC handles paths with '/' as well as '\\' these days, so you shouldn't need those defines.
Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are, by
definition, not smart enough to debug it.
- Brian W. Kernighan
Re: Compiler auto correct uninitialized variable? (gcc does and VC++ doesn't)
Originally Posted by acppdummy
Thanks! Does MSVC take mixture of '/' and '\\'?
For standard I/O functions, the forward slash is supported by practically all compilers, regardless of the platform. For example:
Code:
fopen("dir1/dir2/dir3/test.txt", "r");
Works for practically every compiler and on every system that supports directories (Windows, Linux, Mac, etc.).
The issue for Windows is that there are specific Windows API functions that handle directory names. These functions are inconsistent: some accept forward slashes, while others only accept the backslash character.
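For the API functions that only accept backslashes, one possible approach is to keep forward slashes everywhere in the program and convert just before the call. A minimal sketch, assuming a helper named to_backslashes (a made-up name, not part of any Windows API):

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: returns a copy of `path` in which every
// forward slash has been replaced by a backslash.
std::string to_backslashes(std::string path)
{
    std::replace(path.begin(), path.end(), '/', '\\');
    return path;
}
```

Calling to_backslashes("dir1/dir2/test.txt") yields dir1\dir2\test.txt, while the original string can still be passed unchanged to fopen.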
Re: Compiler auto correct uninitialized variable? (gcc does and VC++ doesn't)
Thanks for the comments about '/' vs '\\'.
Now I am still fascinated by the fact that g++ produced identical binaries from codes with different logic. So I made a small example project and tested it again. Sure enough g++ produced identical binaries from the two versions of the function (one good and one bad logic) again:
Note that in the 2nd case the value of the char dlm should be set only when the if statement holds true, otherwise it should be undefined. However g++ seems to either treat it as a static variable or simply change the logic (otherwise how come the binaries are identical?). I even wonder if there might be a bug in g++?
I attached all files from this test project in a zip file. The two different versions of the function are in function.cpp and function2.cpp - obviously use only one of them at a time. Could one of you experts please take a look and shed some light on what's going on? By the way, binaries produced by VC2010 behave as expected (different for the two cases). Thanks!
PS. Output from running the VC2010 binaries; first bad (note the bad character), second good:
Re: Compiler auto correct uninitialized variable? (gcc does and VC++ doesn't)
Originally Posted by acppdummy
Note that in the 2nd case the value of the char dlm should be set only when the if statement holds true, otherwise it should be undefined. However g++ seems to either treat it as a static variable or simply change the logic (otherwise how come the binaries are identical?). I even wonder if there might be a bug in g++?
I suggest that you fix the bug in your code (function.cpp) before you ask if there is a bug in g++
After all, since dlm is used either way, g++ could have moved the code that assigns '/' to dlm outside of the body of the if statement since, as you rightly point out, the value of dlm is otherwise not well defined. This would not be a bug with g++ since it had the right to give dlm any other initial value anyway.
C + C++ Compiler: MinGW port of GCC
Build + Version Control System: SCons + Bazaar
Re: Compiler auto correct uninitialized variable? (gcc does and VC++ doesn't)
Originally Posted by acppdummy
Thanks for the comments about '/' vs '\\'.
Now I am still fascinated by the fact that g++ produced identical binaries from codes with different logic.
How do you know this is a "fact"?
To remove all doubt, why not produce the preprocessed output from the compiler. Visual C++ has a command-line option to do this, and I guess g++ has one also. Then you will see exactly what the compiler is compiling, instead of taking a guess which lines are being used in the compilation.
Re: Compiler auto correct uninitialized variable? (gcc does and VC++ doesn't)
Originally Posted by acppdummy
I even wonder if there might be a bug in g++?
If g++ makes such a mistake with such simple code, then many, possibly thousands of programmers would have known about it and reported it. That compiler is used by thousands of programmers, a huge number of companies, using C++ in all sorts of ways -- it is highly doubtful that a simple bug like this would exist in the compiler.
But the bottom line is best spelled out by the title you gave the thread -- "Compiler auto correct uninitialized variable" . When a variable is uninitialized and you attempt to use this variable, then anything can occur. There is no "auto-correction" -- all you're seeing is undefined behaviour that you happen to approve of in one case, and disapprove of in another case.
Here is a question for you: what do you expect to be printed by the following program:
Code:
#include <iostream>
class foo
{
bool bSet;
public:
void print()
{
if ( bSet )
std::cout << "bSet is true" << std::endl;
else
std::cout << "bSet is false" << std::endl;
}
};
int main()
{
foo f;
f.print();
}
Since bSet is uninitialized, there is no guarantee what should be printed. If you expected bSet to be true, and you get the "bSet is true" printed, that is not "auto-correction" -- that's called "luck".
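For contrast, a minimal sketch of the same class with the member initialized in the constructor, so the result no longer depends on luck. The describe() function is an assumption made here so the result can be checked without capturing console output; it stands in for print():

```cpp
#include <string>

class foo
{
    bool bSet;
public:
    // Give bSet a definite value so it is never read while indeterminate.
    foo() : bSet(false) {}

    // Stand-in for print(): returns the text instead of writing it.
    std::string describe() const
    {
        return bSet ? "bSet is true" : "bSet is false";
    }
};
```

With this constructor, every foo object reports "bSet is false" on every compiler and at every optimization level.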
Regards,
Paul McKenzie
Last edited by Paul McKenzie; July 15th, 2012 at 10:26 AM.
Re: Compiler auto correct uninitialized variable? (gcc does and VC++ doesn't)
First of all sorry for my stupid comment about g++ having a "bug". I know it is a well built tool.
Originally Posted by Paul McKenzie
How do you know this is a "fact"?
To remove all doubt, why not produce the preprocessed output from the compiler. Visual C++ has a command-line option to do this, and I guess g++ has one also. Then you will see exactly what the compiler is compiling, instead of taking a guess which lines are being used in the compilation.
Regards,
Paul McKenzie
I know the binaries are the same because I use BeyondCompare to do binary comparisons. When I first noticed this with my original project, I found the final executables to be identical except for the date/time code near the beginning of the binary. For this test project I just compared function.o and function2.o, which are object files compiled from function.cpp and function2.cpp, respectively. (Built separately, of course.) The two object files are identical except for the embedded filename (function.cpp vs function2.cpp) in the binary.
I will be happy to check the preprocessed output if I figure out how to do it. Thanks!
Last edited by acppdummy; July 15th, 2012 at 11:11 AM.
Re: Compiler auto correct uninitialized variable? (gcc does and VC++ doesn't)
Originally Posted by acppdummy
Note that in the 2nd case the value of the char dlm should be set only when the if statement holds true, otherwise it should be undefined. However g++ seems to either treat it as a static variable or simply change the logic (otherwise how come the binaries are identical?). I even wonder if there might be a bug in g++?
No:
"in the 2nd case the value of the char dlm will be set when the if statement holds true" (with the word "only" struck out).
"otherwise it should be undefined": "undefined" is not a special spectral value that crashes a program. It just means the compiler is free to do anything it wants to the variable, including also setting it when the statement does not hold true.
Long story short, the only constraint a compiler really has is to preserve the observable behavior wherever that behavior is defined. For example this program:
Code:
void setstr(){
    char dlm='/';   // initialized unconditionally, so every call is well defined
    if (strlen(str1)==0){
        sprintf(str1,"drive/f/dir1");
    }
    sprintf(str2,"%s%c%s",str1,dlm,"name1");
}
This creates the same observable behavior as your version in the only case where your version is well defined (when the if is true), so the compiler is free to reinterpret your program as such.
GCC did not change the logic. It just had a "different interpretation" of the logic.
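A self-contained sketch of the well-defined version, for anyone who wants to try it without the attached project. The buffer names str1 and str2 come from the thread; their sizes and the use of snprintf instead of sprintf are assumptions made here:

```cpp
#include <cstdio>
#include <cstring>

// Assumed global buffers; the sizes are picked for illustration only.
char str1[64]  = "";
char str2[128] = "";

void setstr()
{
    char dlm = '/';   // has a definite value on every call

    // Only the first call fills in the base directory.
    if (std::strlen(str1) == 0) {
        std::snprintf(str1, sizeof str1, "drive/f/dir1");
    }

    // dlm is read here on every call, including the second and later ones.
    std::snprintf(str2, sizeof str2, "%s%c%s", str1, dlm, "name1");
}
```

Calling setstr() twice leaves str2 holding drive/f/dir1/name1 both times, because dlm no longer depends on whether the if branch was taken.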
Last edited by monarch_dodra; July 15th, 2012 at 04:32 PM.
Is your question related to IO?
Read this C++ FAQ article at parashift by Marshall Cline. In particular points 1-6.
It will explain how to correctly deal with IO, how to validate input, and why you shouldn't count on "while(!in.eof())". And it always makes for excellent reading.
Hello all, Thank you very much for the comments. I learned a lot from this forum as always. Here is what I found:
Using the -S switch, I let g++ produce assembly code for each of the two functions. Then I compared them with BeyondCompare and quoted the full report below. The good algorithm is on the left-hand side and the bad algorithm on the right. Note that the assignment of the '/' character ($47) is outside of the if statement (jne L2) for the good case and inside for the bad case, just as one would expect from the C code. So the compiler seems to do a faithful translation of the C code to assembly code. But how can the different assembly code end up as identical object files?
Then I remembered that I was using an IDE to build my projects, so I went back and saw that it passes the -O2 option, among others. So I redid the -S run together with -O2, and sure enough this time the assembly code is identical except for the embedded filenames. See full comparison report below. The optimizer seems really smart: it sees a local variable (dlm) that is assigned a value only once within scope, so it just replaces it with a constant, resulting in exactly the same assembly code. This also turns the bad algorithm into a good one. I am so amazed!
The optimizer seems really smart: it sees a local variable (dlm) that is assigned a value only once within scope, so it just replaces it with a constant, resulting in exactly the same assembly code. This also turns the bad algorithm into a good one. I am so amazed!
And what if that uninitialized variable was supposed to be initialized to a different character, but the programmer forgot to do so? Would the optimizer acting this way be smart? Your situation is just luck.
Please don't rely on uninitialized variables to "auto-correct" themselves. What you have is a plain old bug that needed to be fixed. Looking at assembly language is usually done when a well-defined program that must behave a certain way does not -- it is rarely, if ever, used to figure out why undefined behaviour is what it is.
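Since the separator in the original function never changes at run time, one way to rule out this kind of bug altogether is to make it a constant chosen at compile time. A minimal sketch, reusing the VC macro from the first post; the names kPathSeparator and join_path are made up for illustration:

```cpp
#include <string>

// VC is the macro the original post uses to select the Windows build.
#ifdef VC
const char kPathSeparator = '\\';
#else
const char kPathSeparator = '/';
#endif

// Hypothetical helper: joins a directory and a file name with the
// separator chosen above.
std::string join_path(const std::string& dir, const std::string& name)
{
    return dir + kPathSeparator + name;
}
```

Because the constant is initialized where it is declared, there is no path through the program on which it can be read before it has a value.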
Re: Compiler auto correct uninitialized variable? (gcc does and VC++ doesn't)
Thanks! Point well taken - I already fixed the code when VC++ did not do the "smart optimization" and exposed the bug. The exercise of looking at the assembly code was just to understand why g++ produced the same binary from two apparently quite different algorithms (when I first found the binary did not change after the code fix). Thanks again!