  1. #1
    Join Date
    Nov 2010
    Posts
    105

    Compiler auto correct uninitialized variable? (gcc does and VC++ doesn't)

    I had a flawed function like this:
    Code:
    void fn(){
        char c;   // only assigned when runFirstTime is true
        if (runFirstTime){
    #ifdef VC
            c='\\';
    #else
            c='/';
    #endif
        }
        ... // c is used in the rest of the function to construct some pathnames
    }
    The problem is that the value of c is not defined on the second and subsequent calls to the function. It somehow worked fine under Cygwin when compiled with gcc, so I didn't spot the flaw until it ran incorrectly under Windows when compiled with VC++ 2010. Then I found the error and changed the code to something like
    Code:
    void fn(){
    #ifdef VC
        const char c='\\';   // assigned on every call
    #else
        const char c='/';
    #endif
        ...
    }
    So now it works correctly under Windows. Then I re-compiled the new code with gcc and to my surprise gcc produced exactly the same binary! How can this be? Does the gcc compiler see my flaw and fix it for me somehow? If so I am truly amazed.

  2. #2
    Join Date
    Oct 2006
    Location
    Sweden
    Posts
    3,654

    Re: Compiler auto correct uninitialized variable? (gcc does and VC++ doesn't)

    I don't know what could cause that behaviour, but MSVC handles paths with '/' as well as '\\' these days, so you shouldn't need those defines.
    Debugging is twice as hard as writing the code in the first place.
    Therefore, if you write the code as cleverly as possible, you are, by
    definition, not smart enough to debug it.
    - Brian W. Kernighan

    To enhance your chances of getting an answer be sure to read
    http://www.codeguru.com/forum/announ...nouncementid=6
    and http://www.codeguru.com/forum/showthread.php?t=366302 before posting

    Refresh your memory on formatting tags here
    http://www.codeguru.com/forum/misc.php?do=bbcode

    Get your free MS compiler here
    https://visualstudio.microsoft.com/vs

  3. #3
    Join Date
    Nov 2010
    Posts
    105

    Re: Compiler auto correct uninitialized variable? (gcc does and VC++ doesn't)

    Thanks! Does MSVC take mixture of '/' and '\\'?

  4. #4
    Join Date
    Apr 1999
    Posts
    27,449

    Re: Compiler auto correct uninitialized variable? (gcc does and VC++ doesn't)

    Quote Originally Posted by acppdummy
    Thanks! Does MSVC take mixture of '/' and '\\'?
    For standard I/O functions, the forward slash is supported by practically all compilers, regardless of the platform. For example:
    Code:
    fopen("dir1/dir2/dir3/test.txt", "r");
    Works for practically every compiler and on every system that supports directories (Windows, Linux, Mac, etc.).

    The issue for Windows is that there are specific Windows API functions that handle directory names. These functions are inconsistent: some will take forward slashes, while others accept only the backslash character.
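
    For the case where a path has to go to an API that accepts only backslashes, one option is to normalize it first. This is a minimal sketch, not something from the thread: the helper name to_backslashes is made up for illustration, and it only swaps the separator character.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper (the name is an assumption, not a library function):
// returns a copy of the path with every '/' replaced by '\\', so it can be
// passed to an API that insists on backslashes.
std::string to_backslashes(std::string path) {
    std::replace(path.begin(), path.end(), '/', '\\');
    return path;
}
```

    With this, to_backslashes("dir1/dir2/test.txt") gives the string dir1\dir2\test.txt, and paths that already use backslashes come back unchanged.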

    Regards,

    Paul McKenzie

  5. #5
    Join Date
    Nov 2010
    Posts
    105

    Re: Compiler auto correct uninitialized variable? (gcc does and VC++ doesn't)

    Thanks for the comments about '/' vs '\\'.

    I am still fascinated by the fact that g++ produced identical binaries from code with different logic. So I made a small example project and tested it again. Sure enough, g++ again produced identical binaries from the two versions of the function (one with good logic and one with bad):

    Code:
    void setstr(){
    #ifdef VC
        const char dlm='\\';
    #else
        const char dlm='/';
    #endif
        if (strlen(str1)==0){
    #ifdef VC
            sprintf(str1,"f:\\dir1");
    #else
            sprintf(str1,"drive/f/dir1");
    #endif
        }
        sprintf(str2,"%s%c%s",str1,dlm,"name1");
    }
    Code:
    void setstr(){
        char dlm;
        if (strlen(str1)==0){
    #ifdef VC
            sprintf(str1,"f:\\dir1");dlm='\\';
    #else
            sprintf(str1,"drive/f/dir1");dlm='/';
    #endif
        }
        sprintf(str2,"%s%c%s",str1,dlm,"name1");
    }
    Note that in the 2nd case the value of the char dlm should be set only when the if statement holds true, otherwise it should be undefined. However g++ seems to either treat it as a static variable or simply change the logic (otherwise how come the binaries are identical?). I even wonder if there might be a bug in g++?

    I attached all files from this test project in a zip file. The two versions of the function are in function.cpp and function2.cpp; obviously, use only one of them at a time. Could one of you experts please take a look and shed some light on what's going on? By the way, binaries produced by VC2010 behave as expected (different for the two cases). Thanks!

    PS. Output from running the VC2010 binaries; first bad (note the bad character), second good:
    Code:
    2012-07-15,10:08:52,D:\NetBeansProjects_Win\testUninitializedVar\Release>testUninitializedVar_VC
    test uninitialized variable
    strlen(str)=0, str="".
    strlen(str1)=12, str1="drive/f/dir1".
    strlen(str2)=18, str2="drive/f/dir1/name1".
    strlen(str1)=12, str1="drive/d/dir2".
    strlen(str2)=18, str2="drive/d/dir2薾ame1".w
    
    2012-07-15,10:08:55,D:\NetBeansProjects_Win\testUninitializedVar\Release>testUninitializedVar_VC
    test uninitialized variable
    strlen(str)=0, str="".
    strlen(str1)=12, str1="drive/f/dir1".
    strlen(str2)=18, str2="drive/f/dir1/name1".
    strlen(str1)=12, str1="drive/d/dir2".
    strlen(str2)=18, str2="drive/d/dir2/name1".
    Attached Files
    Last edited by acppdummy; July 15th, 2012 at 09:41 AM.

  6. #6
    Join Date
    Jan 2006
    Location
    Singapore
    Posts
    6,765

    Re: Compiler auto correct uninitialized variable? (gcc does and VC++ doesn't)

    Quote Originally Posted by acppdummy
    Note that in the 2nd case the value of the char dlm should be set only when the if statement holds true, otherwise it should be undefined. However g++ seems to either treat it as a static variable or simply change the logic (otherwise how come the binaries are identical?). I even wonder if there might be a bug in g++?
    I suggest that you fix the bug in your code (function.cpp) before you ask if there is a bug in g++.

    After all, since dlm is used either way, g++ could have moved the code that assigns '/' to dlm outside of the body of the if statement since, as you rightly point out, the value of dlm is otherwise not well defined. This would not be a bug with g++ since it had the right to give dlm any other initial value anyway.
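
    One way to make this kind of bug impossible is to stop using a variable that is set on only one path. The sketch below is an illustration under assumptions: path_delimiter is a made-up helper, and the declarations of str1 and str2 are guesses, since the thread never shows them.

```cpp
#include <cstdio>
#include <cstring>

// Assumed declarations: the thread never shows how str1 and str2 are
// defined, so these fixed-size buffers are placeholders.
char str1[256] = "";
char str2[256] = "";

// Hypothetical helper: the delimiter is chosen at compile time and returned
// by value, so there is no variable that a code path could leave unset.
inline char path_delimiter() {
#ifdef VC
    return '\\';
#else
    return '/';
#endif
}

void setstr() {
    // Fill in the default directory only when str1 is still empty.
    if (std::strlen(str1) == 0) {
#ifdef VC
        std::snprintf(str1, sizeof str1, "%s", "f:\\dir1");
#else
        std::snprintf(str1, sizeof str1, "%s", "drive/f/dir1");
#endif
    }
    // dlm has a value on every call, not just the first one.
    const char dlm = path_delimiter();
    std::snprintf(str2, sizeof str2, "%s%c%s", str1, dlm, "name1");
}
```

    Built without VC defined, calling setstr(), then copying "drive/d/dir2" into str1 and calling setstr() again, leaves "drive/d/dir2/name1" in str2 regardless of compiler or optimization level.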
    C + C++ Compiler: MinGW port of GCC
    Build + Version Control System: SCons + Bazaar

    Look up a C/C++ Reference and learn How To Ask Questions The Smart Way
    Kindly rate my posts if you found them useful

  7. #7
    Join Date
    Apr 1999
    Posts
    27,449

    Re: Compiler auto correct uninitialized variable? (gcc does and VC++ doesn't)

    Quote Originally Posted by acppdummy
    Thanks for the comments about '/' vs '\\'.

    Now I am still fascinated by the fact that g++ produced identical binaries from codes with different logic.
    How do you know this is a "fact"?

    To remove all doubt, why not produce the preprocessed output from the compiler? Visual C++ has a command-line option to do this, and I guess g++ has one also. Then you will see exactly what the compiler is compiling, instead of guessing which lines are being used in the compilation.

    Regards,

    Paul McKenzie

  8. #8
    Join Date
    Apr 1999
    Posts
    27,449

    Re: Compiler auto correct uninitialized variable? (gcc does and VC++ doesn't)

    Quote Originally Posted by acppdummy
    I even wonder if there might be a bug in g++?
    If g++ makes such a mistake with such simple code, then many, possibly thousands of programmers would have known about it and reported it. That compiler is used by thousands of programmers, a huge number of companies, using C++ in all sorts of ways -- it is highly doubtful that a simple bug like this would exist in the compiler.

    But the bottom line is best spelled out by the title you gave the thread -- "Compiler auto correct uninitialized variable" . When a variable is uninitialized and you attempt to use this variable, then anything can occur. There is no "auto-correction" -- all you're seeing is undefined behaviour that you happen to approve of in one case, and disapprove of in another case.

    Here is a question for you: what do you expect to be printed by the following program:
    Code:
    #include <iostream>
    
    class foo
    {
       bool bSet;
       public:
            void print() 
            {
                 if ( bSet )
                   std::cout << "bSet is true" << std::endl;
                else
                   std::cout << "bSet is false" << std::endl;
            }
    };
    
    int main()
    {
       foo f;
       f.print();
    }
    Since bSet is uninitialized, there is no guarantee what should be printed. If you expected bSet to be true, and you get the "bSet is true" printed, that is not "auto-correction" -- that's called "luck".
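
    For comparison, here is a sketch of the same class with the luck removed. The constructor and the isSet accessor are additions made for illustration; they are not part of the example above.

```cpp
#include <iostream>

class foo
{
   bool bSet;
   public:
        // The constructor gives bSet a definite value for every object.
        foo() : bSet(false) {}

        // Hypothetical accessor, added only so the state can be checked.
        bool isSet() const { return bSet; }

        void print() const
        {
             if ( bSet )
               std::cout << "bSet is true" << std::endl;
             else
               std::cout << "bSet is false" << std::endl;
        }
};
```

    With the initializer in place, a default-constructed foo always takes the "bSet is false" branch, on any compiler and at any optimization level.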

    Regards,

    Paul McKenzie
    Last edited by Paul McKenzie; July 15th, 2012 at 10:26 AM.

  9. #9
    Join Date
    Nov 2010
    Posts
    105

    Re: Compiler auto correct uninitialized variable? (gcc does and VC++ doesn't)

    First of all sorry for my stupid comment about g++ having a "bug". I know it is a well built tool.

    Quote Originally Posted by Paul McKenzie
    How do you know this is a "fact"?

    To remove all doubt, why not produce the preprocessed output from the compiler? Visual C++ has a command-line option to do this, and I guess g++ has one also. Then you will see exactly what the compiler is compiling, instead of guessing which lines are being used in the compilation.

    Regards,

    Paul McKenzie
    I know the binaries are the same because I use Beyond Compare to do binary comparisons. When I first noticed this with my original project, I found the final executables to be identical except for the date/time stamp near the beginning of the binary. For this test project I just compared function.o and function2.o, which are object files compiled from function.cpp and function2.cpp, respectively (built separately, of course). The two object files are identical except for the embedded filename (function.cpp vs function2.cpp) in the binary.

    I will be happy to check the preprocessed output if I figure out how to do it. Thanks!
    Last edited by acppdummy; July 15th, 2012 at 11:11 AM.

  10. #10
    Join Date
    Jan 2006
    Location
    Singapore
    Posts
    6,765

    Re: Compiler auto correct uninitialized variable? (gcc does and VC++ doesn't)

    Quote Originally Posted by acppdummy
    I will be happy to check the preprocessed output if I figure out how to do it.
    You can check the online GCC manual, or even just do a g++ --help to read up on options like -E and -S.

    Anyway, even if you do check it, the bug remains in your code.

  11. #11
    Join Date
    Oct 2006
    Location
    Sweden
    Posts
    3,654

    Re: Compiler auto correct uninitialized variable? (gcc does and VC++ doesn't)


  12. #12
    Join Date
    Jun 2009
    Location
    France
    Posts
    2,513

    Re: Compiler auto correct uninitialized variable? (gcc does and VC++ doesn't)

    Quote Originally Posted by acppdummy
    Note that in the 2nd case the value of the char dlm should be set only when the if statement holds true, otherwise it should be undefined. However g++ seems to either treat it as a static variable or simply change the logic (otherwise how come the binaries are identical?). I even wonder if there might be a bug in g++?
    No:
    "in the 2nd case the value of the char dlm will be set when the if statement holds true" (without the "only").
    "otherwise it should be undefined": "undefined" is not a special value that crashes a program. It just means the compiler is free to do anything it wants with the variable, including setting it even when the condition does not hold.

    Long story short, the only real constraint on a compiler is that it must preserve the observable behavior of executions that are well defined. For example, this program:

    Code:
    void setstr(){
        char dlm='/';
        if (strlen(str1)==0){
            sprintf(str1,"drive/f/dir1");
        }
        sprintf(str2,"%s%c%s",str1,dlm,"name1");
    }
    Creates the same observable behavior when the first if is true, so the compiler is free to reinterpret your program as such.

    GCC did not change the logic. It just had a "different interpretation" of the logic.
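
    The same idea can be reduced to a few lines. This is only an illustration with made-up function names, and the second function shows one transformation an optimizer may make, not what any particular compiler is guaranteed to do.

```cpp
// Mirrors the shape of the buggy setstr(): dlm only gets a value on the
// "first call" path. Calling this with false reads an indeterminate value,
// so no program may rely on what comes back in that case.
char delimiter_buggy(bool first_call) {
    char dlm;
    if (first_call) {
        dlm = '/';
    }
    return dlm;
}

// One rewrite an optimizer is allowed to pick: the only value ever stored
// in dlm is '/', so returning '/' on every path agrees with the function
// above in every case where that function's result is defined.
char delimiter_as_compiled(bool /*first_call*/) {
    return '/';
}
```

    For first_call == true both functions return '/'. For first_call == false only the second one has a result you can count on, which is why two different sources can legitimately compile to the same object code.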
    Last edited by monarch_dodra; July 15th, 2012 at 04:32 PM.
    Is your question related to IO?
    Read this C++ FAQ article at parashift by Marshall Cline. In particular points 1-6.
    It will explain how to correctly deal with IO, how to validate input, and why you shouldn't count on "while(!in.eof())". And it always makes for excellent reading.

  13. #13
    Join Date
    Nov 2010
    Posts
    105

    The optimizer made the difference!

    Hello all, Thank you very much for the comments. I learned a lot from this forum as always. Here is what I found:

    Using the -S switch, I had g++ produce assembly code for each of the two functions. Then I compared them with Beyond Compare and quoted the full report below. The good algorithm is on the left-hand side and the bad algorithm on the right. Note that the assignment of the '/' character ($47) is outside the if statement (jne L2) in the good case and inside it in the bad case, just as one would expect from the C code. So the compiler seems to do a faithful translation of the C code to assembly code. But how can different assembly code end up as identical object files?

    Code:
    Text Compare
    Produced: 2012-07-16 08:23:31
    
    Mode:  All, With Context
    Left file: F:\NetBeansProjects_Win\testUninitializedVar\function2.s
    Right file: F:\NetBeansProjects_Win\testUninitializedVar\function.s
        .file    "function2.cpp"                      <>     .file    "function.cpp"
    ------------------------------------------------------------------------
        .section .rdata,"dr"                       =      .section .rdata,"dr"
    LC0:                                           LC0:
        .ascii "drive/f/dir1\0"                           .ascii "drive/f/dir1\0"
    LC1:                                           LC1:
        .ascii "name1\0"                                  .ascii "name1\0"
    LC2:                                           LC2:
        .ascii "%s%c%s\0"                                 .ascii "%s%c%s\0"
        .text                                             .text
    .globl __Z6setstrv                             .globl __Z6setstrv
        .def    __Z6setstrv;    .scl    2;    .type    32;    .endef        .def    __Z6setstrv;    .scl    2;    .type    32;    .endef
    __Z6setstrv:                                   __Z6setstrv:
    LFB7:                                          LFB7:
        pushl    %ebp                                        pushl    %ebp
    LCFI0:                                         LCFI0:
        movl    %esp, %ebp                                   movl    %esp, %ebp
    LCFI1:                                         LCFI1:
        subl    $56, %esp                                    subl    $56, %esp
    LCFI2:                                         LCFI2:
    ------------------------------------------------------------------------
        movb    $47, -9(%ebp)                         +-
    ------------------------------------------------------------------------
        movl    $_str1, %eax                          =      movl    $_str1, %eax
        movzbl    (%eax), %eax                               movzbl    (%eax), %eax
        testb    %al, %al                                    testb    %al, %al
        jne    L2                                            jne    L2
        movl    $13, 8(%esp)                                 movl    $13, 8(%esp)
        movl    $LC0, 4(%esp)                                movl    $LC0, 4(%esp)
        movl    $_str1, (%esp)                               movl    $_str1, (%esp)
        call    _memcpy                                      call    _memcpy
    ------------------------------------------------------------------------
                                                -+     movb    $47, -9(%ebp)
    ------------------------------------------------------------------------
    L2:                                         =  L2:
    ------------------------------------------------------------------------
                                                -+     movsbl    -9(%ebp), %eax
    ------------------------------------------------------------------------
        movl    $LC1, 16(%esp)                        =      movl    $LC1, 16(%esp)
    ------------------------------------------------------------------------
        movl    $47, 12(%esp)                         <>     movl    %eax, 12(%esp)
    ------------------------------------------------------------------------
        movl    $_str1, 8(%esp)                       =      movl    $_str1, 8(%esp)
        movl    $LC2, 4(%esp)                                movl    $LC2, 4(%esp)
        movl    $_str2, (%esp)                               movl    $_str2, (%esp)
        call    _sprintf                                     call    _sprintf
        leave                                             leave
    LCFI3:                                         LCFI3:
        ret                                               ret
    LFE7:                                          LFE7:
        .def    _memcpy;    .scl    2;    .type    32;    .endef            .def    _memcpy;    .scl    2;    .type    32;    .endef
        .def    _sprintf;    .scl    2;    .type    32;    .endef           .def    _sprintf;    .scl    2;    .type    32;    .endef
    ------------------------------------------------------------------------
    Then I remembered that I was using an IDE to build my projects, so I went back and saw that it was using the -O2 option, among others. So I redid the -S together with -O2, and sure enough, this time the assembly code is identical except for the embedded filenames. See the full comparison report below. The optimizer seems really smart: it sees a local variable (dlm) that is assigned a value only once within scope, so it just replaces it with a constant, resulting in exactly the same assembly code. This also turns the bad algorithm into a good one. I am so amazed!

    Code:
    Text Compare
    Produced: 2012-07-16 08:21:13
    
    Mode:  All, With Context
    Left file: d:\NetBeansProjects_Win\testUninitializedVar\function2.s
    Right file: d:\NetBeansProjects_Win\testUninitializedVar\function.s
        .file    "function2.cpp"                      <>     .file    "function.cpp"
    ------------------------------------------------------------------------
        .section .rdata,"dr"                       =      .section .rdata,"dr"
    LC0:                                           LC0:
        .ascii "name1\0"                                  .ascii "name1\0"
    LC1:                                           LC1:
        .ascii "%s%c%s\0"                                 .ascii "%s%c%s\0"
        .text                                             .text
        .p2align 4,,15                                    .p2align 4,,15
    .globl __Z6setstrv                             .globl __Z6setstrv
        .def    __Z6setstrv;    .scl    2;    .type    32;    .endef        .def    __Z6setstrv;    .scl    2;    .type    32;    .endef
    __Z6setstrv:                                   __Z6setstrv:
    LFB7:                                          LFB7:
        pushl    %ebp                                        pushl    %ebp
    LCFI0:                                         LCFI0:
        movl    %esp, %ebp                                   movl    %esp, %ebp
    LCFI1:                                         LCFI1:
        subl    $40, %esp                                    subl    $40, %esp
    LCFI2:                                         LCFI2:
        cmpb    $0, _str1                                    cmpb    $0, _str1
        jne    L2                                            jne    L2
        movl    $1986622052, _str1                           movl    $1986622052, _str1
        movl    $795225957, _str1+4                          movl    $795225957, _str1+4
        movl    $829581668, _str1+8                          movl    $829581668, _str1+8
        movb    $0, _str1+12                                 movb    $0, _str1+12
    L2:                                            L2:
        movl    $LC0, 16(%esp)                               movl    $LC0, 16(%esp)
        movl    $47, 12(%esp)                                movl    $47, 12(%esp)
        movl    $_str1, 8(%esp)                              movl    $_str1, 8(%esp)
        movl    $LC1, 4(%esp)                                movl    $LC1, 4(%esp)
        movl    $_str2, (%esp)                               movl    $_str2, (%esp)
        call    _sprintf                                     call    _sprintf
        leave                                             leave
    LCFI3:                                         LCFI3:
        ret                                               ret
    LFE7:                                          LFE7:
        .def    _sprintf;    .scl    2;    .type    32;    .endef           .def    _sprintf;    .scl    2;    .type    32;    .endef
    ------------------------------------------------------------------------
    Last edited by acppdummy; July 16th, 2012 at 07:54 AM.

  14. #14
    Join Date
    Apr 1999
    Posts
    27,449

    Re: The optimizer made the difference!

    Quote Originally Posted by acppdummy
    The optimizer seems really smart: it sees a local variable (dlm) that is assigned a value only once within scope, so it just replaces it with a constant, resulting in exactly the same assembly code. This also turns the bad algorithm into a good one. I am so amazed!
    And what if that uninitialized variable was supposed to be initialized to a different character, but the programmer forgot to do so? Would the optimizer acting this way be smart? Your situation is just luck.

    Please don't rely on uninitialized variables to "auto-correct" themselves. What you have is a plain old bug that needed to be fixed. Looking at assembly language is usually done when a program that is well defined and must behave a certain way does not behave that way; it is rarely, if ever, used to figure out why undefined behaviour is what it is.

    Regards,

    Paul McKenzie

  15. #15
    Join Date
    Nov 2010
    Posts
    105

    Re: Compiler auto correct uninitialized variable? (gcc does and VC++ doesn't)

    Thanks! Point well taken - I already fixed the code when VC++ did not do the "smart optimization" and exposed the bug. The exercise of looking at the assembly code was just to understand why g++ produced the same binary from two apparently quite different algorithms (when I first found the binary did not change after the code fix). Thanks again!
