How can I track down this nasty bug?

  1. #1
    Join Date
    Jul 2011
    Posts
    15

    How can I track down this nasty bug?

    I'll try to make this as succinct as possible.

    The background: I have an MSVC 10 solution with a client project and a server project. The client process runs on one (local) host and the server process runs on several remote hosts. I can access the remote hosts only via Windows Remote Desktop. All the hosts in this system run Windows Server 2003.

    The problem: The client/server solution works fine except that once in a great while, a server process causes the remote host on which it is running to crash. I can't access a crashed host when this happens because Remote Desktop is unresponsive. Despite putting a ludicrous amount of error handling and debugging statements in the server code, I can't identify any specific problems with the server process up to or during the crash. All I know is that it will suddenly and unpredictably cause the host to completely freeze up. There are no hardware or software errors related to the crash reported in the Windows Event Viewer after I restart the host. It happens randomly on different hosts and none of these have hard drive, networking, or RAM errors as far as tests show.

    The question: What can I do in this situation? This is the worst bug I've ever encountered because I'm completely in the dark as to what's happening during these crashes and there doesn't seem to be any way to get more information about what's going on. Does anyone have any ideas for what I can do? Thanks very much for any help~

  2. #2
    Join Date
    Apr 1999
    Posts
    27,446

    Re: How can I track down this nasty bug?

    If your application truly crashes (you get the "your application has crashed -- do you want to send to Microsoft..." dialog), then you should investigate producing a crash dump for your program.

    There are many articles on how to produce crash dumps. Look for "minidump", "minidumper", dbghelp.dll, etc. When the application crashes, you will get a crash dump file that you load into Visual Studio and then "run" to figure out what happened. If you build the code with debugging information (a PDB file), then the dump file will stop at the line in the source where the error occurred (or approximately the line, if you've optimized the code).
    Quote: "Despite putting a ludicrous amount of error handling and debugging statements in the server code, I can't identify any specific problems with the server process up to or during the crash."
    That's what a crash dump file is supposed to spare you from -- hunting and pecking for where a crash occurs.
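A minimal sketch of the kind of "minidumper" routine those articles describe, assuming the Windows dbghelp API (SetUnhandledExceptionFilter and MiniDumpWriteDump). The helper names make_dump_name and install_crash_handler and the "server" file prefix are made up for illustration and are not from the thread; the Windows-specific part is guarded so the file still compiles elsewhere.

```cpp
// Sketch only: make_dump_name, write_dump and install_crash_handler are
// hypothetical helper names, not part of any library.
#include <sstream>
#include <string>

#ifdef _WIN32
#include <windows.h>
#include <dbghelp.h>
#pragma comment(lib, "dbghelp.lib")  // MSVC: link against dbghelp
#endif

// Builds a dump file name such as "server_1234.dmp".
std::string make_dump_name(const std::string& app, unsigned long pid)
{
    std::ostringstream name;
    name << app << '_' << pid << ".dmp";
    return name.str();
}

#ifdef _WIN32
// Called by Windows when an exception is not handled anywhere else.
// Note: allocating memory here is risky if the heap is already corrupt;
// a production handler would use a preallocated buffer instead.
static LONG WINAPI write_dump(EXCEPTION_POINTERS* info)
{
    const std::string path = make_dump_name("server", GetCurrentProcessId());
    HANDLE file = CreateFileA(path.c_str(), GENERIC_WRITE, 0, NULL,
                              CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
    if (file != INVALID_HANDLE_VALUE) {
        MINIDUMP_EXCEPTION_INFORMATION mei;
        mei.ThreadId = GetCurrentThreadId();
        mei.ExceptionPointers = info;
        mei.ClientPointers = FALSE;
        MiniDumpWriteDump(GetCurrentProcess(), GetCurrentProcessId(), file,
                          MiniDumpNormal, &mei, NULL, NULL);
        CloseHandle(file);
    }
    return EXCEPTION_EXECUTE_HANDLER;  // let the process terminate afterwards
}
#endif

// Call once at the top of main(). Does nothing on non-Windows builds.
void install_crash_handler()
{
#ifdef _WIN32
    SetUnhandledExceptionFilter(write_dump);
#endif
}
```

Calling install_crash_handler() as the first statement of main() should leave a .dmp file in the working directory when the process dies from an unhandled exception. It cannot help if the whole host freezes before the handler gets a chance to run.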

    Regards,

    Paul McKenzie

  3. #3
    Join Date
    Jul 2011
    Posts
    15

    Re: How can I track down this nasty bug?

    OK, thanks Paul. I don't think it's actually crashing in that sense, because wouldn't Dr. Watson or one of those dialog boxes asking me to debug appear on restart? Assuming it's not crashing like that, what else do you think I could do? It would be great to just wrap the whole server in a sandbox that monitors what it's doing. Would that mean using some sort of advanced Windows techniques [http://www.amazon.com/dp/0321374460/] to write my own debugging version of the program?

  4. #4
    Join Date
    Apr 1999
    Posts
    27,446

    Re: How can I track down this nasty bug?

    Quote: Originally posted by andrew732
    OK, thanks Paul. I don't think it's actually crashing in that sense, because wouldn't Dr. Watson or one of those dialog boxes asking me to debug appear on restart?
    Well, you can test by writing a very small release version of an application, run it, and see what happens:
    Code:
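    int main()
    {
       // volatile keeps the optimizer from removing the faulting write
       volatile char *p = 0;
       *p = 'x';   // write through a null pointer: access violation
       return 0;
    }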
    If you compile this in release (not debug) mode, and run it on the machine that you're running your current program, what happens?

    Whatever that is, that is what you get when a program crashes on that machine. Dr. Watson is not guaranteed to show up: various registry settings may have been changed, or another debugger may have taken control (for example, if you install Visual Studio), etc.

    I have seen programs just disappear into thin air due to a crash, and nothing showing up except a terminated application.
    Quote: "Assuming it's not crashing like that"
    There is only one way to crash, and that is what I've described. A crash means only one thing, and that is an application terminating abnormally.

    If it isn't crashing, then it's either in an infinite loop, or some sort of "waiting" state, or the program exited normally, but with some sort of error code or error condition being returned. These are not crashes, so you need to clarify if your program is indeed crashing or the program is still running but in some sort of loop, or the program did terminate gracefully but with an error condition you didn't expect.
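One hedged way to tell these cases apart after the fact is to look at the exit code of the process. On Windows, a process killed by an unhandled exception normally exits with the exception code, so the null-pointer test program above would be expected to report 0xC0000005 (shown as -1073741819 by `echo %ERRORLEVEL%`). The helper name describe_exit_code below is made up for illustration, and the check is a heuristic, since a program can also pass any value it likes to exit().

```cpp
#include <string>

// Hypothetical helper: gives a rough description of a Windows exit code.
std::string describe_exit_code(unsigned long code)
{
    if (code == 0UL)
        return "normal exit";
    if (code == 0xC0000005UL)  // STATUS_ACCESS_VIOLATION
        return "crash: access violation";
    // NTSTATUS values with the top two bits set have "error" severity,
    // which is what unhandled exceptions usually leave behind.
    if ((code & 0xC0000000UL) == 0xC0000000UL)
        return "crash: unhandled exception";
    return "exited with an error code";
}
```

A process that is stuck in a loop or a wait has no exit code at all yet, which is itself a useful signal that it did not crash.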

    Regards,

    Paul McKenzie
    Last edited by Paul McKenzie; October 8th, 2011 at 03:23 PM.

  5. #5
    Join Date
    Jul 2011
    Posts
    15

    Re: How can I track down this nasty bug?

    OK, thanks again Paul. I tried your crash program and it doesn't appear to do anything at all when run on the servers. It just opens the black dos window for a fraction of a second and then closes. There's no Dr. Watson, no "Sorry for the inconvenience ...", or anything else. That's weird because I've seen Dr. Watson run on the servers when other programs crash, and when I run your crash program on my local machine it causes the "Sorry for the inconvenience ..." dialog to come up. I guess crashing is much more complicated than I realized.

    Is there any way to run my server program in debug mode on these remote machines even if they don't have MSVC installed? Would it do any good even if I could? I'm still trying to figure out a way to get even a small hint about what's going on when this bug causes these remote machines to freeze.

  6. #6
    Join Date
    Apr 1999
    Posts
    27,446

    Re: How can I track down this nasty bug?

    Quote: Originally posted by andrew732
    OK, thanks again Paul. I tried your crash program and it doesn't appear to do anything at all when run on the servers. It just opens the black dos window for a fraction of a second and then closes. There's no Dr. Watson, no "Sorry for the inconvenience ...", or anything else. That's weird because I've seen Dr. Watson run on the servers when other programs crash, and when I run your crash program on my local machine it causes the "Sorry for the inconvenience ..." dialog to come up. I guess crashing is much more complicated than I realized.
    I suspected as much. On the machine where you see just the black console window flash, I would assume the compiler is not installed.

    So that machine is basically "debuggerless". Just to let you know, if you did have a crash dump set up, you would see that your crash dump routine would replace the window flash, and on your other machine, it would replace the "Sorry for the inconvenience" dialog. That's the beauty of the crash dump. It doesn't matter if the machine has Dr. Watson or not. If you do a Google search on "minidumper", "crash dump", etc., you should come across some source code that allows you to intercept the crash and run your own routine, including creating a crash dump file.
    Quote: "Is there any way to run my server program in debug mode on these remote machines even if they don't have MSVC installed?"
    Yes, it's called "remote debugging".

    You need these things to invoke this:

    1) A TCP/IP connection to the server machine from your build/development machine (an IP address is OK, as is any host name that resolves to the server).

    2) The server must run the MSVSMON.EXE program. This is the debugging agent between the server and your development machine.

    3) Your application, built with debugging info (you can build release versions with debugging info -- in the project properties, set the compiler's Debug Information Format to /Zi and turn on the linker's Generate Debug Info option, /DEBUG).

    Then all you do is install the app on the server, and then on your build machine, connect to the server machine in Visual Studio via the debugging project option. You go in the Debugging options for your project and set up remote debugging (IP address of the machine, how the program is run on the machine, whether you want to attach to a running process, etc.). The server machine need not have Visual Studio installed, or have access to your source code -- all of that stays on your build machine. To get your feet wet, you set a breakpoint in some function you know will get called, and voila, you'll see that breakpoint is hit if you've got everything set up right.
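When attaching to a running process, it can help to have the server pause at startup until the debugger is connected, so nothing interesting happens before you are watching. A minimal sketch, assuming the Win32 calls IsDebuggerPresent and Sleep; the helper name wait_for_debugger and the 100 ms polling step are made up for illustration.

```cpp
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: polls until a debugger is attached to this process.
// Returns true once a debugger is attached, false if timeout_ms elapses
// first (and always false on non-Windows builds).
bool wait_for_debugger(unsigned long timeout_ms)
{
#ifdef _WIN32
    const unsigned long step_ms = 100;
    unsigned long waited = 0;
    while (!IsDebuggerPresent()) {
        if (waited >= timeout_ms)
            return false;
        Sleep(step_ms);
        waited += step_ms;
    }
    return true;
#else
    (void)timeout_ms;  // unused outside Windows
    return false;
#endif
}
```

For example, calling wait_for_debugger(60000) at the top of main() would give you up to a minute to attach from Visual Studio before the server carries on without a debugger.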

    The above is too much for me to get into here, and it is documented in your Visual C++ manual, but yes, this is how you debug apps on a remote machine from your development machine. When you first get it working, you wonder how you could have gone on for so long without knowing about this. You're actually running the application on the machine that has the issue, and not simulating running it on your build machine or some other machine.

    Regards,

    Paul McKenzie
    Last edited by Paul McKenzie; October 11th, 2011 at 09:52 PM.

  7. #7
    Join Date
    Jul 2011
    Posts
    15

    Re: How can I track down this nasty bug?

    Great, thanks a lot Paul. Crash dumps and the remote debugger give me something to work with now, and hopefully I can track this down.

  8. #8
    Join Date
    Apr 1999
    Posts
    27,446

    Re: How can I track down this nasty bug?

    Quote: Originally posted by andrew732
    Great, thanks a lot Paul. Crash dumps and the remote debugger give me something to work with now, and hopefully I can track this down.
    No problem.

    I would try the remote debugging first. You would run the server app from your machine and just let it crash -- the debugger will pop up on your machine when the crash on the server happens. The crash dump won't be necessary to solve your immediate problem (it's good if you come across further problems later on down the road).

    Regards,

    Paul McKenzie
