[RESOLVED] Winsock fails after 1 hour

Thread: [RESOLVED] Winsock fails after 1 hour

  1. #1
    Join Date
    Apr 2005
    Posts
    125

    [RESOLVED] Winsock fails after 1 hour

    Hi forum,

    We have a client and a server application that connect via Winsock.

    The Setup:
    My server is running on a Windows 2003 machine and my client is running on a Windows XP machine.

    Problem:
    After exactly 1 hour, my client closes with the socket error param code = 10053
    My server too errors out at the following line
    result = WaitForMultipleObjects(object_count,wait_objects,FALSE,210000);

    UNFORTUNATELY: I do not have direct access to the machine, so I can't debug it.
    The information above comes from our trace file.

    NOTE: This is failing in only one setup. We have thousands of people using our product in hundreds of these setups, including Windows 2003, 2008, XP, Windows 7, etc.
    This problem is seen with only one particular customer, and he is able to reproduce it 100% of the time.

    A search for error 10053 reveals many causes, so can someone explain why this is happening to only one customer?

    Thanks in anticipation

    N.B:
    1. The event log doesn't reveal anything suspicious.
    2. When the customer installs the client on another system running Windows 7, with the server still running on the same Windows 2003 system, this problem goes away.
    Last edited by softmessager; February 14th, 2012 at 10:18 PM.

  2. #2
    Join Date
    Apr 1999
    Posts
    27,427

    Re: Winsock fails after 1 hour

    Quote Originally Posted by softmessager
    So can someone explain why this is happening to only one customer?
    It's happening because your program has bugs.

    Unless we have your program and source code, there is no way anyone can answer your question with anything but guessing.

    1) If you're using multiple threads, a synchronization issue?

    2) Are you checking for all return codes for all functions that return error codes, instead of assuming that functions are successful?

    I have seen programmers call API or third-party library functions with the false confidence that they always work. Then they get that one computer where the functions fail, and the code erroneously takes the "successful" path of execution, causing all sorts of issues. Only after changing the code to test the return value is it then realized that the function(s) have failed.

    3) A general, good old-fashioned bug due to errors in programming. Things like uninitialized variables, memory overwrites, etc.

    And as far as this happening to only one customer, that is the nature of C++ programming -- if your program has any bugs, it doesn't matter if the program has worked on thousands of computers or for many years. It is that one or two customers who unfortunately will see the bug manifest itself.

    Regards,

    Paul McKenzie

  3. #3
    Join Date
    Apr 2005
    Posts
    125

    Re: Winsock fails after 1 hour

    Thanks for the reply.
    Yes, I am checking the return codes, and the program closes gracefully.

    There is no crash; the program just closes on both sides.
    The client side reports socket error param code = 10053.
    The server side fails at WaitForMultipleObjects, with a socket error as well.
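
A note on the server side: WaitForMultipleObjects only reports which handle was signaled, that the wait timed out, or that the wait itself failed. The Winsock error code has to be fetched separately once the socket's event is signaled (for example with WSAEnumNetworkEvents, if the server uses WSAEventSelect, which this thread does not confirm). As a minimal sketch, the helper below turns the return value into a line for the trace file. The name describe_wait_result and the k-prefixed constants are made up for this example; their values mirror the Win32 constants WAIT_OBJECT_0, WAIT_ABANDONED_0, WAIT_TIMEOUT and WAIT_FAILED so that the snippet compiles without windows.h.

```cpp
#include <cstdint>
#include <string>

// Hypothetical stand-ins whose values mirror WAIT_OBJECT_0, WAIT_ABANDONED_0,
// WAIT_TIMEOUT and WAIT_FAILED from the Win32 headers.
constexpr std::uint32_t kWaitObject0    = 0x00000000u;
constexpr std::uint32_t kWaitAbandoned0 = 0x00000080u;
constexpr std::uint32_t kWaitTimeout    = 0x00000102u;
constexpr std::uint32_t kWaitFailed     = 0xFFFFFFFFu;

// Turns a WaitForMultipleObjects return value into text for a trace file.
// object_count is the same count that was passed to the wait call.
std::string describe_wait_result(std::uint32_t result, std::uint32_t object_count) {
    if (result == kWaitFailed) {
        return "WAIT_FAILED: the wait itself failed, log GetLastError()";
    }
    if (result == kWaitTimeout) {
        return "WAIT_TIMEOUT: no handle was signaled before the timeout";
    }
    if (result - kWaitObject0 < object_count) {
        return "handle " + std::to_string(result - kWaitObject0) + " signaled";
    }
    if (result >= kWaitAbandoned0 && result - kWaitAbandoned0 < object_count) {
        return "handle " + std::to_string(result - kWaitAbandoned0) +
               " is an abandoned mutex";
    }
    return "unexpected return value " + std::to_string(result);
}
```

With the call quoted in the first post, a WAIT_TIMEOUT result would mean that 210000 ms (3.5 minutes) passed without any of the wait_objects being signaled, which is a different situation from a socket error and worth logging separately.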

  4. #4
    Join Date
    Jan 2003
    Location
    Wallisellen (ZH), Switzerland
    Posts
    17,360

    Re: Winsock fails after 1 hour

    Quote Originally Posted by softmessager
    ...
    Problem:
    After exactly 1 hour, my client closes with the socket error param code = 10053
    My server too errors out at the following line
    result = WaitForMultipleObjects(object_count,wait_objects,FALSE,210000);
    Well, this "code snippet" doesn't help us to understand which (exactly!) line of your code causes the error 10053.

    And FYI, the MSDN description of this error is:
    WSAECONNABORTED
    10053
    Software caused connection abort.
    An established connection was aborted by the software in your host computer, possibly due to a data transmission time-out or protocol error.
    So on this point I agree with Paul:
    Quote Originally Posted by Paul McKenzie
    Unless we have your program and source code, there is no way anyone can answer your question with anything but guessing
    Victor Nijegorodov
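
Since a bare number such as 10053 is easy to misread in a trace file, it can help to log the symbolic name next to it. The following is only a sketch: winsock_error_name is a hypothetical helper, and it covers just a handful of the documented Winsock codes that tend to appear when an established connection dies.

```cpp
#include <string>

// Maps a few documented Winsock error codes to their symbolic names.
// winsock_error_name is a made-up helper name for this example.
std::string winsock_error_name(int code) {
    switch (code) {
        case 10052: return "WSAENETRESET";     // network dropped connection on reset
        case 10053: return "WSAECONNABORTED";  // software caused connection abort
        case 10054: return "WSAECONNRESET";    // connection reset by peer
        case 10060: return "WSAETIMEDOUT";     // connection timed out
        default:    return "unlisted Winsock error " + std::to_string(code);
    }
}
```

A trace line could then read "msg param code = 10053 (WSAECONNABORTED)", which points straight at the MSDN entry quoted above.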

  5. #5
    Join Date
    Apr 2005
    Posts
    125

    Re: Winsock fails after 1 hour

    Thanks for all the replies ...

    I will be posting the code.

    Meanwhile, what are the external reasons that a Windows socket can fail?
    -- because the timeout for this error is exactly 1 hour...
    Steps:
    1. Start the communication between the client and server
    2. Wait for 1 hour
    3. Start communication again
    4. Socket errors; the client and server close
    However, if there is some communication between the client and server within the 1 hour, then the program runs fine.

    THIS SEQUENCE doesn't fail
    1. Start the communication
    2. Wait for 55 ~ 58 minutes
    3. Communicate - communication is successful

    THIS SEQUENCE doesn't fail either
    1. Start the communication
    2. Wait for 55 ~ 58 minutes
    3. Communicate - communication is successful
    4. Wait for another hour
    5. Communication doesn't fail.

    Thanks in anticipation
    Regards
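
The sequences above suggest that any traffic inside the hour resets whatever timer is involved, although the thread does not establish what that timer is. Under that assumption, one way to test the theory is to have the application send a small message of its own whenever the connection has been idle for some interval well below one hour. The sketch below shows only the bookkeeping for that decision. The class name IdleHeartbeat and the 30-minute interval in the usage note are made up for illustration and are not taken from the poster's code.

```cpp
#include <chrono>

// Remembers when traffic was last seen on a connection and reports when the
// connection has been idle long enough that a heartbeat message is due.
class IdleHeartbeat {
public:
    using Clock = std::chrono::steady_clock;

    IdleHeartbeat(std::chrono::seconds interval, Clock::time_point now)
        : interval_(interval), last_activity_(now) {}

    // Call whenever data is sent or received on the connection.
    void on_traffic(Clock::time_point now) { last_activity_ = now; }

    // True once the idle time has reached the configured interval.
    bool heartbeat_due(Clock::time_point now) const {
        return now - last_activity_ >= interval_;
    }

private:
    std::chrono::seconds interval_;
    Clock::time_point last_activity_;
};
```

A client loop could construct IdleHeartbeat with std::chrono::minutes(30), call on_traffic after every send or receive, and send its heartbeat message whenever heartbeat_due returns true. If the failure disappears with such a heartbeat in place, an idle timeout somewhere between the two machines becomes the more likely explanation.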

  6. #6
    Join Date
    Apr 1999
    Posts
    27,427

    Re: Winsock fails after 1 hour

    Quote Originally Posted by softmessager
    Thanks for all the replies ...

    I will be posting the code.

    Meanwhile, what are the external reasons that a Windows socket can fail?
    -- because the timeout for this error is exactly 1 hour...
    Shouldn't your software have much more diagnostics than just a trace file, given the fact you are dealing with sockets and communication? Maybe the computer is running an app that closes the socket, maybe their TCP/IP settings have a role in this, etc.

    The point being that you should have had most, if not all of these bases covered. Where is the information on the TCP/IP settings? Do you collect that information? Maybe the computer puts the network card to sleep after 1 hour and never wakes up (for whatever reason).

    I could go on about it, but if you're going to distribute a program to the public that relies on TCP/IP, that program can't just be an ordinary application with just a trace file to help you out.

    Regards,

    Paul McKenzie

  7. #7
    Join Date
    Apr 2005
    Posts
    125

    Re: Winsock fails after 1 hour

    Thanks Paul,

    Can you tell me what you mean by more information other than a trace?

    You mention information about TCP/IP settings, etc.
    What is this information, and how can I collect it?
    Or do you mean the settings that I use when opening a socket?

    Can you please elaborate a bit on what you said?
    You said: "if you're going to distribute a program to the public that relies on TCP/IP, that program can't just be an ordinary application with just a trace file to help you out."

    Can you please elaborate on the above?

    Other than the server-side trace, we have a client trace that gives us information about the packets received and sent.

    Thanks

  8. #8
    Join Date
    Apr 1999
    Posts
    27,427

    Re: Winsock fails after 1 hour

    Quote Originally Posted by softmessager
    "if you're going to distribute a program to the public that relies on TCP/IP, that program can't just be an ordinary application with just a trace file to help you out."

    Can you please elaborate on the above?
    Well, your situation makes the point clear, doesn't it? A trace file is not enough information for you to solve the problem.

    Again, what if it's a setting on their network card that's causing this or a third-party software causing the issue? What if the computer's power options or some other software puts the NIC to sleep after 1 hour, and the NIC doesn't wake up for some reason (maybe the LAN driver is buggy)?

    In other words, you need more information on the environment -- network card, version and model, settings, any third-party software running (virus checkers, Internet security suite programs such as Norton, Avast), etc.

    I'm no great expert in socket programming (my expertise is in scanners and image acquisition devices), but this is no different than any other type of programming that relies on hardware that can have multitudes of settings where you don't know what the customer may have enabled/disabled/tweaked etc. You need as much information as you can regarding the hardware and software that is being accessed.

    Regards,

    Paul McKenzie

  9. #9
    Join Date
    Nov 2002
    Location
    California
    Posts
    4,553

    Re: Winsock fails after 1 hour

    A shot in the dark: Is your program using setsockopt() with SO_KEEPALIVE? The timing of the keep-alive packet is controlled by a registry entry, which is normally set to two hours. If you see this behavior at only one installation, then perhaps the registry entry has been overridden and set to only one hour.

    Search for KeepAliveTime on this page, to find the precise registry entry: http://support.microsoft.com/kb/120642/EN-US

    Of course, this is not the answer unless you see identical behavior from other installations, but at the two hour mark instead of the one hour mark.

    Mike
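
For readers unfamiliar with the option Mike mentions, here is a minimal sketch of turning SO_KEEPALIVE on and reading it back. The helper names enable_keepalive and keepalive_state and the socket_t/optlen_t aliases are made up for this example. The POSIX branch is there only so the snippet also builds off Windows; on Windows the code assumes WSAStartup has already succeeded and that the program links against Ws2_32.lib.

```cpp
#ifdef _WIN32
  #include <winsock2.h>
  using socket_t = SOCKET;
  using optlen_t = int;
#else
  #include <sys/socket.h>
  #include <unistd.h>
  using socket_t = int;
  using optlen_t = socklen_t;
#endif

// Turns TCP keep-alive probing on for a socket. Returns true on success.
bool enable_keepalive(socket_t s) {
    int on = 1;
    return setsockopt(s, SOL_SOCKET, SO_KEEPALIVE,
                      reinterpret_cast<const char*>(&on), sizeof(on)) == 0;
}

// Reads SO_KEEPALIVE back: 1 if enabled, 0 if disabled, -1 if the query failed.
int keepalive_state(socket_t s) {
    int value = 0;
    optlen_t len = sizeof(value);
    if (getsockopt(s, SOL_SOCKET, SO_KEEPALIVE,
                   reinterpret_cast<char*>(&value), &len) != 0) {
        return -1;
    }
    return value != 0 ? 1 : 0;
}
```

Writing keepalive_state for both the client and the server socket into the trace would show whether the system-wide keep-alive timing can affect the connection at all.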

  10. #10
    Join Date
    Apr 1999
    Posts
    27,427

    Re: Winsock fails after 1 hour

    Quote Originally Posted by softmessager
    NOTE: This is failing in only one setup. We have thousands of people using our product in hundreds of these setups, including Windows 2003, 2008, XP, Windows 7, etc.
    This problem is seen with only one particular customer, and he is able to reproduce it 100% of the time.
    If Mike's explanation solves your issue, then the quote from you above doesn't sound too promising.

    A product that has been running on thousands of machines and on different OSes, and no one at your end who maintains the socket communication knew anything about this setting? This is where your software needs to write to the trace file all of the settings. You can start here, and add to the trace all of these settings you see here:

    http://msdn.microsoft.com/en-us/libr...=vs.85%29.aspx

    Even if you don't understand those settings, write the values of those settings to your traces. Then when you get the trace, you see exactly what you're dealing with. Yes, there could be many standard settings, but if you really do have a product that you distribute to customers, it is your responsibility to have this information.

    This is exactly how my company's software deals with scanners -- in our trace file, we record practically everything about the device our software is communicating with. My software calls a series of functions when running in "diagnostic mode" that retrieves all of the information about the devices we're communicating with. Otherwise, we have a deficient product and would not be ready when that odd scanner used by some far-flung customer doesn't work correctly with our software.

    Regards,

    Paul McKenzie
    Last edited by Paul McKenzie; January 20th, 2012 at 03:21 AM.

  11. #11
    Join Date
    Nov 2002
    Location
    California
    Posts
    4,553

    Re: Winsock fails after 1 hour

    Interesting. It never occurred to me to write registry values out to a trace file. But it is a really good idea, and sound advice too.

    Mike

  12. #12
    Join Date
    Apr 1999
    Posts
    27,427

    Re: Winsock fails after 1 hour

    Quote Originally Posted by MikeAThon
    Interesting. It never occurred to me to write registry values out to a trace file. But it is a really good idea, and sound advice too.
    Especially if there are literally hundreds of different manufacturers, models, drivers, etc. you're communicating with, whether it be NICs, scanners, etc. The only way to get any kind of control over debugging issues such as the one the OP has encountered is to gather as much information as possible about hardware, settings, drivers, etc.

    With our software, we write the version, manufacturer, etc. of the scanner driver being used, and any capabilities of the device we're dealing with. We even have simple diagnostic programs that must be run before running our software to determine if there will be any compatibility problems, and if there are, work them out from there.

    Regards,

    Paul McKenzie
    Last edited by Paul McKenzie; January 20th, 2012 at 07:58 PM.

  13. #13
    Join Date
    Apr 2005
    Posts
    125

    Re: Winsock fails after 1 hour

    Hi Paul and Mike

    We are grateful for the responses.

    I am posting the existing Winsock settings that I have acquired.
    trace file created:
    Type = 1 Buff = 4096 Prot = 4 Timeout = 20 Retries = 1
    Successful socket initialization. WinSock version 1.1
    Description: 'WinSock 2.0'
    WINSOCK STATUS
    System Status: 'Running'
    The IP address 192.39.0.104 will be used for the host name 'CPNTA'
    WinSock send and receive buffer sizes: 8192/8192
    Socket connected successfully!
    SCRIPT WRITE DATA
    <00>
    STARTING READ LOOKING FOR:
    SOCKET WRITEABLE
    Data Socket is Writeable!
    SCRIPT READ DATA
    SCRIPT WRITE DATA
    Some Internal stuff calling exe Process etc
    STARTING READ LOOKING FOR

    SCRIPT READ DATA
    <03>05048 05048
    SCRIPT WRITE DATA
    XXXX XXXXXXX <00>
    STARTING READ LOOKING FOR:
    >>>
    SCRIPT READ DATA
    SCRIPT WRITE DATA
    XXXXX ,,<00>
    SCRIPT COMPLETED
    Process_Command: Script Processing completed<00>

    ..... after the 1 hour, some communication happens back and forth ...

    The last message from my client to the server is:
    printer has just been enabled
    ATTN_KEEPALIVE (keepalive sent)

    After that there is no response from the server, and after 22 seconds the client trace says:
    WINSOCK
    Server closed data socket! WSAGETSELECTERROR msg param code = 10053
    WINSOCK
    Client socket closed!

    Pardon my ignorance.
    By additional information, you mean I should use the getsockopt function to get those options and write them to my trace file, am I right?

    One more question: how can I get other settings, such as whether the network card has gone to sleep, been suspended, etc.?

    Adding this information to the trace file, hmm... I WILL GET THIS INCLUDED in our next build. Thanks Paul for that.

    Thanks for the registry information as well.
    I will ask them to export and send me the registry settings at
    HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services from both his Windows 2003 server and his Windows XP machine.
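
When the export arrives, the value to look for is KeepAliveTime under the Tcpip\Parameters subkey of the Services key. It is stored in milliseconds, and when the value is absent the documented default is 7,200,000 (two hours). As a small sketch for reading such a number at a glance, the helper below converts it to hours, minutes and seconds and flags anything other than the default. The names describe_keepalive_time and kDefaultKeepAliveMs are made up for this example.

```cpp
#include <cstdint>
#include <string>

// Documented default for the KeepAliveTime registry value, in milliseconds.
constexpr std::uint32_t kDefaultKeepAliveMs = 7200000u;  // two hours

// Formats a KeepAliveTime value (milliseconds) as hours, minutes and seconds,
// and notes when it differs from the default.
std::string describe_keepalive_time(std::uint32_t ms) {
    const std::uint32_t total_seconds = ms / 1000u;
    const std::uint32_t hours   = total_seconds / 3600u;
    const std::uint32_t minutes = (total_seconds % 3600u) / 60u;
    const std::uint32_t seconds = total_seconds % 60u;
    std::string text = std::to_string(hours) + "h " +
                       std::to_string(minutes) + "m " +
                       std::to_string(seconds) + "s";
    if (ms != kDefaultKeepAliveMs) {
        text += " (differs from the 2h default)";
    }
    return text;
}
```

A value of 3600000 would come out as "1h 0m 0s (differs from the 2h default)", which is the overridden setting suggested earlier in this thread as a possible cause.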

    Another interesting thing is that once he upgraded to Windows 7, this problem went away.
    Usually it's the other way round for some of our customers LOL


    To lay out my present course of action, just in case it helps anyone in the same situation:
    What I can do right now to work toward a resolution is create a simple exe that gathers specific network information,
    e.g. as shown at
    http://tangentsoft.net/wskfaq/examples/getifaces.html
    ... plus any information that I get in replies to this thread.

    Then ask the customer to run the program before and after the issue occurs.

    Thanks in anticipation

  14. #14
    Join Date
    Apr 2005
    Posts
    125

    Re: Winsock fails after 1 hour

    Sorry for the repost that I made here.
    There was some problem with the browser.
    Last edited by softmessager; January 22nd, 2012 at 10:24 PM. Reason: Reposted the same article. Deleted the same.

  15. #15
    Join Date
    Apr 1999
    Posts
    27,427

    Re: Winsock fails after 1 hour

    Quote Originally Posted by softmessager
    By additional information, you mean I should use the getsockopt function to get those options and write them to my trace file, am I right?
    Yes. Even options you think may not give you additional information in solving this particular problem, put them in the traces.
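
A minimal sketch of what such a dump could look like follows. It queries a handful of integer-valued SOL_SOCKET options with getsockopt and returns them as text that can be appended to the trace. The names socket_options_report and NamedOption and the socket_t/optlen_t aliases are made up for this example, and the option list is an arbitrary selection rather than a complete one. The POSIX branch exists only so the snippet also builds off Windows; on Windows it assumes WSAStartup has already succeeded and that the program links against Ws2_32.lib.

```cpp
#include <sstream>
#include <string>
#ifdef _WIN32
  #include <winsock2.h>
  using socket_t = SOCKET;
  using optlen_t = int;
#else
  #include <sys/socket.h>
  #include <unistd.h>
  using socket_t = int;
  using optlen_t = socklen_t;
#endif

// One integer-valued SOL_SOCKET option together with its printable name.
struct NamedOption {
    const char* name;
    int option;
};

// Queries several socket options and returns one "NAME = value" line each.
std::string socket_options_report(socket_t s) {
    static const NamedOption options[] = {
        {"SO_TYPE", SO_TYPE},           {"SO_KEEPALIVE", SO_KEEPALIVE},
        {"SO_RCVBUF", SO_RCVBUF},       {"SO_SNDBUF", SO_SNDBUF},
        {"SO_REUSEADDR", SO_REUSEADDR}, {"SO_ERROR", SO_ERROR},
    };
    std::ostringstream report;
    for (const NamedOption& opt : options) {
        int value = 0;
        optlen_t len = sizeof(value);
        if (getsockopt(s, SOL_SOCKET, opt.option,
                       reinterpret_cast<char*>(&value), &len) == 0) {
            report << opt.name << " = " << value << '\n';
        } else {
            // Record the failure instead of silently skipping the option.
            report << opt.name << " = <query failed>\n";
        }
    }
    return report.str();
}
```

Calling this right after the connection is established, and again when an error is reported, gives two snapshots to compare. Options that are not plain integers, such as SO_LINGER, would need their own handling.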
    Quote Originally Posted by softmessager
    One more question: how can I get other settings, such as whether the network card has gone to sleep, been suspended, etc.?
    Not sure, not a Windows NIC expert. Someone should be able to help you, as the OS is able to get this information.

    Regards,

    Paul McKenzie
