-
January 16th, 2012, 11:21 PM
#1
[RESOLVED] Winsock fails after 1 hour
Hi forum,
We have a client and server application that connect via Winsock.
The Setup:
My server is running on a Windows 2003 machine and my client is running on a Windows XP machine.
Problem:
After exactly 1 hour, my client closes with the socket error param code = 10053.
My server also errors out at the following line:
result = WaitForMultipleObjects(object_count,wait_objects,FALSE,210000);
UNFORTUNATELY: I do not have direct access to the machine, so I can't debug it.
The above information is from our trace file.
NOTE: This is failing in only one setup. We have thousands of people using our product across hundreds of these setups, including Windows 2003, 2008, XP, Windows 7, etc.
This problem is only seen with one particular customer, and he is able to reproduce it 100% of the time.
A search for error 10053 reveals many causes, so can someone elaborate on why this is happening to only one customer?
Thanks in anticipation
N.B:
1. The event log doesn't reveal anything suspicious.
2. When the customer installs the client on another Windows 7 system, with the server running on the same Win 2003 system, this problem goes away.
Last edited by softmessager; February 14th, 2012 at 11:18 PM.
-
January 16th, 2012, 11:36 PM
#2
Re: Winsock fails after 1 hour
Originally Posted by softmessager
So can someone elaborate on why this is happening to only one customer?
It's happening because your program has bugs.
Unless we have your program and source code, there is no way anyone can answer your question with anything but guessing.
1) If you're using multiple threads, a synchronization issue?
2) Are you checking for all return codes for all functions that return error codes, instead of assuming that functions are successful?
I have seen programmers call API or third-party library functions with the false confidence that they always work. Then they get that one computer where the functions fail, and the code erroneously takes the "successful" path of execution, causing all sorts of issues. Only after changing the code to test the return value is it then realized that the function(s) have failed.
3) A general, good old-fashioned bug due to errors in programming. Things like uninitialized variables, memory overwrites, etc.
And as far as this happening to only one customer, that is the nature of C++ programming -- if your program has any bugs, it doesn't matter if the program has worked on thousands of computers or for many years. It is that one or two customers who unfortunately will see the bug manifest itself.
Regards,
Paul McKenzie
-
January 17th, 2012, 01:36 AM
#3
Re: Winsock fails after 1 hour
Thanks for the reply.
Yes, I am checking the return codes, and the program closes gracefully.
There is no crash; the program just closes on both sides.
The client side reports socket error param code = 10053.
On the server side, the WaitForMultipleObjects call fails with a socket error as well.
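For reference, WaitForMultipleObjects itself only returns WAIT_* values, so a trace line that records exactly which one came back can help narrow down where the socket error is actually raised. The following is a minimal sketch under stated assumptions: describe_wait_result is a hypothetical helper (not from the original program), and the fallback constants exist only so the snippet compiles off Windows. Note that the 210000 in the call above is a timeout in milliseconds, i.e. 3.5 minutes.

```cpp
#include <string>

#ifdef _WIN32
#include <windows.h>
#else
// Fallback definitions so the sketch compiles off Windows.
// The values mirror the ones in the Windows SDK headers.
typedef unsigned int DWORD;
static const DWORD WAIT_OBJECT_0    = 0x00000000U;
static const DWORD WAIT_ABANDONED_0 = 0x00000080U;
static const DWORD WAIT_TIMEOUT     = 0x00000102U;
static const DWORD WAIT_FAILED      = 0xFFFFFFFFU;
#endif

// Hypothetical helper: turns the return value of
// WaitForMultipleObjects(count, handles, FALSE, timeout) into a trace line.
// last_error is what GetLastError() returned right after the call.
std::string describe_wait_result(DWORD result, DWORD object_count, DWORD last_error) {
    if (result == WAIT_FAILED)
        return "WAIT_FAILED, GetLastError=" + std::to_string(last_error);
    if (result == WAIT_TIMEOUT)
        return "WAIT_TIMEOUT";
    // With bWaitAll == FALSE the index of the signaled handle is encoded in the result.
    if (result < WAIT_OBJECT_0 + object_count)
        return "signaled, index=" + std::to_string(result - WAIT_OBJECT_0);
    if (result >= WAIT_ABANDONED_0 && result < WAIT_ABANDONED_0 + object_count)
        return "abandoned mutex, index=" + std::to_string(result - WAIT_ABANDONED_0);
    return "unexpected value " + std::to_string(result);
}
```

If the signaled handle turns out to be a socket event, the socket error code would then have to be fetched separately through the Winsock API; the wait call does not carry it.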
-
January 17th, 2012, 02:02 AM
#4
Re: Winsock fails after 1 hour
Originally Posted by softmessager
...
Problem:
After exactly 1 hour, my client closes with the socket error param code = 10053
My server too errors out at the following line
result = WaitForMultipleObjects(object_count,wait_objects,FALSE,210000);
Well, this "code snippet" doesn't help us to understand which (exactly!) line of your code causes the error 10053.
And FYI, the MSDN description of this error is:
WSAECONNABORTED
10053
Software caused connection abort.
An established connection was aborted by the software in your host computer, possibly due to a data transmission time-out or protocol error.
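Since a bare number such as 10053 is easy to misread in a trace, a small lookup can print the symbolic name next to it. This is only a sketch: winsock_error_name is a hypothetical helper, and the table covers just a handful of connection-related codes from the Winsock error list, not all of them.

```cpp
#include <string>

// Hypothetical helper: maps a few connection-related Winsock error codes
// to their symbolic names for use in trace output.
std::string winsock_error_name(int code) {
    switch (code) {
        case 10050: return "WSAENETDOWN";
        case 10051: return "WSAENETUNREACH";
        case 10052: return "WSAENETRESET";
        case 10053: return "WSAECONNABORTED";
        case 10054: return "WSAECONNRESET";
        case 10060: return "WSAETIMEDOUT";
        default:    return "unknown (" + std::to_string(code) + ")";
    }
}
```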
So at this place I agree with Paul:
Originally Posted by Paul McKenzie
Unless we have your program and source code, there is no way anyone can answer your question with anything but guessing
Victor Nijegorodov
-
January 17th, 2012, 03:30 AM
#5
Re: Winsock fails after 1 hour
Thanks for all the replies ...
I will be posting the code.
Meanwhile, what are the external reasons that a Windows socket can fail?
I ask because the timeout for this error is exactly 1 hour.
Steps:
1. Start the communication between the client and server
2. Wait for 1 hour
3. Start communication again
4. The socket errors out; client and server close
However, if there is some communication between the client and server within the 1 hour, the program runs fine.
This sequence doesn't fail:
1. Start the communication
2. Wait for 55 ~ 58 minutes
3. Communicate - communication is successful
This sequence doesn't fail either:
1. Start the communication
2. Wait for 55 ~ 58 minutes
3. Communicate - communication is successful
4. Wait for another hour
5. Communication doesn't fail.
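The sequences above suggest the failure depends on how long the connection has been idle, so it may help to have the trace record the idle time whenever a socket error is reported. A minimal sketch, assuming a hypothetical IdleTracker class (nothing like it is shown in the original program) that is told about every send and receive:

```cpp
#include <chrono>
#include <string>

// Hypothetical helper: remembers when the socket last carried traffic so the
// trace can report how long the connection was idle when an error occurred.
class IdleTracker {
public:
    using Clock = std::chrono::steady_clock;

    explicit IdleTracker(Clock::time_point start) : last_activity_(start) {}

    // Call after every successful send or receive.
    void on_activity(Clock::time_point now) { last_activity_ = now; }

    // Whole seconds elapsed since the last recorded activity.
    std::chrono::seconds idle_for(Clock::time_point now) const {
        return std::chrono::duration_cast<std::chrono::seconds>(now - last_activity_);
    }

    // Builds a trace line for a socket error code such as 10053.
    std::string trace_line(int socket_error, Clock::time_point now) const {
        return "socket error " + std::to_string(socket_error) + " after " +
               std::to_string(idle_for(now).count()) + " s idle";
    }

private:
    Clock::time_point last_activity_;
};
```

If the recorded idle time is always very close to 3600 seconds at the failing site, that would point toward an idle timeout somewhere on the path rather than a random fault.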
Thanks in anticipation
Regards
-
January 17th, 2012, 04:31 AM
#6
Re: Winsock fails after 1 hour
Originally Posted by softmessager
Thanks for all the replies ...
I will be posting the code.
Meanwhile, what are the external reasons that a Windows socket can fail?
I ask because the timeout for this error is exactly 1 hour.
Shouldn't your software have much more diagnostics than just a trace file, given the fact you are dealing with sockets and communication? Maybe the computer is running an app that closes the socket, maybe their TCP/IP settings have a role in this, etc.
The point being that you should have had most, if not all of these bases covered. Where is the information on the TCP/IP settings? Do you collect that information? Maybe the computer puts the network card to sleep after 1 hour and never wakes up (for whatever reason).
I could go on about it, but if you're going to distribute a program to the public that relies on TCP/IP, that program can't just be an ordinary application with just a trace file to help you out.
Regards,
Paul McKenzie
-
January 19th, 2012, 02:14 AM
#7
Re: Winsock fails after 1 hour
Thanks Paul,
Can you tell me what you mean by more information than the trace?
Information about TCP/IP settings, etc.
What is this information, and how can I collect it?
Or do you mean the settings that I use when opening a socket?
Can you please elaborate a bit on what you just wrote?
You said: "if you're going to distribute a program to the public that relies on TCP/IP, that program can't just be an ordinary application with just a trace file to help you out."
Can you please elaborate on the above?
Other than the server-side trace, we have a client trace that gives us information about the packets received and sent.
Thanks
-
January 19th, 2012, 04:55 AM
#8
Re: Winsock fails after 1 hour
Originally Posted by softmessager
"if you're going to distribute a program to the public that relies on TCP/IP, that program can't just be an ordinary application with just a trace file to help you out."
Can you please elaborate on the above?
Well, your situation makes the point clear, doesn't it? A trace file is not enough information for you to solve the problem.
Again, what if it's a setting on their network card that's causing this or a third-party software causing the issue? What if the computer's power options or some other software puts the NIC to sleep after 1 hour, and the NIC doesn't wake up for some reason (maybe the LAN driver is buggy)?
In other words, you need more information on the environment -- network card, version and model, settings, any third-party software running (virus checkers, Internet security suite programs such as Norton, Avast), etc.
I'm no great expert in socket programming (my expertise is in scanners and image acquisition devices), but this is no different than any other type of programming that relies on hardware that can have multitudes of settings where you don't know what the customer may have enabled/disabled/tweaked etc. You need as much information as you can regarding the hardware and software that is being accessed.
Regards,
Paul McKenzie
-
January 19th, 2012, 05:18 PM
#9
Re: Winsock fails after 1 hour
A shot in the dark: Is your program using setsockopt() with SO_KEEPALIVE? The timing of the keep-alive packet is a registry entry, and is normally set to two hours. If you see this behavior at only one installation, then perhaps the registry entry has been overridden and set to only one hour.
Search for KeepAliveTime on this page, to find the precise registry entry: http://support.microsoft.com/kb/120642/EN-US
Of course, this is not the answer unless you see identical behavior from other installations, but at the two hour mark instead of the one hour mark.
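To check whether the program uses SO_KEEPALIVE at all, the option can be read back and written to the trace. A minimal sketch, assuming hypothetical helpers set_keepalive and get_keepalive; the typedefs are only there so the same code compiles against Winsock and against BSD sockets, and on Windows the caller is assumed to have called WSAStartup already:

```cpp
#ifdef _WIN32
#include <winsock2.h>
typedef SOCKET socket_t;
typedef int optlen_t;
#else
#include <sys/socket.h>
typedef int socket_t;
typedef socklen_t optlen_t;
#endif

// Hypothetical helper: turns SO_KEEPALIVE on or off; returns true on success.
bool set_keepalive(socket_t s, bool enable) {
    int flag = enable ? 1 : 0;
    return setsockopt(s, SOL_SOCKET, SO_KEEPALIVE,
                      reinterpret_cast<const char*>(&flag), sizeof(flag)) == 0;
}

// Hypothetical helper: reads SO_KEEPALIVE back; returns false if the query fails.
bool get_keepalive(socket_t s, bool& enabled) {
    int flag = 0;
    optlen_t len = sizeof(flag);
    if (getsockopt(s, SOL_SOCKET, SO_KEEPALIVE,
                   reinterpret_cast<char*>(&flag), &len) != 0)
        return false;
    enabled = (flag != 0);
    return true;
}
```

The option only switches keep-alive probes on or off for the socket; the interval itself still comes from the system-wide setting described above unless the program overrides it per connection.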
Mike
-
January 19th, 2012, 08:35 PM
#10
Re: Winsock fails after 1 hour
Originally Posted by softmessager
NOTE: This is failing in only one setup. We have thousands of people using our product across hundreds of these setups, including Windows 2003, 2008, XP, Windows 7, etc.
This problem is only seen with one particular customer, and he is able to reproduce it 100% of the time.
If Mike's explanation solves your issue, then the quote from you above doesn't sound too promising.
A product that has been running on thousands of machines and on different OSes, and no one at your end who maintains the socket communication knew anything about this setting? This is where your software needs to write to the trace file all of the settings. You can start here, and add to the trace all of these settings you see here:
http://msdn.microsoft.com/en-us/libr...=vs.85%29.aspx
Even if you don't understand those settings, write the values of those settings to your traces. Then when you get the trace, you see exactly what you're dealing with. Yes, there could be many standard settings, but if you really do have a product that you distribute to customers, it is your responsibility to have this information.
This is exactly how my company's software deals with scanners -- in our trace file, we record practically everything about the device our software is communicating with. My software calls a series of functions when running in "diagnostic mode" that retrieves all of the information about the devices we're communicating with. Otherwise, we have a deficient product and would not be ready when that odd scanner used by some far-flung customer doesn't work correctly with our software.
Regards,
Paul McKenzie
Last edited by Paul McKenzie; January 20th, 2012 at 04:21 AM.
-
January 20th, 2012, 07:53 PM
#11
Re: Winsock fails after 1 hour
Interesting. It never occurred to me to write registry values out to a trace file. But it is a really good idea, and sound advice too.
Mike
-
January 20th, 2012, 08:56 PM
#12
Re: Winsock fails after 1 hour
Originally Posted by MikeAThon
Interesting. It never occurred to me to write registry values out to a trace file. But it is a really good idea, and sound advice too.
Especially if there are literally hundreds of different manufacturers, models, drivers, etc. you're communicating with, whether it be NICs, scanners, etc. The only way to get any kind of control when debugging issues such as the one the OP has encountered is to gather as much information as possible about hardware, settings, drivers, etc.
With our software, we write the version, manufacturer, etc. of the scanner driver being used, and any capabilities of the device we're dealing with. We even have simple diagnostic programs that must be run before running our software to determine if there will be any compatibility problems, and if there are, work them out from there.
Regards,
Paul McKenzie
Last edited by Paul McKenzie; January 20th, 2012 at 08:58 PM.
-
January 22nd, 2012, 10:20 PM
#13
Re: Winsock fails after 1 hour
Hi Paul and Mike
We are grateful for the responses.
I am posting the Winsock settings that I have acquired so far.
trace file created:
Type = 1 Buff = 4096 Prot = 4 Timeout = 20 Retries = 1
Successful socket initialization. WinSock version 1.1
Description: 'WinSock 2.0'
WINSOCK STATUS
System Status: 'Running'
The IP address 192.39.0.104 will be used for the host name 'CPNTA'
WinSock send and receive buffer sizes: 8192/8192
Socket connected successfully!
SCRIPT WRITE DATA
<00>
STARTING READ LOOKING FOR:
SOCKET WRITEABLE
Data Socket is Writeable!
SCRIPT READ DATA
SCRIPT WRITE DATA
Some Internal stuff calling exe Process etc
STARTING READ LOOKING FOR
SCRIPT READ DATA
<03>05048 05048
SCRIPT WRITE DATA
XXXX XXXXXXX <00>
STARTING READ LOOKING FOR:
>>>
SCRIPT READ DATA
SCRIPT WRITE DATA
XXXXX ,,<00>
SCRIPT COMPLETED
Process_Command: Script Processing completed<00>
..... after the 1 hour, some communication happens back and forth ...
The last message from my client to the server is:
printer has just been enabled
ATTN_KEEPALIVE (keepalive sent)
After this there is no response from the server, and after 22 seconds the client trace says:
WINSOCK
Server closed data socket! WSAGETSELECTERROR msg param code = 10053
WINSOCK
Client socket closed!
Pardon my ignorance.
By additional information, you mean I should use the getsockopt functions to get those options and write them to my trace file, am I right?
One more question: how can I get other settings, such as whether the network card has gone to sleep, been suspended, etc.?
Adding trace file information, hmm... I will get this included in our next build. Thanks, Paul, for that.
Thanks for the registry information as well.
I will ask them to export and send me the registry settings at
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services from both his Windows 2003 server and his Windows XP machine.
Another interesting thing is that once he upgraded to Windows 7, this problem went away.
Usually it's the other way round for some of our customers LOL
To enumerate my present course of action, just in case it helps anyone who is in the same situation as I am:
What I can do right now to reach a resolution is create a simple exe that gathers specific network information,
e.g. as shown at
http://tangentsoft.net/wskfaq/examples/getifaces.html
... plus any information that I get in replies to this thread.
Then ask the customer to run the program before and after the issue occurs.
Thanks in anticipation
-
January 22nd, 2012, 11:23 PM
#14
Re: Winsock fails after 1 hour
Sorry for the repost that I made here.
There was some problem with the browser.
Last edited by softmessager; January 22nd, 2012 at 11:24 PM.
Reason: Reposted the same article. Deleted the same.
-
January 23rd, 2012, 02:22 AM
#15
Re: Winsock fails after 1 hour
Originally Posted by softmessager
By additional information, you mean I should use the getsockopt functions to get those options and write them to my trace file, am I right?
Yes. Even options you think may not give you additional information in solving this particular problem, put them in the traces.
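A minimal sketch of what such a trace dump could look like, assuming a hypothetical dump_socket_options helper and a small, arbitrary selection of socket-level options (a real diagnostic build would likely log many more). The typedefs are only there so the code compiles against both Winsock and BSD sockets:

```cpp
#include <sstream>
#include <string>

#ifdef _WIN32
#include <winsock2.h>
typedef SOCKET socket_t;
typedef int optlen_t;
#else
#include <sys/socket.h>
typedef int socket_t;
typedef socklen_t optlen_t;
#endif

// Appends "NAME=value " for one integer-valued SOL_SOCKET option,
// or "NAME=<error> " if getsockopt fails for it.
static void append_int_option(std::ostringstream& out, socket_t s,
                              int option, const char* name) {
    int value = 0;
    optlen_t len = sizeof(value);
    out << name << '=';
    if (getsockopt(s, SOL_SOCKET, option,
                   reinterpret_cast<char*>(&value), &len) == 0)
        out << value;
    else
        out << "<error>";
    out << ' ';
}

// Hypothetical helper: builds one trace line with a few socket-level options.
std::string dump_socket_options(socket_t s) {
    std::ostringstream out;
    append_int_option(out, s, SO_TYPE, "SO_TYPE");
    append_int_option(out, s, SO_KEEPALIVE, "SO_KEEPALIVE");
    append_int_option(out, s, SO_RCVBUF, "SO_RCVBUF");
    append_int_option(out, s, SO_SNDBUF, "SO_SNDBUF");
    append_int_option(out, s, SO_ERROR, "SO_ERROR");
    return out.str();
}
```

Writing this line once right after connecting and once more when an error is reported would show whether anything changed in between.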
One more question: how can I get other settings, such as whether the network card has gone to sleep, been suspended, etc.?
Not sure, not a Windows NIC expert. Someone should be able to help you, as the OS is able to get this information.
Regards,
Paul McKenzie