Testing CSockets
RogerGarrett
August 9th, 2005, 05:12 PM
I've been working with CSockets for several months now and mostly have things working, but am having an odd symptom occasionally while testing.
I'm using the Visual Studio IDE to do the testing. I've written a client program and a server program. I run both of them at the same time under Visual Studio and set the client program to make multiple connections to the server program to retrieve data from the server. It all works pretty well.
Except that occasionally it locks up. I can see that the client program is about to do the CSocket.Connect. The host program then successfully performs an Accept, sets up the necessary CArchives to receive the information (actually an identifier string that tells the host what data the client wants) from the client. The fact that the host has processed the Accept is a clear indicator that the client has completed the Connect, but the client actually never gets back from the call to Connect, so it never gets to the point in the code where it sends the identifier string to the host. So it locks up.
Since I'm running both the client and host on the same machine it means that it's time-slicing the processing of the two programs. It's almost as if, after handling the Accept in the host program, the operating system never gets around again to processing the client program.
Does this make sense? Is it possible that it's a timesharing problem on the machine and that the programs are actually OK?
I don't at present have a second machine so that I could run the host and client on separate machines.
Has anyone encountered such a problem? Is there a solution?
- Roger
NigelQ
August 10th, 2005, 11:46 PM
Roger,
My answer is probably not going to help you, but it should reassure you that "time-slicing", as you called it, between two programs on the same machine is fine. I do this all of the time without noticing anything weird (other than the weirdness I create myself, that is).
Now that I've cleared that up, here's a couple of things I have seen:
You are obviously using the host PC as both the client and server for your communications, however you will notice different things going on depending on how you specify the server's connection address (IP address).
There are three ways the server address can be specified and resolved during connection:
1) Connect to "computer_name". This computer name gets resolved to the actual IP address, then the connection is made at the port you specify.
2) Connect to "127.0.0.1". This is the special 'internal' network address of the machine. Note that you need a valid physical network connection to use this. This is the same as "localhost" (from your HOSTS file, assuming it hasn't been hacked).
3) Connect to "loopback". This is the special address that the network driver uses to simply connect to the same computer without hitting the network. You do NOT need a valid physical network connection to use this (although you probably need a network card installed).
So, if in doubt, try using "loopback" as your server address to remove any possible routing or network issues that may be going on.
Another thing worth considering is the number and rate of connections you are attempting to the server. If it is possible that more than 5 clients are attempting to connect at one time (more than 5 simultaneously attempting new connections), then the listening thread cannot handle this, and connection attempts after the fifth will be refused (try again in a little while in this case).
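The limit of 5 matches the default backlog argument of CAsyncSocket::Listen. The "try again in a little while" advice can be sketched as a small retry loop. This is only an illustration in standard C++: connect_with_retry and try_connect are hypothetical names, and try_connect stands in for whatever the client does to create a socket and call Connect, returning true on success.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Calls try_connect until it returns true or max_attempts is reached,
// doubling the wait between attempts.
// Returns the number of attempts used on success, or 0 if all failed.
// try_connect is a hypothetical callback wrapping the real connect call.
int connect_with_retry(const std::function<bool()>& try_connect,
                       int max_attempts,
                       std::chrono::milliseconds initial_delay) {
    assert(max_attempts > 0);
    std::chrono::milliseconds delay = initial_delay;
    for (int attempt = 1; attempt <= max_attempts; ++attempt) {
        if (try_connect()) {
            return attempt;  // connected
        }
        if (attempt < max_attempts) {
            // Back off so the server's listen queue has time to drain.
            std::this_thread::sleep_for(delay);
            delay *= 2;
        }
    }
    return 0;  // every attempt was refused or failed
}
```

A client might call it as connect_with_retry(tryOnce, 5, std::chrono::milliseconds(100)). Note that the sleep blocks the calling thread, so in a GUI program this would probably belong on a worker thread rather than the thread that runs the message loop.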
Also, the biggest issue I've seen when trying to debug problems like this is software firewalls. Now don't get me wrong, I couldn't sleep without one (or two), but they can certainly mess things up occasionally...
It may be worth unplugging yourself from the Internet and disabling your software firewalls for a while to see if that makes a difference.
Beyond this, I have not seen this specific issue, but on the other hand, I've not had a situation where I couldn't see what was going on with sufficient breakpoints and debug statements...
Again, not that this comment will help, but I've moved away from using CSocket derived communications because of the documented bugs they have, and also when you hit problems like the one you describe, you don't have far to go before you're deep into unknown code.
Hope this helps,
- Nigel
MikeAThon
August 11th, 2005, 07:43 PM
Are your programs GUI programs (i.e., with a message loop) or console programs? CSocket will work well only if there's a message loop running.
Mike
RogerGarrett
August 17th, 2005, 01:36 PM
Mike,
Yes the programs are GUI applications. They run fine for a while, connecting, transferring data, and disconnecting. But at some point it locks up, as described in the original post.
Nigel,
Thank you for the insights. I have since been able to acquire another computer, also connected to the Internet, and run the server app on one and the client app on the other. That seems to alleviate the problem, so there does seem to be something weird going on when they both run on the same computer.
- Roger
NigelQ
August 17th, 2005, 02:06 PM
Roger,
Good to hear you're making progress.
Mike brought up a good and valid point: the CSocket implementations require a message loop to function, which means that all sockets typically share the same parent window message loop (unless you direct them otherwise).
This is very deep inside the CSocket implementation, and can cause lost messages during very busy times (1000+ messages at any one time).
A lot of this has to do with the blocking nature of sockets and the way in which Microsoft implemented CSocket. Its handling of blocking events (such as you describe) does not work every time, resulting in deadlock situations.
It was for these reasons that I previously moved away from using CSocket based communication, using just regular sockets directly. I recall it was painful for me to make the move, because it involved writing a lot of stuff that CSocket already does, but it has paid off time and again (sorry I can't send you my socket wrapper class, but there are some freely available online).
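As a rough idea of what "regular sockets directly" can look like for the hang described in this thread, here is a minimal sketch of a connect that gives up after a timeout instead of blocking forever. It is not my wrapper class and not how CSocket works internally. The names connect_with_timeout, socket_t and kInvalid are made up for the example, it handles IPv4 dotted addresses only, and on Windows it assumes WSAStartup has already been called and that the SDK provides inet_pton.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#ifdef _WIN32
  #include <winsock2.h>
  #include <ws2tcpip.h>
  typedef SOCKET socket_t;
  static const socket_t kInvalid = INVALID_SOCKET;
#else
  #include <arpa/inet.h>
  #include <fcntl.h>
  #include <netinet/in.h>
  #include <sys/select.h>
  #include <sys/socket.h>
  #include <unistd.h>
  #include <cerrno>
  typedef int socket_t;
  static const socket_t kInvalid = -1;
#endif

static void close_socket(socket_t s) {
#ifdef _WIN32
    closesocket(s);
#else
    close(s);
#endif
}

static bool set_nonblocking(socket_t s, bool on) {
#ifdef _WIN32
    u_long mode = on ? 1 : 0;
    return ioctlsocket(s, FIONBIO, &mode) == 0;
#else
    int flags = fcntl(s, F_GETFL, 0);
    if (flags < 0) return false;
    flags = on ? (flags | O_NONBLOCK) : (flags & ~O_NONBLOCK);
    return fcntl(s, F_SETFL, flags) == 0;
#endif
}

// Returns a connected, blocking socket, or kInvalid on failure or timeout.
socket_t connect_with_timeout(const char* ipv4, std::uint16_t port, int timeout_ms) {
    assert(timeout_ms >= 0);
    sockaddr_in addr;
    std::memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    if (inet_pton(AF_INET, ipv4, &addr.sin_addr) != 1) return kInvalid;

    socket_t s = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    if (s == kInvalid) return kInvalid;
    if (!set_nonblocking(s, true)) { close_socket(s); return kInvalid; }

    if (connect(s, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) != 0) {
#ifdef _WIN32
        bool pending = (WSAGetLastError() == WSAEWOULDBLOCK);
#else
        bool pending = (errno == EINPROGRESS);
#endif
        if (!pending) { close_socket(s); return kInvalid; }

        // Wait until the socket is writable (connect finished) or flagged
        // as failed (Winsock reports a failed connect in the except set).
        fd_set wfds, efds;
        FD_ZERO(&wfds); FD_SET(s, &wfds);
        FD_ZERO(&efds); FD_SET(s, &efds);
        timeval tv;
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;
        int ready = select(static_cast<int>(s) + 1, nullptr, &wfds, &efds, &tv);
        if (ready <= 0) { close_socket(s); return kInvalid; }  // timeout or error

        // The connect attempt has completed; SO_ERROR says whether it succeeded.
        int err = 0;
        socklen_t len = sizeof(err);
        if (getsockopt(s, SOL_SOCKET, SO_ERROR,
                       reinterpret_cast<char*>(&err), &len) != 0 || err != 0) {
            close_socket(s);
            return kInvalid;
        }
    }
    set_nonblocking(s, false);  // hand back an ordinary blocking socket
    return s;
}
```

With something like this, a client that would otherwise sit inside Connect forever gets kInvalid back after timeout_ms and can log the failure or retry.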
There was a page describing *some* of the problems associated with the CSocket implementation that I couldn't find when I answered last time, but I have since found it:
http://tangentsoft.net/wskfaq/articles/csocket.html
It makes for an interesting read.
I note that you seem to be developing a client/server type application, so you should really consider the stability of CSocket within such a design.
If you really want to continue using CSocket, there are some Microsoft articles that try to address the various issues, such as this one:
http://support.microsoft.com/kb/q138692/
The fact that you've not seen the problem since switching to two computers is probably due to the reduced stress on the CSocket, but I think this is just masking the real problem, which is still there (just not as obvious now).
Hope this helps,
- Nigel
MikeAThon
August 17th, 2005, 02:22 PM
...I have since been able to acquire another computer, also connected to the Internet, and run the server app on one and the client app on the other. That seems to alleviate the problem, so there does seem to be something weird going on when they both run on the same computer.
In the client/server socket world, that's somewhat unusual. Typically, programmers find that their client/server programs work perfectly when both client and server are on the same machine (and communicate via localhost), and then break when moved to separate machines (communicating via Internet).
I tend to agree with NigelQ's observation that the real problem is still there.
Mike
codeguru.com
Copyright Internet.com Inc., All Rights Reserved.