My program is a TCP client that manages many socket connections concurrently,
one socket per thread.
I use blocking mode most of the time, except when creating a new connection to a server.
For that step I first switch the socket to non-blocking mode, then use select() to wait for the connection to complete before switching back to blocking mode.
I don't want to waste time on an unpredictable blocking timeout, and I want to be sure the connection is established within a specific time.
But here is the problem.
When the number of unreachable servers increases (more than 5), connections to the remaining, reachable servers start to fail as well.
select() returns 0 (meaning timeout) even though the server is actually reachable.
Here is the part of my program that creates the connection:
Open(LPCTSTR addr, int port, DWORD dwTimeout)
{
_socket = 0;
SOCKET theSocket;
int nRet;
// Store information about the server
LPHOSTENT lpHostEntry;
lpHostEntry = gethostbyname(addr); // Specifying server by its name
if (lpHostEntry == NULL) {
return;
}
// Create the socket
theSocket = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
if (theSocket == INVALID_SOCKET) {
return;
}
// Set socket to non-blocking mode
unsigned long ul = 1;
nRet = ioctlsocket(theSocket, FIONBIO, (unsigned long*)&ul);
if (nRet == SOCKET_ERROR) {
closesocket(theSocket);
return;
}
// Use SOCKADDR_IN to fill in address information
SOCKADDR_IN saServer;
saServer.sin_family = AF_INET;
saServer.sin_addr = *((LPIN_ADDR)*lpHostEntry->h_addr_list);
// ^ Address of the server being inserted into the address field
saServer.sin_port = htons(port);
// Connect to the server
nRet = connect(theSocket,
(LPSOCKADDR)&saServer, // Server address
sizeof(struct sockaddr)); // Length of address structure
if (nRet == SOCKET_ERROR && WSAGetLastError() != WSAEWOULDBLOCK) {
closesocket(theSocket);
return;
}
// Set socket to blocking mode
ul= 0;
nRet = ioctlsocket(theSocket, FIONBIO, (unsigned long*)&ul);
if (nRet == SOCKET_ERROR){
closesocket(theSocket);
return;
}
_socket = theSocket;
}
Is there anything wrong with this code?
Or should I simply not use select() in a multithreaded program?
Best regards,
Luder
hoxsiew
January 27th, 2010, 07:30 AM
Please use code tags. It's difficult to follow, but I'm curious why you are both setting the socket to non-blocking and also using select. The socket is supposed to block, at least for the duration of your timeout in the select call.
luderjane
January 27th, 2010, 09:27 PM
I'm sorry for the poor readability. Here is the code again, this time with code tags:
void Open(LPCTSTR addr, int port, DWORD dwTimeout)
{
    _socket = 0;
    SOCKET theSocket;
    int nRet;

    // Look up the server by name
    LPHOSTENT lpHostEntry = gethostbyname(addr);
    if (lpHostEntry == NULL) {
        return;
    }

    // Create the socket
    theSocket = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    if (theSocket == INVALID_SOCKET) {
        return;
    }

    // Set socket to non-blocking mode
    unsigned long ul = 1;
    nRet = ioctlsocket(theSocket, FIONBIO, &ul);
    if (nRet == SOCKET_ERROR) {
        closesocket(theSocket);
        return;
    }

    // Use SOCKADDR_IN to fill in address information
    SOCKADDR_IN saServer;
    saServer.sin_family = AF_INET;
    saServer.sin_addr = *((LPIN_ADDR)*lpHostEntry->h_addr_list); // First address of the resolved host
    saServer.sin_port = htons(port);

    // Connect to the server; on a non-blocking socket this normally
    // returns SOCKET_ERROR with WSAEWOULDBLOCK
    nRet = connect(theSocket,
                   (LPSOCKADDR)&saServer,  // Server address
                   sizeof(saServer));      // Length of address structure
    if (nRet == SOCKET_ERROR && WSAGetLastError() != WSAEWOULDBLOCK) {
        closesocket(theSocket);
        return;
    }

    // Set socket back to blocking mode
    ul = 0;
    nRet = ioctlsocket(theSocket, FIONBIO, &ul);
    if (nRet == SOCKET_ERROR) {
        closesocket(theSocket);
        return;
    }

    _socket = theSocket;
}
According to MSDN: http://msdn.microsoft.com/en-us/library/ms737625(VS.85).aspx
"With a nonblocking socket, the connection attempt cannot be completed immediately. In this case, connect will return SOCKET_ERROR, and WSAGetLastError will return WSAEWOULDBLOCK."
Then I use select() to determine when the connection has completed (scenario 1).
This way I can control the timeout.
What confuses me is why the unreachable connection attempts affect the other, reachable connections.
Is there a limit on the number of unreachable sockets that can be waiting in select() concurrently across threads?
I can't believe the limit is 5 or 6...
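The pattern described above (non-blocking connect, then select() with a timeout, then back to blocking mode) could be sketched roughly as follows. This is only an illustration, not the code from the program under discussion: connect_with_timeout, set_nonblocking, sock_t and optlen_t are made-up names, the #ifdef branches exist only so the same sketch also builds on POSIX systems, and on Windows it assumes WSAStartup() has already been called. Winsock reports a failed connection attempt through the except set while most POSIX systems report it as writable, so the sketch watches both sets and then reads SO_ERROR.

```cpp
#include <cassert>
#ifdef _WIN32
#include <winsock2.h>
typedef SOCKET sock_t;
typedef int optlen_t;
#else
#include <cerrno>
#include <fcntl.h>
#include <netinet/in.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>
typedef int sock_t;
typedef socklen_t optlen_t;
#endif

// Hypothetical helper: switch a socket between blocking and non-blocking mode.
static bool set_nonblocking(sock_t s, bool on) {
#ifdef _WIN32
    u_long mode = on ? 1 : 0;
    return ioctlsocket(s, FIONBIO, &mode) == 0;
#else
    int flags = fcntl(s, F_GETFL, 0);
    if (flags < 0) return false;
    flags = on ? (flags | O_NONBLOCK) : (flags & ~O_NONBLOCK);
    return fcntl(s, F_SETFL, flags) == 0;
#endif
}

// True if the last connect() failed only because it is still in progress.
static bool connect_in_progress() {
#ifdef _WIN32
    return WSAGetLastError() == WSAEWOULDBLOCK;
#else
    return errno == EINPROGRESS;
#endif
}

// Hypothetical helper: returns true if the connection completed within
// timeout_ms. On success the socket is back in blocking mode.
bool connect_with_timeout(sock_t s, const sockaddr_in& addr, long timeout_ms) {
    assert(timeout_ms >= 0);
    if (!set_nonblocking(s, true)) return false;

    int rc = connect(s, reinterpret_cast<const sockaddr*>(&addr),
                     static_cast<int>(sizeof(addr)));
    if (rc != 0) {
        if (!connect_in_progress()) return false;  // immediate hard failure

        fd_set writefds, exceptfds;
        FD_ZERO(&writefds);
        FD_SET(s, &writefds);    // becomes writable when the connect succeeds
        FD_ZERO(&exceptfds);
        FD_SET(s, &exceptfds);   // Winsock signals a failed connect here

        timeval tv;
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;

        // The first argument is ignored by Winsock; POSIX needs highest fd + 1.
        int ready = select(static_cast<int>(s) + 1, NULL, &writefds, &exceptfds, &tv);
        if (ready <= 0) return false;  // 0 means timeout, negative means error

        // Ask the socket whether the attempt actually succeeded.
        int err = 0;
        optlen_t len = sizeof(err);
        if (getsockopt(s, SOL_SOCKET, SO_ERROR,
                       reinterpret_cast<char*>(&err), &len) != 0 || err != 0) {
            return false;
        }
    }
    return set_nonblocking(s, false);
}
```

With a helper like this, each thread would call connect_with_timeout(theSocket, saServer, dwTimeout) and close the socket itself when the function returns false.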
Richard.J
January 28th, 2010, 03:02 PM
It seems as if you are calling select() with just a single socket. If you do so, maybe the multiple calls to select() are screwing up your system? I would not expect so, but who knows?
hoxsiew
January 28th, 2010, 04:04 PM
Sorry, I thought your snippet was for a read routine (code tags sure help). Connect is different and it appears you are doing it correctly, so I'm not sure where your problem is coming from. Perhaps it is in some other part of the code. Where is _socket coming from? Could it be a synchronization issue, or possibly a race condition where different threads are trying to access _socket?
luderjane
February 1st, 2010, 01:05 AM
_socket is a SOCKET member of the object that controls the connection between my program and a TCP server.
There is only one _socket per thread,
and each thread accesses only its own members.
So there is no synchronization issue with _socket.
luderjane
February 2nd, 2010, 01:12 AM
I wrote a simple testbed to rule out other factors in my program (built with VC2005; it must link against Ws2_32.lib):
    //
    // Use SOCKADDR_IN to fill in address information
    //
    SOCKADDR_IN saServer;
    saServer.sin_family = AF_INET;
    saServer.sin_addr = *((LPIN_ADDR)*lpHostEntry->h_addr_list); // First address of the resolved host
    saServer.sin_port = htons(pServer->iPort);
    //
    // Connect to the server and measure how long the call takes
    //
    DWORD dwStartTime = GetTickCount();
    nRet = connect(theSocket,
                   (LPSOCKADDR)&saServer,  // Server address
                   sizeof(saServer));      // Length of address structure
    DWORD dwElapsed = GetTickCount() - dwStartTime;
    if (pServer->bPrintable)
    {
        // DWORD is an unsigned long, so print it with %lu
        if (nRet == SOCKET_ERROR)
        {
            printf("%s:%d blocking open failed takes %lu ms\n", pServer->sIP, pServer->iPort,
                   dwElapsed);
        }
        else
        {
            printf("%s:%d blocking open successful takes %lu ms\n", pServer->sIP, pServer->iPort,
                   dwElapsed);
        }
    }
    closesocket(theSocket);
}
void NonBlockingOpen(pServerInfo pServer)
{
    SOCKET theSocket;
    int nRet;
    //
    // Look up the server by name
    //
    LPHOSTENT lpHostEntry = gethostbyname(pServer->sIP);
    if (lpHostEntry == NULL) {
        return;
    }
    //
    // Create the socket (same call as in Open above)
    //
    theSocket = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    if (theSocket == INVALID_SOCKET) {
        return;
    }
    //
    // Set socket to non-blocking mode
    //
    unsigned long ul = 1;
    nRet = ioctlsocket(theSocket, FIONBIO, &ul);
    if (nRet == SOCKET_ERROR) {
        closesocket(theSocket);
        return;
    }
    //
    // Use SOCKADDR_IN to fill in address information
    //
    SOCKADDR_IN saServer;
    saServer.sin_family = AF_INET;
    saServer.sin_addr = *((LPIN_ADDR)*lpHostEntry->h_addr_list); // First address of the resolved host
    saServer.sin_port = htons(pServer->iPort);
    //
    // Connect to the server
    //
    nRet = connect(theSocket,
                   (LPSOCKADDR)&saServer,  // Server address
                   sizeof(saServer));      // Length of address structure
This testbed simply creates multiple threads, each of which opens a socket repeatedly.
Only one server is reachable; the rest are unreachable.
When I use non-blocking mode (timeout = 5000) with 2 unreachable servers, the open procedure takes 0 ms most of the time.
If I gradually increase the number of unreachable servers and keep the same timeout, the open procedure takes more and more time.
In the end (at about 16 unreachable servers), the open procedure fails every time (except for the first 5 attempts).
Even in blocking mode (timeout = 0), the time needed to open a connection increases with the number of unreachable servers.
There seems to be a limit on socket connections in a multithreaded program.
But I cannot find any official documentation about it...
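The driver part of such a testbed (one thread per server, each timing its own open attempt) is not shown in the post. A minimal sketch of that structure, under stated assumptions, could look like the following. The names OpenResult, timed_open and run_testbed are invented for illustration, the open routine is passed in as a callback so that either a blocking or a non-blocking variant can be plugged in, and std::thread and std::chrono are used instead of the Win32 calls available in VC2005.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <string>
#include <thread>
#include <vector>

// Hypothetical record of one open attempt.
struct OpenResult {
    std::string host;
    bool ok;
    long long elapsed_ms;
};

typedef std::function<bool(const std::string&)> OpenFn;

// Run one open attempt and measure how long it takes.
OpenResult timed_open(const std::string& host, const OpenFn& open_fn) {
    auto start = std::chrono::steady_clock::now();
    bool ok = open_fn(host);
    auto elapsed = std::chrono::duration_cast<std::chrono::milliseconds>(
        std::chrono::steady_clock::now() - start);
    return OpenResult{host, ok, elapsed.count()};
}

// Start one thread per host; each thread writes only to its own result slot,
// so no locking is needed.
std::vector<OpenResult> run_testbed(const std::vector<std::string>& hosts,
                                    const OpenFn& open_fn) {
    assert(open_fn && "an open routine must be supplied");
    std::vector<OpenResult> results(hosts.size());
    std::vector<std::thread> workers;
    for (std::size_t i = 0; i < hosts.size(); ++i) {
        workers.emplace_back([&results, &hosts, &open_fn, i] {
            results[i] = timed_open(hosts[i], open_fn);
        });
    }
    for (std::thread& w : workers) {
        w.join();
    }
    return results;
}
```

Comparing elapsed_ms for the reachable host while the number of unreachable hosts in the list grows would reproduce the measurement described above.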
hoxsiew
February 2nd, 2010, 07:54 AM
Well, I'm at a loss. I ran your routine with every variation I could think of and it works fine. I modified it slightly and used my local network for the unreachable part (10.x.x.x in my case) and used 209.191.93.53 (one of Yahoo's web servers) for the reachable host, and it all goes well. Maybe I'm missing something. Maybe your "reachable" host is doing something weird.
Richard.J
February 2nd, 2010, 08:02 AM
Could it be a problem of a router that is in the network? Maybe that is blocking requests because it tries to resolve the addresses of the unreachable servers?
hoxsiew
February 2nd, 2010, 08:24 AM
I tried a scenario similar to that (with bogus addresses that could not be reached) and it still handled them fine.
luderjane
February 2nd, 2010, 09:59 PM
At first I was confused to see that the testbed ran fine for you.
Then it occurred to me that I forgot to mention that I tested it on WinXP Pro SP3.
So I tried the same scenario on Vista Home Premium SP1 and Win7 Pro.
Everything works fine!
I think it may be a bug in WinXP that was fixed in Vista.
But if your OS is WinXP, please let me know.
hoxsiew
February 3rd, 2010, 07:51 AM
I'm on Windows 7 at the moment, but when I get a chance, I'll test it on WinXP (SP3).
codeguru.com