-
August 13th, 2012, 10:38 AM
#1
Simple proxy server in C# - Need help with relay logic
Hi. I'm trying to write a simple, minimalist HTTP proxy server that can run from the command line. In the Start() method, a TcpListener blocks until it gets a client connection and then creates a new thread (the ThreadHandleClient method) that processes this client, fetches its URL and relays data.
The trouble is with the relay logic. In the proxy client (the browser), I type a URL (www.google.com), which opens fine. Then I perform a keyword search, which also works. However, when I click on a search result, I see the google.com page again (though the browser's address bar still shows the result page's URL). I think this is due to my relay logic: somehow the client socket is still being served by the old request instead of the new one. Can you help me find out what is wrong with it?
Thanks in advance. This is my code:
Code:
public void Start(IPAddress ip, int port)
{
    TcpListener listener = new TcpListener(ip, port);
    listener.Start(100);
    while (!stopFlag)
    {
        Socket client = listener.AcceptSocket();
        IPEndPoint rep = (IPEndPoint)client.RemoteEndPoint;
        clients.Add(rep.Address.ToString());
        Thread th = new Thread(ThreadHandleClient);
        th.Start(client);
    }
    listener.Stop();
}
public void ThreadHandleClient(object o)
{
    try
    {
        Socket client = (Socket)o;
        NetworkStream ns = new NetworkStream(client);
        //RECEIVE CLIENT DATA
        byte[] buffer = new byte[2048];
        int rec = 0, sent = 0, transferred = 0, rport = 0;
        string data = "";
        do
        {
            rec = ns.Read(buffer, 0, buffer.Length);
            data += Encoding.ASCII.GetString(buffer, 0, rec);
        } while (rec == buffer.Length);
        //PARSE DESTINATION AND SEND REQUEST
        string line = data.Replace("\r\n", "\n").Split(new string[] { "\n" }, StringSplitOptions.None)[0];
        Uri uri = new Uri(line.Split(new string[] { " " }, StringSplitOptions.None)[1]);
        if (uri.Scheme == "https")
        {
            rport = 443;
            //rq = HttpVersion + " 200 Connection established\r\nProxy-Agent: Prahlad`s Proxy Server\r\n\r\n";
            //ClientSocket.BeginSend(Encoding.ASCII.GetBytes(rq), 0, rq.Length, SocketFlags.None, new AsyncCallback(this.OnOkSent), ClientSocket);
        }
        else
        {
            rport = 80;
        }
        IPHostEntry rh = Dns.GetHostEntry(uri.Host);
        Socket webserver = new Socket(rh.AddressList[0].AddressFamily, SocketType.Stream, ProtocolType.IP);
        webserver.Connect(new IPEndPoint(rh.AddressList[0], rport));
        byte[] databytes = Encoding.ASCII.GetBytes(data);
        webserver.Send(databytes, databytes.Length, SocketFlags.None);
        //START RELAY
        buffer = new byte[2048];
        rec = 0;
        data = "";
        do
        {
            transferred = 0;
            do
            {
                rec = webserver.Receive(buffer, buffer.Length, SocketFlags.None);
                sent = client.Send(buffer, rec, SocketFlags.None);
                transferred += rec;
                //data += Encoding.ASCII.GetString(serverbytes, 0, rec);
            } while (rec == buffer.Length);
            if (transferred == 0)
                break;
            transferred = 0;
            do
            {
                rec = client.Receive(buffer, buffer.Length, SocketFlags.None);
                sent = webserver.Send(buffer, sent, SocketFlags.None);
                transferred += rec;
            } while (rec == buffer.Length);
        } while (transferred > 0);
    }
    catch (Exception ex)
    {
        System.Diagnostics.Debug.Print("Error occured: " + ex.Message);
    }
}
-
August 13th, 2012, 01:29 PM
#2
Re: Simple proxy server in C# - Need help with relay logic
Check out the Socket.LingerState property.
-
August 13th, 2012, 02:06 PM
#3
Re: Simple proxy server in C# - Need help with relay logic
Originally Posted by Arjay
Check out the Socket.LingerState property.
I saw the MSDN article, which says the LingerState property lets a socket delay closing in an attempt to send all pending data. In what way would this be useful in my situation?
-
August 16th, 2012, 09:04 PM
#4
Re: Simple proxy server in C# - Need help with relay logic
In your thread, you check for the URL of the target webserver only once, at the beginning of receipt of data from the client. This matches your observation that the initial URL (like google) opens fine. Eventually, however, the client is going to ask for a different target webserver, which is what happens when you click on a link from the google results page. Since you are not checking for a new URL, your proxy continues to communicate with the initial target and not with the newly requested one.
To solve the issue, you must understand how the HTTP protocol works, and in particular you must understand persistent connections in HTTP/1.1. At some point your proxy must parse the actual HTTP requests in order to understand what is being requested, and to recognize when a different URL is being requested. In response to a request for a different URL, the proxy must close the connection with the existing webserver and open a new connection to the newly requested one.
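As a rough illustration of that parsing step, here is a minimal sketch that pulls the target host and port out of the first line of each request. The class and method names (ProxyTarget, TryParse) are made up for this example and are not part of any library or of the code posted above. It assumes the browser sends an absolute URL for plain HTTP ("GET http://host/path HTTP/1.1") and a host:port pair for CONNECT, which is how browsers normally ask a proxy to tunnel HTTPS.

```csharp
using System;

public static class ProxyTarget
{
    // Hypothetical helper: extracts host and port from a request line such as
    //   "GET http://www.google.com/search?q=x HTTP/1.1"
    //   "CONNECT www.google.com:443 HTTP/1.1"
    // Returns false if the line does not look like either form.
    public static bool TryParse(string requestLine, out string host, out int port)
    {
        host = null;
        port = 0;
        if (string.IsNullOrEmpty(requestLine)) return false;

        // A request line is: METHOD SP request-target SP HTTP-version
        string[] parts = requestLine.Trim().Split(' ');
        if (parts.Length != 3) return false;

        if (parts[0].Equals("CONNECT", StringComparison.OrdinalIgnoreCase))
        {
            // CONNECT carries "host:port" rather than a URL.
            int colon = parts[1].LastIndexOf(':');
            if (colon <= 0) return false;
            host = parts[1].Substring(0, colon);
            return int.TryParse(parts[1].Substring(colon + 1), out port);
        }

        Uri uri;
        if (!Uri.TryCreate(parts[1], UriKind.Absolute, out uri)) return false;
        if (uri.Scheme != Uri.UriSchemeHttp && uri.Scheme != Uri.UriSchemeHttps) return false;

        host = uri.Host;
        port = uri.Port; // default port of the scheme unless the URL names one
        return true;
    }
}
```

A proxy that keeps the client connection open could call this on every request it reads and compare the result with the host and port it is currently connected to. A mismatch is the signal to close the old webserver connection and open a new one.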
Frankly, persistent connections in HTTP/1.1 are difficult for proxies to handle, and ordinarily require your proxy to maintain some sort of state machine to know when to close a connection with an existing target webserver, and open a connection to the URL of a new target webserver.
If performance is not an issue, the usual solution (which simplifies the logic tremendously) is to override persistent connections (which are the default in HTTP/1.1). To do so, add the header line Connection: close\r\n to the HTTP header block before the proxy sends the request to the target webserver. The line must go before the blank line that ends the headers, and it should replace any Connection header the client already sent. This will cause the target to close its connection with the proxy after it has finished its response. In addition, when the proxy forwards the response from the target to the client, similarly add Connection: close\r\n to the response header, and when the target closes its connection with the proxy (signifying the end of the response), close the proxy's connection to the client after all data has been sent.
The result is a "downgrade" to HTTP/1.0 behavior, where the default was non-persistent connections. Under HTTP/1.0, each request by the client to the server required a brand new connection. Yes, that's right: a brand new call to connect for each and every request to the server. Think about that for a second: for a client to display even a simple page like the google home page, it needs to make several separate requests: one for the basic text, one for the CSS, one for each of the images and logos/icons, and so on. Under HTTP/1.0, a brand new connection was required for each of those requests. This was simple but obviously very inefficient, and it eventually led to the unofficial HTTP/1.0 header Connection: Keep-Alive. This header became very popular under HTTP/1.0 and, even though it was undocumented in the RFC, was widely used to increase efficiency by enabling persistent connections. Compare these two RFCs:
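Here is a minimal sketch of that header rewrite, assuming the proxy has already separated the header block (start line, header lines and the terminating blank line) from any body bytes. HeaderRewriter and ForceConnectionClose are hypothetical names chosen for this example. The sketch also drops Proxy-Connection, a non-standard header that some browsers send to proxies, on the assumption that it should not be forwarded.

```csharp
using System;
using System.Collections.Generic;

public static class HeaderRewriter
{
    // Hypothetical helper: returns the header block with any existing
    // Connection / Proxy-Connection lines removed and "Connection: close"
    // added just before the blank line that ends the headers.
    // Works for both requests and responses; the body is not touched here.
    public static string ForceConnectionClose(string head)
    {
        string[] lines = head.Split(new[] { "\r\n" }, StringSplitOptions.None);
        var kept = new List<string>();

        foreach (string line in lines)
        {
            if (line.Length == 0) break; // blank line: end of the header block

            bool isConnectionHeader =
                line.StartsWith("Connection:", StringComparison.OrdinalIgnoreCase) ||
                line.StartsWith("Proxy-Connection:", StringComparison.OrdinalIgnoreCase);
            if (!isConnectionHeader) kept.Add(line);
        }

        kept.Add("Connection: close");
        return string.Join("\r\n", kept) + "\r\n\r\n";
    }
}
```

The same call can be applied to the request head before it goes to the webserver and to the response head before it goes to the browser. Any body bytes that arrived in the same read as the headers still have to be forwarded unchanged after the rewritten head.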
RFC2616: Hypertext Transfer Protocol -- HTTP/1.1 at many locations such as http://www.rfc-editor.org/rfc/rfc2616.txt
RFC1945: Hypertext Transfer Protocol -- HTTP/1.0 at many locations such as http://www.rfc-editor.org/rfc/rfc1945.txt
Incidentally, you might not enjoy reading these RFCs, but if you are programming a proxy, you will need to.
Finally, if you are using IE as your browser, you can make a quick test to determine if this is your problem. Under Tools->Internet Options->Advanced, scroll down until you see HTTP/1.1 Settings. Uncheck "Use HTTP/1.1 through proxy connections" which is otherwise selected by default. This will cause IE to use HTTP/1.0 when it makes requests to your proxy. Try connecting to google through your proxy again, and you might have better success when clicking on the google search results.
Another thing to try is to connect to google and get your search results, but then wait for a minute or two before clicking on a result. By that time, the google webserver will probably have detected inactivity on the connection and closed it.
Best of luck,
Mike
-
August 17th, 2012, 02:20 PM
#6
Re: Simple proxy server in C# - Need help with relay logic
Thanks, Mike, for the excellent post. This was precisely what I was looking for. Once I realized that I was checking the URL only at the beginning, all the pieces fell into place. As a quick fix, I removed the second part of my relay loop, so that only client-bound traffic coming from the webserver is relayed. Any new request (such as the URL clicked on the google results page) should now arrive on a new connection from listener.AcceptSocket(). The new code looks like this:
Code:
    transferred = 0;
    do
    {
        rec = webserver.Receive(buffer, buffer.Length, SocketFlags.None);
        sent = client.Send(buffer, rec, SocketFlags.None);
        transferred += rec;
        //data += Encoding.ASCII.GetString(serverbytes, 0, rec);
    } while (rec == buffer.Length);
    if (transferred == 0)
        break;
    //transferred = 0;
    //do
    //{
    //    rec = client.Receive(buffer, buffer.Length, SocketFlags.None);
    //    sent = webserver.Send(buffer, sent, SocketFlags.None);
    //    transferred += rec;
    //} while (rec == buffer.Length);
I also realized that I had forgotten to close the TCP sockets (the client and webserver variables). Once I did that at the end of the main loop, the proxy server seems to be working fine.
Thanks for pointing me to the issues involved with HTTP 1.0/1.1 and to the RFC docs. I guess a more rigorous study is needed here. I had assumed this would be a simple transparent proxy that just relays traffic between a client and a webserver, but to add more features and performance to this product I will have to study those.
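For anyone reading later, a minimal sketch of a one-way relay follows. It is not the code from this thread, and the names Relay and CopyUntilClosed are made up for illustration. It assumes both sides send Connection: close, so the end of the response is signalled by the webserver closing its connection, which shows up as a read of 0 bytes. It deliberately does not use rec == buffer.Length as the loop condition, because a receive call may return fewer bytes than the buffer holds even when more data is still on the way. It also forwards exactly the number of bytes just received.

```csharp
using System;
using System.IO;

public static class Relay
{
    // Copies bytes from 'source' to 'destination' until 'source' reports
    // end of stream (Read returns 0). Returns the number of bytes copied.
    public static long CopyUntilClosed(Stream source, Stream destination)
    {
        byte[] buffer = new byte[2048];
        long total = 0;
        int rec;

        while ((rec = source.Read(buffer, 0, buffer.Length)) > 0)
        {
            // Forward exactly the bytes that were just read.
            destination.Write(buffer, 0, rec);
            total += rec;
        }
        return total;
    }
}
```

With sockets, each one can be wrapped as new NetworkStream(socket, true) inside a using statement. The second argument makes the stream own the socket, so both the client and webserver sockets are closed when the using blocks end, even if an exception is thrown during the relay.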
Last edited by prahladyeri; August 17th, 2012 at 02:27 PM.