August 16th, 2012, 09:04 PM
#4
Re: Simple proxy server in C# - Need help with relay logic
In your thread, you check for the URL of the target webserver only once, when data first arrives from the client. This matches your observation that the initial URL (like Google) opens fine. Eventually, however, the client is going to ask for a different target webserver, which is what happens when you click on a link from the Google results page. Since you are not checking for a new URL, your proxy continues to communicate with the initial target and not with the newly requested one.
To solve the issue, you must understand how the HTTP protocol works, and in particular you must understand persistent connections in HTTP/1.1. At some point your proxy must parse the actual HTTP requests in order to understand what is being requested, and to recognize when a URL on a different host is being requested. When that happens, the proxy must close its connection with the existing webserver and open a new connection to the host named in the new URL.
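As a rough sketch of the parsing step, the helper below pulls the target host and port out of the header block of a single request. The names RequestTarget and GetTarget are made up for illustration and are not from your code. It assumes plain HTTP (no CONNECT/HTTPS) and that the complete header block has already been read into a string.

```csharp
using System;

public static class RequestTarget
{
    // Returns "host:port" for the request whose header block is given,
    // or null if no target can be determined. Hypothetical helper.
    public static string GetTarget(string requestHead)
    {
        string[] lines = requestHead.Split(new[] { "\r\n" }, StringSplitOptions.None);
        string[] parts = lines[0].Split(' ');   // METHOD SP request-target SP HTTP-version
        if (parts.Length != 3) return null;

        // A browser talking to a proxy normally sends an absolute URI.
        Uri uri;
        if (Uri.TryCreate(parts[1], UriKind.Absolute, out uri) &&
            uri.Scheme == Uri.UriSchemeHttp)
        {
            return uri.Host + ":" + uri.Port;
        }

        // Otherwise fall back to the Host header.
        for (int i = 1; i < lines.Length && lines[i].Length > 0; i++)
        {
            if (lines[i].StartsWith("Host:", StringComparison.OrdinalIgnoreCase))
            {
                string host = lines[i].Substring(5).Trim();
                // Assumes a name or IPv4 address; IPv6 literals are not handled.
                return host.Contains(":") ? host : host + ":80";
            }
        }
        return null;
    }
}
```

The point is that this has to run for every request that arrives on the client connection, not just the first one.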
Frankly, persistent connections in HTTP/1.1 are difficult for proxies to handle, and ordinarily require your proxy to maintain some sort of state machine to know when to close a connection with an existing target webserver, and open a connection to the URL of a new target webserver.
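To give an idea of the smallest piece of that state, here is a sketch of a class that remembers which target the proxy is currently connected to and reports when a reconnect is needed. UpstreamTracker and its members are hypothetical names. A real proxy would also have to track where each request and response ends (Content-Length, chunked encoding), which this sketch leaves out.

```csharp
using System;

public sealed class UpstreamTracker
{
    // "host:port" of the currently open upstream connection, or null if none.
    private string current;

    // Returns true when the proxy must close the old upstream socket (if any)
    // and connect to 'target' before forwarding the next request.
    public bool NeedsReconnect(string target)
    {
        bool same = current != null &&
                    string.Equals(current, target, StringComparison.OrdinalIgnoreCase);
        current = target;
        return !same;
    }

    // Call this when the upstream server closes its side of the connection.
    public void UpstreamClosed()
    {
        current = null;
    }
}
```

You would call NeedsReconnect with the target parsed from each request, and only reuse the existing socket when it returns false.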
If performance is not an issue, the usual solution (which simplifies the logic tremendously) is to override persistent connections, which are the default mode in HTTP/1.1. To do so, add the header line Connection: close\r\n to the HTTP header, ahead of the blank line (\r\n) that terminates it, before the proxy forwards the request to the target webserver. If the client already sent a Connection header (for example Connection: keep-alive), replace it rather than adding a second one. This will cause the target to close its connection with the proxy after it has finished its response. In addition, when the proxy forwards the response from the target to the client, put Connection: close\r\n into the response header in the same way, and when the target closes its connection with the proxy (signifying the end of the response), close the proxy's connection to the client after all data has been sent.
The result is a "downgrade" to the behavior of HTTP/1.0, where the default was non-persistent connections. Under HTTP/1.0, each request by the client to the server required a brand new connection. Yes, that's right: a brand new call to connect for each and every request to the server. Think about that for a second: for a client to display even a simple page like the Google home page, it needs to make several separate requests: one for the basic text, one for the CSS, one for each of the images and logos/icons, and so on. Under HTTP/1.0, a brand new connection was required for each of those requests. This was simple but obviously very inefficient, and it eventually led to the unofficial header Connection: Keep-Alive. This header became very popular under HTTP/1.0, and even though it is not defined in the HTTP/1.0 RFC, it was widely used to increase efficiency by enabling persistent connections. Compare these two RFCs:
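A minimal sketch of that header rewrite follows. HeaderRewriter and ForceClose are hypothetical names, and the sketch assumes the header block has been read into a string up to and including the terminating blank line. It also drops the non-standard Proxy-Connection header that some browsers send to proxies.

```csharp
using System;
using System.Collections.Generic;

public static class HeaderRewriter
{
    // Takes a request or response header block terminated by "\r\n\r\n" and
    // returns it with any Connection / Proxy-Connection lines replaced by a
    // single "Connection: close". Text after the blank line is kept as-is.
    public static string ForceClose(string head)
    {
        int end = head.IndexOf("\r\n\r\n", StringComparison.Ordinal);
        if (end < 0) throw new ArgumentException("incomplete header block", "head");

        string[] lines = head.Substring(0, end)
                             .Split(new[] { "\r\n" }, StringSplitOptions.None);
        var kept = new List<string>();
        foreach (string line in lines)
        {
            bool connectionHeader =
                line.StartsWith("Connection:", StringComparison.OrdinalIgnoreCase) ||
                line.StartsWith("Proxy-Connection:", StringComparison.OrdinalIgnoreCase);
            if (!connectionHeader) kept.Add(line);
        }
        kept.Add("Connection: close");

        return string.Join("\r\n", kept) + "\r\n\r\n" + head.Substring(end + 4);
    }
}
```

The same function can be applied to the request header on its way to the target and to the response header on its way back to the client. In a real proxy the body should be forwarded as raw bytes rather than passed through a string.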
RFC2616: Hypertext Transfer Protocol -- HTTP/1.1 at many locations such as http://www.rfc-editor.org/rfc/rfc2616.txt
RFC1945: Hypertext Transfer Protocol -- HTTP/1.0 at many locations such as http://www.rfc-editor.org/rfc/rfc1945.txt
Incidentally, you might not enjoy reading these RFCs, but if you are programming a proxy, you will need to.
Finally, if you are using IE as your browser, you can make a quick test to determine if this is your problem. Under Tools->Internet Options->Advanced, scroll down until you see HTTP 1.1 settings. Uncheck "Use HTTP 1.1 through proxy connections", which is otherwise selected by default. This will cause IE to use HTTP/1.0 when it makes requests to your proxy. Try connecting to Google through your proxy again, and you might have better success when clicking on the Google search results.
Another thing to try is to connect to Google and get your search results, but then wait a minute or two before clicking on a result. By that time, the Google webserver will have sensed inactivity on the connection and closed it. If the click then behaves differently than before, that is further evidence that the persistent connection is the culprit.
Best of luck,
Mike