    Post #10
    Join Date: Nov 2002 | Location: California | Posts: 4,556

    Re: Download a file using http & winsock

    Yes, you're correct: It's because it's an HTTP/1.1 server.

    You must include the "Host:" header, to allow an HTTP/1.1 server to find the resource. An HTTP/1.0 server will ignore it. Sorry to have given you inaccurate advice. You learn something new every day.

    As background, under HTTP/1.1, a client must, at a minimum:

    1. include the Host: header with each request
    2. accept responses with chunked data
    3. either support persistent connections, or include the "Connection: close" header with each request
    4. handle the "100 Continue" response

    The last three can be difficult to program, which is why I suggested that you use HTTP/1.0 instead.
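To give a feel for why item 2 takes some work, here is a minimal sketch of decoding a chunked response body. The name decode_chunked is hypothetical, and the sketch assumes the whole body has already been read into memory, that the input is well formed, and that any trailer headers after the last chunk can be ignored. A real client would also have to decode incrementally as data arrives from the socket.

```cpp
#include <cstddef>
#include <cstdlib>
#include <stdexcept>
#include <string>

// Hypothetical helper: decodes a complete "Transfer-Encoding: chunked" body
// held in memory. Each chunk is "<hex size>[;extensions]CRLF<data>CRLF";
// a chunk of size 0 ends the body. Trailer headers are ignored.
std::string decode_chunked(const std::string& body)
{
    std::string out;
    std::size_t pos = 0;
    for (;;) {
        // Locate the end of the chunk-size line.
        std::size_t eol = body.find("\r\n", pos);
        if (eol == std::string::npos)
            throw std::runtime_error("missing CRLF after chunk size");

        // strtoul stops at the first non-hex character, so a ";extension"
        // suffix on the size line is skipped.
        std::string size_line = body.substr(pos, eol - pos);
        char* end = nullptr;
        unsigned long size = std::strtoul(size_line.c_str(), &end, 16);
        if (end == size_line.c_str())
            throw std::runtime_error("chunk size is not hexadecimal");

        if (size == 0)
            break;  // last chunk reached

        // The chunk data must be followed by its own CRLF.
        std::size_t data = eol + 2;
        std::size_t remaining = body.size() - data;
        if (size > remaining || remaining - size < 2)
            throw std::runtime_error("truncated chunk");
        if (body.compare(data + size, 2, "\r\n") != 0)
            throw std::runtime_error("missing CRLF after chunk data");

        out.append(body, data, size);
        pos = data + size + 2;
    }
    return out;
}
```

Even this simplified version needs several checks, and it still leaves out trailers and partial reads, which is part of why HTTP/1.0 is the easier target for a first client.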

    However, when an HTTP/1.1 server receives an HTTP/1.0 request, it might not know how to find the resource. HTTP/1.1 servers often use virtual hosting, meaning that the addresses www.server1.com and www.server2.com might point to the exact same IP address while serving different content. That's why HTTP/1.1 requires the "Host:" header.

    Here is the pertinent portion of the HTTP/1.1 spec that details the action of an HTTP/1.1 server that receives an HTTP/1.0 request (from ftp://ftp.rfc-editor.org/in-notes/rfc2616.txt ). The last paragraph is the important one:
    Quote (RFC 2616, section 5.2):
    5.2 The Resource Identified by a Request

    The exact resource identified by an Internet request is determined by
    examining both the Request-URI and the Host header field.

    An origin server that does not allow resources to differ by the
    requested host MAY ignore the Host header field value when
    determining the resource identified by an HTTP/1.1 request. (But see
    section 19.6.1.1 for other requirements on Host support in HTTP/1.1.)

    An origin server that does differentiate resources based on the host
    requested (sometimes referred to as virtual hosts or vanity host
    names) MUST use the following rules for determining the requested
    resource on an HTTP/1.1 request:

    1. If Request-URI is an absoluteURI, the host is part of the
    Request-URI. Any Host header field value in the request MUST be
    ignored.

    2. If the Request-URI is not an absoluteURI, and the request includes
    a Host header field, the host is determined by the Host header
    field value.

    3. If the host as determined by rule 1 or 2 is not a valid host on
    the server, the response MUST be a 400 (Bad Request) error message.

    Recipients of an HTTP/1.0 request that lacks a Host header field MAY
    attempt to use heuristics (e.g., examination of the URI path for
    something unique to a particular host) in order to determine what
    exact resource is being requested.
    Note the very last paragraph, which explains what an HTTP/1.1 server may do when confronted with an HTTP/1.0 request that has no Host header.
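Rules 1 to 3 of the quoted section can be sketched as a small lookup function. This is only an illustration of the rules, not how any particular server implements them: resolve_host, HostLookup and valid_hosts are hypothetical names, the sketch assumes a lowercase "http://" scheme and exact (case-sensitive) host names, and it strips any ":port" suffix before the lookup.

```cpp
#include <cstddef>
#include <set>
#include <string>

// Hypothetical result type for this sketch.
struct HostLookup {
    int status;        // 200 = host resolved, 400 = Bad Request (rule 3)
    std::string host;  // resolved host name, empty on failure
};

// Sketch of rules 1-3 of RFC 2616 section 5.2. `valid_hosts` stands in for
// the server's configured virtual hosts. An empty `host_header` means the
// request carried no Host header.
HostLookup resolve_host(const std::string& request_uri,
                        const std::string& host_header,
                        const std::set<std::string>& valid_hosts)
{
    const std::string scheme = "http://";
    std::string host;

    if (request_uri.compare(0, scheme.size(), scheme) == 0) {
        // Rule 1: absoluteURI. The host comes from the Request-URI and any
        // Host header value is ignored.
        std::size_t start = scheme.size();
        std::size_t end = request_uri.find_first_of(":/", start);
        host = (end == std::string::npos)
                   ? request_uri.substr(start)
                   : request_uri.substr(start, end - start);
    } else {
        // Rule 2: relative Request-URI. The host comes from the Host header
        // (with any ":port" suffix removed).
        host = host_header.substr(0, host_header.find(':'));
    }

    // Rule 3: a host that is not valid on this server yields 400.
    if (host.empty() || valid_hosts.count(host) == 0)
        return {400, ""};
    return {200, host};
}
```

In this sketch a request with a relative URI and no Host header simply falls into the 400 branch. Under the last paragraph of 5.2 a real server is free to try heuristics instead, so its actual response to a Host-less HTTP/1.0 request can differ.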

    Here, for your program, when you modified the request down to a single line of
    Code:
    GET /path/filename HTTP/1.0 <CRLF>
    <CRLF>
    the HTTP/1.1 server had no way of telling which host it should get the resource from. So it returned "404 Not Found".

    Under the last paragraph of 5.2 (above), when you included the full URI in the request:
    Code:
    GET http://server/path/filename HTTP/1.0 <CRLF>
    <CRLF>
    then the HTTP/1.1 server could apply "heuristics" to find the resource you were asking for. However, this request format might not be compliant with HTTP/1.0, which states that the full URI should only be used with a proxy. The solution is to include the "Host:" header, which HTTP/1.0 servers will ignore, and which allows HTTP/1.1 servers to disambiguate the resource:
    Code:
    GET /path/filename HTTP/1.0 <CRLF>
    Host: server <CRLF>
    <CRLF>
    So, your code should look like this:
    Code:
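    // ....
    printf("%s","\nNow connected to ");
    printf("%s",argv[1]);
    printf(snsn," via port 80");
    // Request line: relative path only. The "http://" prefix and the
    // server name (argv[1]) are no longer appended here.
    request+="GET ";
    request+=argv[2];      // path of the file on the server
    request+=" HTTP/1.0";
    request+=&lb[2];       // single CRLF (assumes lb holds "\r\n\r\n")
    request+="Host: ";     // "Host:" is necessary
    request+=argv[1];      // server name
    request+=lb;           // ends the request (assumes lb holds "\r\n\r\n")
    printf(snsn,"\nHTTP request constructed successfully:\n");
    
    // ....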
    I have tested this code and it works.
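For readers who want to try the request format without the rest of the program, the same construction can be sketched as a self-contained helper. The name build_request is hypothetical; host plays the role of argv[1] and path the role of argv[2] in the snippet above, and the line endings are written out explicitly instead of relying on lb.

```cpp
#include <string>

// Hypothetical helper mirroring the snippet above: builds an HTTP/1.0 GET
// request that also carries a Host header.
std::string build_request(const std::string& host, const std::string& path)
{
    const std::string crlf = "\r\n";
    std::string request;
    request += "GET ";
    request += path;         // relative path only, e.g. "/path/filename"
    request += " HTTP/1.0";
    request += crlf;
    request += "Host: ";     // ignored by HTTP/1.0 servers, needed by HTTP/1.1
    request += host;
    request += crlf;
    request += crlf;         // blank line ends the header block
    return request;
}
```

Calling build_request("server", "/path/filename") gives the three-line request shown earlier, ready to be passed to send().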

    Incidentally, for a very approachable tutorial on writing code for HTTP clients and servers, I recommend "HTTP Made Really Easy: A Practical Guide to Writing Clients and Servers" by James Marshall at http://www.jmarshall.com/easy/http/

    Mike
    Last edited by MikeAThon; November 25th, 2006 at 02:19 PM.
