  1. #1
    Join Date
    Jun 2009
    Posts
    40

    getting html source code with c++

    I am writing a program that looks through the source code of websites and pulls information out of it, but I don't know how to download the source so my program can parse it. I don't really want to do socket programming from the ground up, so does anyone know of a well-documented library that works with Visual C++ and can do the job with a few simple functions?

    Thanks

  2. #2
    Join Date
    Jan 2009
    Posts
    1,689

    Re: getting html source code with c++

    You always get the source code; C++ doesn't change the HTML in any way. Download the libcurl library. With it, downloading anything you want is very easy.

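    Code:
    #include <string>
    #include <curl/curl.h>

    // libcurl calls this for each chunk received; the data is not null-terminated.
    size_t somecallback(char * data, size_t size, size_t nmemb, void * userdata){
       std::string * buffer = static_cast<std::string *>(userdata);
       buffer->append(data, size * nmemb);   // append exactly size * nmemb bytes
       return (size * nmemb);
    }

    // Inside main() or another function:
    std::string buffer;
    CURL * curl = curl_easy_init();
    curl_easy_setopt(curl, CURLOPT_URL, "http://www.google.com");
    curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, somecallback);
    curl_easy_setopt(curl, CURLOPT_WRITEDATA, &buffer);
    CURLcode result = curl_easy_perform(curl);   // CURLE_OK on success
    curl_easy_cleanup(curl);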
    Your string will now have the HTML data from google.com. You may also need to register an error buffer; I'm not sure if it's required.
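
    A note on the callback contract, as a minimal sketch that does not need libcurl to run: a write callback like somecallback receives size * nmemb bytes that are not null-terminated, may be called several times per transfer, and has to return the number of bytes it handled. The names append_chunk and deliver_in_chunks below are made up for illustration; deliver_in_chunks is only a stand-in for the transfer itself, not a libcurl function.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Same shape as a libcurl write callback: `data` holds size * nmemb bytes,
// is NOT null-terminated, and the return value is the number of bytes handled.
static std::size_t append_chunk(char* data, std::size_t size,
                                std::size_t nmemb, void* userdata) {
    std::string* out = static_cast<std::string*>(userdata);
    const std::size_t bytes = size * nmemb;
    out->append(data, bytes);  // copy exactly `bytes` bytes; never rely on strlen()
    return bytes;
}

// Hypothetical stand-in for a transfer: hands `body` to the callback in
// pieces of at most `chunk_size` bytes, the way a network read might.
static std::string deliver_in_chunks(const std::string& body,
                                     std::size_t chunk_size) {
    assert(chunk_size > 0);
    std::string received;
    std::string scratch(body);  // mutable copy so we can pass a char*
    for (std::size_t pos = 0; pos < scratch.size(); pos += chunk_size) {
        const std::size_t n = std::min(chunk_size, scratch.size() - pos);
        const std::size_t handled = append_chunk(&scratch[pos], 1, n, &received);
        if (handled != n) {
            break;  // a short count signals an error, so stop delivering
        }
    }
    return received;
}
```

    Calling deliver_in_chunks(page, 5) with any string page gives back the same bytes, including any embedded '\0' characters, which is why the callback appends by length instead of building a std::string from the raw pointer.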

  3. #3
    Join Date
    Jun 2009
    Posts
    40

    Re: getting html source code with c++

    OK, I am somewhat confused about how to use this. I have downloaded libcurl and opened the project in the VC++ IDE, so it has all the headers, source files, etc. But now that I have the whole thing in a project, how do I know which header files declare the functions you just showed me (that is, which ones I need to #include in my program)? Do I have to open them all and look through them? I assume there's a better way.

    I am just not sure what to include above the code you gave me to make it work, since I don't know where those functions are among the millions of headers included.

  4. #4
    Join Date
    Jan 2009
    Posts
    1,689

    Re: getting html source code with c++

    #include <curl/curl.h>

    That's the only header you need. You should have downloaded a precompiled library with it; add libcurl.a to your linker inputs. You don't need the libcurl sources in your project at all; that's the point of a library.

  5. #5
    Join Date
    Jun 2009
    Posts
    40

    Re: getting html source code with c++

    So I found this site to help me compile the .dll and .lib files and use them, but my version is slightly different: when I compile the .dll, the debug folder only has the .dll file and not the .lib file that I apparently need. I'm not sure how to get this file, so could someone show me how to make the .lib file?

    (The site I've been looking at for instructions: http://curl.haxx.se/libcurl/c/visual_studio.pdf)

  6. #6
    Join Date
    Jan 2009
    Posts
    1,689

    Re: getting html source code with c++

    There is a precompiled version of libcurl on their website. I don't know how to compile the .lib file in VS; I only know the GNU compiler. Are you sure it's not there? It may have a .a extension instead.

    I think this download has a precompiled library with it: http://www.gknw.net/mirror/curl/win3...el-mingw32.zip
    Last edited by ninja9578; August 21st, 2009 at 10:37 AM.
