May 14th, 2011, 07:39 AM
MFC - Downloading webpages, managing dynamic content
I'm trying to download full websites using CInternetSession, doing something similar to JDownloader: http://jdownloader.org/
A few problems have come up:
1. There are "blind spots" when I open the saved .htm file: no images at all, except Google ads. So the question is: do I have to parse the source code, fish out all the linked resources (e.g. <img src="/image/logo.jpg">), and download them separately into appropriate folders?
2. What about redirects, and frames that load other pages after a delay (e.g. ~2 seconds)? How can I make my code wait for that additional data?
3. Is it possible to use threads with CInternetSession to download and manage 10 websites simultaneously (asynchronous communication)?
That's my code:
CHttpConnection* pServer = NULL;
CHttpFile* pFile = NULL;
CString headerki("User-Agent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Trident/4.0; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)\r\nAccept: image/gif, image/png, image/x-xbitmap, image/jpeg, image/jpg, image/pjpeg, application/x-ms-application, application/x-ms-xbap, application/vnd.ms-xpsdocument, application/x-shockwave-flash, application/vnd.ms-xpsdocument, application/x-ms-xbap, application/x-ms-application, application/x-silverlight, */*\r\nConnection: Keep-Alive\r\nAccept-Language: en-US\r\n");
pServer = session.GetHttpConnection((LPCTSTR)buba, INTERNET_FLAG_NO_CACHE_WRITE, 80);
pFile = pServer->OpenRequest(CHttpConnection::HTTP_VERB_GET, _T("/"));
errors = pFile->AddRequestHeaders(headerki);
pFile->SendRequest();                      // without this the request is never sent
DWORD dwRet = 0;
pFile->QueryInfoStatusCode(dwRet);         // dwRet was never being filled in before
if (dwRet == HTTP_STATUS_OK)
{
    char* buff = new char[1024];           // "new char" allocated only a single byte
    fp.Open(_T("test.html"), CFile::modeCreate | CFile::modeReadWrite);
    UINT nRead;
    while ((nRead = pFile->Read(buff, 1024)) > 0)
        fp.Write(buff, nRead);             // write only the bytes actually read
    fp.Close();
    delete[] buff;                         // now matches the new[]
}
pFile->Close();
pServer->Close();