-
May 14th, 2011, 06:39 AM
#1
MFC - Downloading webpages, managing dynamic content
I'm trying to download full websites using CInternetSession, similar to what JDownloader does: http://jdownloader.org/
I've run into a few problems:
1. There are "blind spots" when I open the saved .htm file: no images at all, except Google ads. So the question is: do I have to parse the source code, fish out all the "href" and "src" links, and download the additional data separately into the appropriate folders? For example: <img src="/image/logo.jpg">
2. What about redirects, such as frames that load other websites after a delay of about 2 seconds? How can I force my code to wait for the additional data?
3. Is it possible to use threads with CInternetSession to download and manage 10 websites simultaneously (asynchronous communication)?
4. Finally, dynamic content. What about JavaScript countdown timers, for example on filesonic.com, duckload.com, etc.? How can I actually click the buttons after 30 seconds? How can I close ad banners via the "X" button on the frame? Right now my code downloads the webpage as it is at the moment of the request.
Here's my code:
Code:
CInternetSession session;
CHttpConnection* pServer = NULL;
CHttpFile* pFile = NULL;
DWORD dwRet = 0;
try
{
    CString buba(_T("filesonic.com"));
    CString headerki(_T("User-Agent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Trident/4.0; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)\r\nAccept: image/gif, image/png, image/x-xbitmap, image/jpeg, image/jpg, image/pjpeg, application/x-ms-application, application/x-ms-xbap, application/vnd.ms-xpsdocument, application/x-shockwave-flash, application/vnd.ms-xpsdocument, application/x-ms-xbap, application/x-ms-application, application/x-silverlight, */*\r\nConnection: Keep-Alive\r\nAccept-Language: en-US\r\n"));
    INTERNET_PORT nPort = 80;
    pServer = session.GetHttpConnection((LPCTSTR)buba, INTERNET_FLAG_NO_CACHE_WRITE, nPort);
    // Request the root document of the site.
    pFile = pServer->OpenRequest(CHttpConnection::HTTP_VERB_GET, _T("/"));
    // The headers are added once here, so SendRequest() needs no header argument.
    pFile->AddRequestHeaders(headerki);
    pFile->SendRequest();
    pFile->QueryInfoStatusCode(dwRet);
    if (dwRet == HTTP_STATUS_OK)
    {
        char buff[1024];
        CFile fp;
        if (fp.Open(_T("test.html"), CFile::modeCreate | CFile::modeReadWrite))
        {
            UINT bytes;
            // Read() returns the number of bytes actually received; write only that many.
            while ((bytes = pFile->Read(buff, sizeof(buff))) > 0)
                fp.Write(buff, bytes);
            fp.Close();
        }
    }
}
catch (CException* pEx)
{
    // Covers CInternetException and CFileException; MFC exceptions are released with Delete().
    pEx->Delete();
}
// Clean up whether or not the request succeeded.
if (pFile != NULL)
{
    pFile->Close();
    delete pFile;
}
if (pServer != NULL)
{
    pServer->Close();
    delete pServer;
}
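Regarding question 1: a single HTTP GET returns only the HTML document itself, so the images it references have to be requested separately. Below is a minimal sketch of the extraction step in plain standard C++ (no MFC), assuming the page has already been read into a std::string. ExtractAttributeValues is a hypothetical helper name, not an MFC or WinInet function, and the simple string search is an assumption that will not cope with every real-world page (unquoted values, script-generated markup, and so on).

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper for illustration only (not part of MFC or WinInet).
// Returns the quoted values of every attr="..." / attr='...' found in html.
// attr must be given in lower case, e.g. "src" or "href".
std::vector<std::string> ExtractAttributeValues(const std::string& html,
                                                const std::string& attr)
{
    std::vector<std::string> values;

    // Search in a lower-cased copy so that SRC= and src= are both found.
    std::string lower(html);
    for (std::size_t i = 0; i < lower.size(); ++i)
        lower[i] = static_cast<char>(
            std::tolower(static_cast<unsigned char>(lower[i])));

    const std::string needle = attr + "=";
    std::size_t pos = 0;
    while ((pos = lower.find(needle, pos)) != std::string::npos)
    {
        const std::size_t nameStart = pos;
        pos += needle.size();   // now at the first character of the value

        // Skip matches inside longer names such as data-src=.
        if (nameStart > 0 &&
            !std::isspace(static_cast<unsigned char>(lower[nameStart - 1])))
            continue;
        if (pos >= html.size())
            break;

        // Only quoted values are handled in this sketch.
        const char quote = html[pos];
        if (quote != '"' && quote != '\'')
            continue;

        const std::size_t end = html.find(quote, pos + 1);
        if (end == std::string::npos)
            break;

        // Take the value from the original text to preserve its case.
        values.push_back(html.substr(pos + 1, end - pos - 1));
        pos = end + 1;
    }
    return values;
}
```

Calling ExtractAttributeValues(page, "src") and ExtractAttributeValues(page, "href") yields the paths to request next. Relative paths such as /image/logo.jpg still have to be resolved against the host name before they can be passed to OpenRequest.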
-
May 21st, 2011, 01:43 PM
#2
Re: MFC - Downloading webpages, managing dynamic content
Really? Has nobody had a similar problem? Any advice would be appreciated.
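On question 2, one thing that may help: delayed redirects are often not HTTP redirects at all but a <meta http-equiv="refresh" content="2; url=..."> tag inside the page. A plain HTTP client never acts on that tag, so the code would have to read the delay and target itself and then issue a second request. Below is a minimal sketch in standard C++, assuming the value of the content attribute has already been pulled out of the HTML. MetaRefresh and ParseRefreshContent are hypothetical names used only for illustration. Redirects or countdowns driven by JavaScript are not covered by this.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <string>

// Hypothetical result type for this sketch.
struct MetaRefresh
{
    bool found;         // true if the value starts with a delay in seconds
    int delaySeconds;   // how long a browser would wait before navigating
    std::string url;    // target URL; empty means "reload the same page"
};

// Parses a content value such as "2; url=http://example.com/next.html".
MetaRefresh ParseRefreshContent(const std::string& content)
{
    MetaRefresh result = { false, 0, "" };

    // Skip leading whitespace, then read the run of digits that forms the delay.
    std::size_t i = 0;
    while (i < content.size() &&
           std::isspace(static_cast<unsigned char>(content[i])))
        ++i;
    const std::size_t digitsStart = i;
    while (i < content.size() &&
           std::isdigit(static_cast<unsigned char>(content[i])))
        ++i;
    if (i == digitsStart)
        return result;  // no delay present: not treated as a refresh value

    result.found = true;
    result.delaySeconds =
        std::atoi(content.substr(digitsStart, i - digitsStart).c_str());

    // Look for the optional "url=" part, ignoring case.
    std::string lower(content);
    for (std::size_t k = 0; k < lower.size(); ++k)
        lower[k] = static_cast<char>(
            std::tolower(static_cast<unsigned char>(lower[k])));
    const std::size_t urlPos = lower.find("url=", i);
    if (urlPos == std::string::npos)
        return result;

    // Trim whitespace and optional quotes around the URL.
    std::size_t b = urlPos + 4;
    std::size_t e = content.size();
    while (b < e && (std::isspace(static_cast<unsigned char>(content[b])) ||
                     content[b] == '"' || content[b] == '\''))
        ++b;
    while (e > b && (std::isspace(static_cast<unsigned char>(content[e - 1])) ||
                     content[e - 1] == '"' || content[e - 1] == '\''))
        --e;
    result.url = content.substr(b, e - b);
    return result;
}
```

With the result in hand, the downloader can wait delaySeconds (or skip the wait entirely) and then request url the same way as the first page.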