-
January 6th, 2010, 10:55 AM
#1
CInternetSession connection problems
Hi,
I'm having a recurring problem with CInternetSession::OpenURL.
I'm looping through a collection of URLs and downloading each page.
On occasion I get genuine errors such as timeouts, and the loop just moves on to the next page with no problems.
But then, on no particular page, I get a timeout error on connection, and the same error occurs for every subsequent page; it never recovers. The error eventually changes to "could not establish connection with server".
I can still reach the same URL from a browser, though, so the server is not down!
Any ideas what the problem could be?
Thanks
-
January 8th, 2010, 01:21 PM
#2
Re: CInternetSession connection problems
OK, the above was a bit obscure, but worth a shot.
I came to the conclusion that the problem is an indirect one. In fact, if I just loop through the URLs and download them,
there are no timeouts and no problems. I managed to isolate the problem to the following code.
After I've connected, downloaded the file, and copied its contents into a CString (compiled with UNICODE), I want to extract just the text from the HTML source.
WHAT IS WRONG WITH THIS CODE?????
void PageSourceHTMLDocument::ExtractText()
{
    MSHTML::IHTMLDocument2Ptr pDoc;
    HRESULT hr = CoCreateInstance(CLSID_HTMLDocument, NULL, CLSCTX_INPROC_SERVER,
                                  IID_IHTMLDocument2, (void**)&pDoc);

    // IHTMLDocument only takes SafeArray as a parameter
    SAFEARRAY* psa = SafeArrayCreateVector(VT_VARIANT, 0, 1);
    VARIANT *param;
    bstr_t bsData = (LPCTSTR)mSource.c_str();
    hr = SafeArrayAccessData(psa, (LPVOID*)&param);
    param->vt = VT_BSTR;
    param->bstrVal = (BSTR)bsData;

    // apply changes
    pDoc->put_designMode(CComBSTR("on")); // prevent script error dialog popups etc
    hr = pDoc->write(psa); //write your buffer
    hr = pDoc->close();
    bsData.Detach(); // clear here as safearray will free too
    SafeArrayUnaccessData(psa);
    SafeArrayDestroy(psa);

    // Now get text from body element.
    MSHTML::IHTMLElementPtr body_element;
    hr = pDoc->get_body(&body_element);
    BSTR bstr;
    hr = body_element->get_outerText(&bstr);
    if( bstr != NULL )
        mText = bstr;
    else
        mText.clear();
    ::SysFreeString(bstr);
    body_element->Release();
}
There is a memory leak somewhere in here as well; it was a big one before I started freeing the BSTR.
Under circumstances I do not understand, a subsequent call to CInternetSession::OpenURL
will fail with a timeout (not because of the server). This usually happens after a lot of URLs have been successfully parsed with the code above, but not always.
Any ideas?
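One detail in ExtractText() that may be worth checking: body_element is declared as MSHTML::IHTMLElementPtr, a smart pointer that releases its interface in its own destructor, yet the function also calls body_element->Release() by hand. That looks like one Release too many for a single reference. Whether this is what later upsets OpenURL is not established anywhere in this thread; it is only a candidate. The sketch below is not the real COM code. FakeComObject, FakePtr, and refsAfterScope are made-up stand-ins, and the fake object never deletes itself, so the extra release can be observed safely as a count.

```cpp
// Hypothetical stand-in for a COM object: it counts references but never
// deletes itself, so an over-release shows up as a number, not a crash.
struct FakeComObject {
    long refs = 0;
    long AddRef()  { return ++refs; }
    long Release() { return --refs; }
};

// Hypothetical stand-in for a smart interface pointer: it takes one
// reference when constructed and gives it back in its destructor.
class FakePtr {
public:
    explicit FakePtr(FakeComObject* p) : p_(p) { if (p_) p_->AddRef(); }
    ~FakePtr() { if (p_) p_->Release(); }
    FakePtr(const FakePtr&) = delete;
    FakePtr& operator=(const FakePtr&) = delete;
    FakeComObject* operator->() const { return p_; }
private:
    FakeComObject* p_;
};

// Returns the reference count that remains once the smart pointer has
// gone out of scope. One reference is held by "someone else" throughout
// (think of the document holding on to its body element).
long refsAfterScope(bool manualRelease) {
    FakeComObject obj;
    obj.AddRef();                 // the other owner's reference
    {
        FakePtr body(&obj);       // smart pointer takes its own reference
        if (manualRelease)
            body->Release();      // same pattern as body_element->Release()
    }                             // destructor releases a second time
    return obj.refs;
}
```

Here refsAfterScope(false) returns 1 (the other owner still holds its reference), while refsAfterScope(true) returns 0: the manual call has given away a reference that belonged to someone else. With a real COM object that can mean the object is destroyed while still in use. If that is what is happening here, dropping the explicit Release() call and letting the smart pointer clean up would be the thing to try, along with checking body_element for NULL before using it.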