CInternetSession connection problems
Hi,
I'm having a recurring problem with CInternetSession::OpenURL.
I'm looping through a collection of URLs and downloading each page.
On occasion I get genuine errors such as timeouts, and the loop just moves on to the next page with no problems.
But then, on no particular page, I get a timeout error on connection, and the same error occurs for every subsequent page; it never recovers. The error eventually changes to "could not establish connection with server".
I can still open the same URL in a browser, though, so the server is not down!
Any ideas as to what the problem could be?
Thanks
Re: CInternetSession connection problems
OK, the above was a bit obscure, but worth a shot.
I came to the conclusion that the problem was indirect. In fact, if I just loop through the URLs
then there are no timeouts and no problem. I managed to isolate the problem to the following code.
After I've connected, downloaded the file, and copied its contents into a CString (compiled with UNICODE), I want to extract just the text from the HTML source.
WHAT IS WRONG WITH THIS CODE?????
void PageSourceHTMLDocument::ExtractText()
{
    MSHTML::IHTMLDocument2Ptr pDoc;
    HRESULT hr = CoCreateInstance(CLSID_HTMLDocument, NULL, CLSCTX_INPROC_SERVER,
                                  IID_IHTMLDocument2, (void**)&pDoc);
    // IHTMLDocument only takes SafeArray as a parameter
    SAFEARRAY* psa = SafeArrayCreateVector(VT_VARIANT, 0, 1);
    VARIANT *param;
    bstr_t bsData = (LPCTSTR)mSource.c_str();
    hr = SafeArrayAccessData(psa, (LPVOID*)&param);
    param->vt = VT_BSTR;
    param->bstrVal = (BSTR)bsData;
    // apply changes
    pDoc->put_designMode(CComBSTR("on")); // prevent script error dialog popups etc
    hr = pDoc->write(psa); //write your buffer
    hr = pDoc->close();
    bsData.Detach(); // clear here as safearray will free too
    SafeArrayUnaccessData(psa);
    SafeArrayDestroy(psa);
    // Now get text from body element.
    MSHTML::IHTMLElementPtr body_element;
    hr = pDoc->get_body(&body_element);
    BSTR bstr;
    hr = body_element->get_outerText(&bstr);
    if( bstr != NULL )
        mText = bstr;
    else
        mText.clear();
    ::SysFreeString(bstr);
    body_element->Release();
}
There is a memory leak somewhere in here as well; it was a big one before I released the BSTR.
Under circumstances I do not understand, a subsequent call to CInternetSession::OpenURL
will fail with a timeout (not because of the server). This usually happens after a lot of URLs have been successfully parsed with the above code, but not always.
Any ideas?
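
One thing that might be worth checking in the code above is the explicit body_element->Release() on a smart pointer. Going through operator-> releases the raw interface, but the smart pointer still holds the pointer and will release it again in its destructor. The sketch below is only an illustration of that pattern: FakeElement and SimplePtr are hypothetical stand-ins written for this example, not MSHTML or _com_ptr_t, and it says nothing certain about whether this is what causes the OpenURL timeouts.

```cpp
#include <cassert>

// Hypothetical stand-in for a COM object. It only counts references and
// never deletes itself, so the final count can be inspected afterwards.
struct FakeElement {
    long refs = 1;  // as if a getter had already AddRef'd it for the caller
    long AddRef() { return ++refs; }
    long Release() { return --refs; }
};

// Hypothetical minimal smart pointer: takes ownership of one reference
// and releases it in its destructor.
template <class T>
class SimplePtr {
public:
    explicit SimplePtr(T* p = nullptr) : p_(p) {}
    ~SimplePtr() { if (p_) p_->Release(); }
    SimplePtr(const SimplePtr&) = delete;
    SimplePtr& operator=(const SimplePtr&) = delete;
    T* operator->() const { return p_; }
private:
    T* p_;
};

// Mirrors the pattern in ExtractText(): a manual Release() through
// operator->, followed by the destructor's own Release().
long refsAfterManualRelease() {
    FakeElement e;
    {
        SimplePtr<FakeElement> ptr(&e);
        ptr->Release();  // raw release; ptr still holds the pointer
    }                    // destructor releases a second time
    return e.refs;       // one release too many
}

// Lets the smart pointer do the only release.
long refsAfterSmartPointerOnly() {
    FakeElement e;
    {
        SimplePtr<FakeElement> ptr(&e);
    }
    return e.refs;
}
```

With this mock, refsAfterManualRelease() returns -1 (over-released) and refsAfterSmartPointerOnly() returns 0. On a real COM object an extra release can destroy the object while something else still uses it, so dropping the manual Release() call and letting the smart pointer clean up may be the safer choice.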