I want to parse a webpage and extract relevant data from it. My current approach is to download the source of the webpage (the HTML code), then search for some keywords to extract the necessary information. However, this doesn't seem like the best solution to me.
Is there a way to download the "displayed page" instead of the HTML code?
Thank you,
kab
Last edited by kabilius; March 15th, 2011 at 06:08 PM.
Whenever you download a webpage, you unfortunately get the HTML markup along with it. That is how it works. The real question should be: what is the best approach for extracting the "info"?
How did you do it?
I'm using SHDocVw and mshtml, and I've never had any problems.
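For anyone who wants to see the general idea without the COM libraries: the point is to let a real HTML parser walk the markup and hand back only the text a browser would display, rather than searching the raw source for keywords. The sketch below is not SHDocVw/mshtml code. It swaps in Python's standard-library `html.parser` as a stand-in, and the names `VisibleTextExtractor` and `visible_text` are made up for illustration. It also does not run JavaScript, so anything the page builds with scripts will be missing.

```python
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>.

    Hypothetical helper for illustration; not part of any library.
    """

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0   # > 0 while inside a skipped element
        self.chunks = []       # non-empty text fragments, in document order

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def visible_text(html):
    """Return the text fragments of an HTML string, one per line."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return "\n".join(parser.chunks)


if __name__ == "__main__":
    sample = (
        "<html><head><title>T</title><style>p{color:red}</style></head>"
        "<body><h1>Price list</h1><script>var x=1;</script>"
        "<p>Widget: <b>$5</b></p></body></html>"
    )
    print(visible_text(sample))
```

Note that the page title is included because it is ordinary text inside `<title>`; add `"title"` to `SKIPPED_TAGS` if you only want the body text.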
I think you have deciphered my ambiguous language correctly.
I guess there just isn't an easier way to do what I want than SHDocVw and MSHTML. Thank you for pointing me in the right direction.