March 17th, 2012, 10:16 PM
Need Webcrawler without RSS
I need to pull information from some websites that do not offer RSS feeds. I am looking for a method that can "crawl" the sites and retrieve new information whenever they are updated. I have been told that this can be done by HTML parsing. However, I was also told that if the HTML structure of a website changes, the parser for that site would have to be updated or rewritten. Are there any other methods of retrieving updated information from websites? In case anyone is wondering, the information I need comes from daily deal sites and will go into a daily deal aggregator.
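For illustration, here is a minimal sketch of the HTML-parsing approach described above, using only the Python standard library. Everything specific in it is an assumption: the class name "deal", and the names DealExtractor, extract_deals and fingerprint are hypothetical, and a real deal site will use its own markup. It extracts the text of every element carrying a given CSS class, then hashes the result so that a later run can tell whether the deals changed.

```python
import hashlib
from html.parser import HTMLParser

# Tags that never have a closing tag; they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class DealExtractor(HTMLParser):
    """Collects the text inside every element carrying a given CSS class.

    The class name is an assumption about the target site's markup.
    """

    def __init__(self, css_class):
        super().__init__()
        self.css_class = css_class
        self.depth = 0      # > 0 while inside a matching element
        self.deals = []     # one cleaned-up text string per matching element
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1  # nested tag inside a matching element
        elif self.css_class in (dict(attrs).get("class") or "").split():
            self.depth = 1
            self._buf = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self.depth:
            return
        self.depth -= 1
        if self.depth == 0:
            # Join the text fragments and collapse runs of whitespace.
            self.deals.append(" ".join(" ".join(self._buf).split()))

    def handle_data(self, data):
        if self.depth:
            self._buf.append(data)


def extract_deals(html, css_class="deal"):
    """Return the text of each element whose class list contains css_class."""
    parser = DealExtractor(css_class)
    parser.feed(html)
    parser.close()
    return parser.deals


def fingerprint(deals):
    """Hash only the extracted deals, so unrelated page changes are ignored."""
    return hashlib.sha256("\n".join(deals).encode("utf-8")).hexdigest()
```

To use it, the page would be downloaded on a schedule (for example with urllib.request.urlopen), passed to extract_deals, and the new fingerprint compared with the one stored from the previous run. Because only the extracted deal text is hashed, changes elsewhere on the page do not trigger an update. The concern raised above still applies: if the site renames or restructures the elements that hold the deals, the class name passed in has to be adjusted.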
Admin: if this is in the wrong section, please remove the post. I apologize; I'm new to the forum.
March 18th, 2012, 09:18 PM
Re: Need Webcrawler without RSS
Have you contacted the site owners to ask if they are willing to provide a suitable RSS feed? After all, this could be to their advantage, so if it makes life simpler for you, why not?