  1. #1
    Join Date
    May 2005
    Posts
    25

    Scraping a website

    I wrote a program about a year ago that scrapes a webpage and dumps the data into a database. The website has since made some major changes, so I have had to rewrite parts of my program. All the search options used to be passed as parameters in the URL; now the parameters have been removed and the only thing passed is "?Page=SEARCHRESULTS". I have been able to work around all the changes except one. One of the parameters I used to pass set how many entries to show; now it just defaults to 25. This is OK as long as I can figure out a way to navigate through the pages. The website does it with "javascript:JumpToResultsPage(3);". I am currently using CInternetSession to pull data off the internet, and I thought I could do this with internetSession.OpenURL("javascript:JumpToResultsPage(3);"), but apparently you can only pass http, ftp, etc. to it.
    How can I go about navigating through the pages?

  2. #2
    Arjay (Moderator / EX MS MVP Power Poster)
    Join Date
    Aug 2004
    Posts
    13,490

    Re: Scraping a website

    Assuming you have a relationship with the website owners, contact the admin and see if they have an RSS feed or other format that accesses the data directly. Scraping the webpage (as you know) isn't very reliable, and due to its nature, won't ever be.

    Arjay

  3. #3
    Join Date
    May 2005
    Posts
    25

    Re: Scraping a website

    No, I don't have a relationship with the company. This is really my only option... Any ideas?
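
    One possible direction, offered as a sketch under assumptions rather than a confirmed answer: "javascript:" is not a network protocol, so OpenURL has nothing to fetch. The link only tells the browser to run JumpToResultsPage, and whatever that function does (often setting a form field and submitting the form, or building a new URL) is what actually reaches the server. Reading the page's script source should show which request it makes. If it turns out to be a form POST, the program could build the same body itself. The field name "ResultsPage" below is hypothetical, as are the helper names urlEncode and buildFormBody; the real field names have to come from the site's own HTML and script.

```cpp
#include <cstdio>
#include <string>
#include <utility>
#include <vector>

// Percent-encode one name or value for an
// application/x-www-form-urlencoded request body.
std::string urlEncode(const std::string& in) {
    std::string out;
    for (unsigned char c : in) {
        const bool unreserved =
            (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
            (c >= '0' && c <= '9') ||
            c == '-' || c == '_' || c == '.' || c == '*';
        if (unreserved) {
            out += static_cast<char>(c);
        } else if (c == ' ') {
            out += '+';  // form encoding writes a space as '+'
        } else {
            char buf[4];  // '%' + two hex digits + terminator
            std::snprintf(buf, sizeof buf, "%%%02X", static_cast<unsigned>(c));
            out += buf;
        }
    }
    return out;
}

// Join name/value pairs into "a=1&b=2" form, encoding each part.
std::string buildFormBody(
    const std::vector<std::pair<std::string, std::string>>& fields) {
    std::string body;
    for (const auto& field : fields) {
        if (!body.empty()) body += '&';
        body += urlEncode(field.first) + '=' + urlEncode(field.second);
    }
    return body;
}

// Hypothetical body for page 3. "ResultsPage" is an assumed field name;
// substitute whatever JumpToResultsPage really sets on the site.
std::string bodyForPage(int page) {
    return buildFormBody({{"Page", "SEARCHRESULTS"},
                          {"ResultsPage", std::to_string(page)}});
}
```

    With MFC, a body like this could be sent by getting a CHttpConnection from the session (GetHttpConnection), opening a request with CHttpConnection::HTTP_VERB_POST, and calling CHttpFile::SendRequest with a "Content-Type: application/x-www-form-urlencoded" header and the body bytes. If the site tracks the search in a session cookie, reusing the same CInternetSession for the initial search and the page requests is probably necessary. If the function instead just loads a different URL, that URL can be passed to OpenURL as before.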
