
Thread: Search Engine

  1. #1
    Join Date
    Jun 1999
    Posts
    3

    Search Engine

    Please send me the source code for a search engine algorithm that searches for a string or a set of strings on the net when online, or on the networked machines when offline.


  2. #2
    Join Date
    May 2001
    Posts
    594

    Re: Search Engine

    What is this for?

    Do you want to search the whole web, or a particular site?

    Algorithm:

    foreach file to be indexed:
        strip the HTML out
        split the text on spaces
        foreach word:
            add the filename to the index entry for that word

    save the index to disk
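
    A minimal sketch of that indexing loop in Java, kept in memory only. The class and method names (SimpleIndexer, stripHtml, addFile) are made up for illustration, the tag stripping is a crude regular expression rather than a real HTML parser, and lower-casing each word is an added assumption so that 'Hello' and 'hello' land in the same entry. Punctuation is left attached to words, exactly as a plain split on whitespace would leave it.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical class name; nothing here comes from a real search library.
class SimpleIndexer {
    // word -> names of the files that contain it
    private final Map<String, Set<String>> index = new HashMap<>();

    // Crude tag stripper: replaces anything between '<' and '>' with a space.
    // Real HTML (scripts, comments, entities) needs a proper parser.
    static String stripHtml(String html) {
        return html.replaceAll("<[^>]*>", " ");
    }

    // Index one file: strip markup, split on whitespace, record each word.
    void addFile(String filename, String html) {
        for (String word : stripHtml(html).trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // an empty document yields a single empty token
            }
            // Assumption: words are lower-cased so lookups ignore case.
            index.computeIfAbsent(word.toLowerCase(), k -> new TreeSet<>())
                 .add(filename);
        }
    }

    Map<String, Set<String>> getIndex() {
        return index;
    }
}
```

    Calling addFile once per document and then reading getIndex() gives the word-to-filenames map that the rest of the thread relies on.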


    Then all your search engine does is read the index off the disk. If someone asks for the word 'hello', it looks that word up in the index, gets a list of filenames back, and shows them.
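
    One way to do the save, load and lookup steps, sketched in Java. The file layout (one line per word, tab-separated) and the names IndexStore, save, load and lookup are assumptions for this example, not a standard format. It also assumes that words and filenames never contain tabs or newlines.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for persisting a word -> filenames index.
class IndexStore {
    // Write one line per word: the word, then each filename, tab-separated.
    static void save(Map<String, Set<String>> index, Path file) throws IOException {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, Set<String>> e : index.entrySet()) {
            lines.add(e.getKey() + "\t" + String.join("\t", e.getValue()));
        }
        Files.write(file, lines, StandardCharsets.UTF_8);
    }

    // Read the file back into a map; lines without any filename are skipped.
    static Map<String, Set<String>> load(Path file) throws IOException {
        Map<String, Set<String>> index = new HashMap<>();
        for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
            String[] parts = line.split("\t");
            if (parts.length < 2) {
                continue;
            }
            index.put(parts[0],
                      new TreeSet<>(Arrays.asList(parts).subList(1, parts.length)));
        }
        return index;
    }

    // Single-word lookup; returns an empty set for unknown words.
    // Assumption: the index was built with lower-cased words.
    static Set<String> lookup(Map<String, Set<String>> index, String word) {
        return index.getOrDefault(word.toLowerCase(), Collections.emptySet());
    }
}
```

    For a large site you would want something more compact than a text file, but this is enough to separate the indexing run from the search front end.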

    If they enter 'hello world' then it does a search for hello and for world, and returns the intersection of those two sets first (files containing both words), followed by the union minus the intersection (files containing only one of them).
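
    A small Java sketch of that two-word ordering, assuming the index is a map from word to a set of filenames. The name MultiWordSearch is hypothetical, and reading the ordering as "both words first, then one word only" is my interpretation of the description above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: ranks files with both words ahead of files with one.
class MultiWordSearch {
    static List<String> search(Map<String, Set<String>> index,
                               String first, String second) {
        Set<String> a = index.getOrDefault(first, Collections.emptySet());
        Set<String> b = index.getOrDefault(second, Collections.emptySet());

        // Intersection: files that contain both words.
        Set<String> both = new TreeSet<>(a);
        both.retainAll(b);

        // Union minus intersection: files that contain exactly one word.
        Set<String> onlyOne = new TreeSet<>(a);
        onlyOne.addAll(b);
        onlyOne.removeAll(both);

        List<String> results = new ArrayList<>(both);
        results.addAll(onlyOne);
        return results;
    }
}
```

    With hello in a.html and b.html, and world in b.html and c.html, this returns b.html first, then a.html and c.html.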

    If you want to start doing things like "hello world" (i.e. a quoted phrase treated as one unit) then you'll also need to index sequences of adjacent words, not just single words.
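
    One simple way to index sets of words is to store every pair of adjacent words as its own key. The sketch below does that in Java. The name PhraseIndexer is hypothetical, only two-word phrases are handled, and the input is assumed to be text with the HTML already stripped.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical indexer for two-word phrases (adjacent word pairs).
class PhraseIndexer {
    // "word1 word2" -> names of the files where the two words are adjacent
    private final Map<String, Set<String>> pairs = new HashMap<>();

    void addFile(String filename, String text) {
        String[] words = text.toLowerCase().trim().split("\\s+");
        for (int i = 0; i + 1 < words.length; i++) {
            String key = words[i] + " " + words[i + 1];
            pairs.computeIfAbsent(key, k -> new TreeSet<>()).add(filename);
        }
    }

    // Normalises case and spacing in the query, then looks the pair up.
    Set<String> lookupPhrase(String phrase) {
        String key = String.join(" ", phrase.toLowerCase().trim().split("\\s+"));
        return pairs.getOrDefault(key, Collections.emptySet());
    }
}
```

    The pair index grows much faster than the single-word index, so longer phrases are usually handled by storing word positions instead. That is beyond what this sketch covers.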


    That's pretty simple, but good enough for a website's internal search, I reckon. For more power you can investigate fuzzy text matching, so that 'Hullo' will match 'Hello' as well.
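
    Fuzzy matching can be done in several ways. The sketch below uses Levenshtein edit distance, which is one common choice and not necessarily the best one for your data. The names FuzzyMatch, editDistance and similarWords are hypothetical, and the maxDistance threshold is something you would have to tune.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: finds indexed words close to a misspelt query.
class FuzzyMatch {
    // Levenshtein distance: minimum number of single-character insertions,
    // deletions and substitutions needed to turn s into t.
    static int editDistance(String s, String t) {
        int[] prev = new int[t.length() + 1];
        int[] curr = new int[t.length() + 1];
        for (int j = 0; j <= t.length(); j++) {
            prev[j] = j; // distance from the empty prefix of s
        }
        for (int i = 1; i <= s.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= t.length(); j++) {
                int cost = s.charAt(i - 1) == t.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[t.length()];
    }

    // Returns the indexed words within maxDistance edits of the query, sorted.
    // Assumption: indexed words are already lower-case.
    static List<String> similarWords(Collection<String> indexedWords,
                                     String query, int maxDistance) {
        String q = query.toLowerCase();
        List<String> matches = new ArrayList<>();
        for (String word : indexedWords) {
            if (editDistance(q, word) <= maxDistance) {
                matches.add(word);
            }
        }
        Collections.sort(matches);
        return matches;
    }
}
```

    Here 'Hullo' is one substitution away from 'hello', so a maxDistance of 1 finds it. Scanning every indexed word like this is fine for a small site but gets slow as the vocabulary grows.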

    If you want Java code that searches the whole web the way google.com does, just write a front end to Google. There's no way you can match them on your own.

    Bayard
    bayard@generationjava.com

    Brainbench MVP for Java
    http://www.brainbench.com
    http://www.apache.org/~bayard
    http://www.generationjava.com
