  1. #1
    Join Date
    Mar 2000
    Location
    Kaysville, UT
    Posts
    228

    Good algorithm for making spelling suggestions?

    I created a basic spell checker that works very well, and is very fast. Right now all it does is tell you a word isn't spelled correctly, and I'd like to expand it a bit.

    I created a class that returns suggested spellings for misspelled words, based on the SOUNDEX algorithm. There are some weaknesses to a pure SOUNDEX system, and I was wondering if there's a better algorithm out there that someone can suggest.

    "There's nothing more dangerous than a resourceful idiot." ---Dilbert
    BWAHAHAHAHAHAHA! ---Murray

  2. #2
    Join Date
    Jan 2001
    Location
    Germany
    Posts
    222

    Re: Good algorithm for making spelling suggestions?

    What is the SOUNDEX algorithm and where can I find information about it?

    ----------------
    You can contact me directly at [email protected]
    Hey, and... don't forget your parsley cause you can't eat your dog after having stolen him from some animal shelter and having drowned him in the Atlantic Ocean.
    Teamwork Software - Stuff That Does Something

  3. #3
    Join Date
    May 2001
    Posts
    594

    Re: Good algorithm for making spelling suggestions?

    Here are three possible alternatives, all available in PHP.

    Metaphone: an improved take on the Soundex concept.
    Levenshtein distance: returns an integer, the number of single-character edits (insertions, deletions, substitutions) needed to turn one string into another.
    Oliver [1993] algorithm: O(N^3). I don't know much more about this one.

    I can supply implementations for the first two, and would love to have an implementation of the last one.
    I would also like to know of any Soundex data for languages other than US English.

    Implementations are also available in my GenJavaCore library.

    http://www.generationjava.com/java/GenJavaCore.shtml
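    For readers who want to see what the Levenshtein distance mentioned above looks like in practice, here is a minimal Python sketch. It is not the PHP or GenJavaCore implementation. The suggest() helper, its max_distance cutoff, and the sample word list are assumptions made up for illustration, showing one way edit distance could be used to rank spelling suggestions.

```python
def levenshtein(a: str, b: str) -> int:
    """Number of single-character insertions, deletions, or substitutions
    needed to turn string a into string b."""
    # Keep the shorter string in the inner loop so the row stays small.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute, or match for free
            ))
        previous = current
    return previous[-1]


def suggest(word: str, dictionary: list[str], max_distance: int = 2) -> list[str]:
    """Hypothetical helper: dictionary words within max_distance edits of
    word, closest first, ties broken alphabetically."""
    scored = [(levenshtein(word, candidate), candidate) for candidate in dictionary]
    return [candidate for distance, candidate in sorted(scored)
            if distance <= max_distance]


if __name__ == "__main__":
    words = ["spelling", "spewing", "selling", "apple"]  # made-up word list
    print(suggest("speling", words))
```

    Scanning the whole dictionary for every misspelled word is slow for large word lists, so a real spell checker would likely narrow the candidates first, for example with a phonetic code.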

    Bayard
    [email protected]
    http://www.generationjava.com

    Brainbench MVP for Java
    http://www.brainbench.com

  4. #4
    Join Date
    Mar 2000
    Location
    Kaysville, UT
    Posts
    228

    Re: Good algorithm for making spelling suggestions?

    The SOUNDEX algorithm converts a word into a four-character code, one letter followed by three digits, that can be used for phonetic comparisons.

    More on it can be found at http://www.myatt.demon.co.uk/sxalg.htm
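    As a rough illustration of the encoding described above, here is a minimal Python sketch of the common American Soundex rules. It is not the class from the original post, and the handling of H and W follows the usual convention rather than anything stated in this thread.

```python
def soundex(word: str) -> str:
    """Return a Soundex code (one letter plus three digits) for word,
    or an empty string if word contains no letters."""
    # Consonant groups that share a digit under the usual English rules.
    groups = (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
              ("L", "4"), ("MN", "5"), ("R", "6"))
    codes = {ch: digit for letters, digit in groups for ch in letters}

    letters = [c for c in word.upper() if c.isalpha()]
    if not letters:
        return ""

    result = letters[0]                # the first letter is kept as-is
    previous = codes.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue                   # H and W do not separate equal codes
        code = codes.get(ch, "")       # vowels (and Y) have no code
        if code and code != previous:
            result += code             # skip repeats of the same code
        previous = code                # a vowel resets the run
    return (result + "000")[:4]        # pad or truncate to four characters


if __name__ == "__main__":
    for name in ("Robert", "Rupert", "Ashcraft", "Tymczak"):
        print(name, soundex(name))
```

    Robert and Rupert both come out as R163, which shows both the appeal and the weakness of the approach: words that sound alike share a code, but so do many words that are not plausible corrections of one another.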


  5. #5
    Join Date
    Mar 2000
    Location
    Kaysville, UT
    Posts
    228

    Re: Good algorithm for making spelling suggestions?

    Thanks for the lead. I got my hands on the specs for the Metaphone algorithm a few minutes before I saw your post, but I've never heard of the LevenshteinDistance algorithm. I'll take a look at it!


  6. #6
    Join Date
    May 2001
    Posts
    594

    Re: Good algorithm for making spelling suggestions?

    Any ideas on Soundex for languages other than US English?

    First for other varieties of English, then for other ASCII-based languages, and then for Unicode?


  7. #7
    Join Date
    Mar 2000
    Location
    Kaysville, UT
    Posts
    228

    Re: Good algorithm for making spelling suggestions?

    As long as you are familiar with the different sounds and tones of the language, I don't think it would be hard at all to adapt the Soundex algorithm to it. Romance languages would probably be the easiest. A language like Russian wouldn't be difficult, either.

