CodeGuru Home VC++ / MFC / C++ .NET / C# Visual Basic VB Forums Developer.com
Results 1 to 2 of 2
  1. #1
    Join Date
    Feb 2009
    Posts
    1

    String matching algorithm

    Hi,

    The problem I am trying to solve is to find the common sequences in a string. Say I have two strings

    aaabbbcccdddeee
    aaabbcccdddeeff

    I would like to devise a measure for how similar these two strings are. So the strings aaa, ccc, ddd would match and might get scores of 1 each. bb and ee match slightly with bbb and eee so these might get scores of .5 each. ff doesn't match anything. Then we get a total score of 4 for how similar the two strings are. The thing to note is that we're allowed to offset the values so you can see that the ccc and ddd values match even though they are offset by a bit.

    The overall context of the problem is for analyzing car drive data. If I have two car drives along streets [A, A, A, B, B, C] (values are repeated because for each lat/lon coordinate in the drive I have a street segment that this point corresponds to) and another one along [A, A, A, D, D, B, B, C], I'd like to say whether these two drives are the same. In this case they're about the same, except we deviate off to drive [D, D] for a bit. I'd like to be able to measure this.

    Thanks in advance.

  2. #2
    Join Date
    Aug 2001
    Location
    Stockholm, Sweden
    Posts
    1,664

    Re: String matching algorithm

    Try searching on fuzzy string matching... Wikipedia gives you this:
    http://en.wikipedia.org/wiki/Bitap_algorithm

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  





Click Here to Expand Forum to Full Width

Featured