Quickest way to determine if an array element exists in a string

  1. #1
    Join Date
    Mar 2013
    Location
    Terlingua, Texas, USA
    Posts
    11

    Quickest way to determine if an array element exists in a string

    I've been looking for most of the day so far and can find a plethora of ways to determine if a string is present in an array element, but that's not what I'm looking for.

    I have a string array that contains authors' names, about 175,000 of them.
    I have a string that contains (possibly) a book title, series and author name, though not necessarily in that order.

    Currently I'm taking the brute force approach (code follows), but there has to be a better way and I'm too involved with the results to see it. Since I'm constantly looking at this array in my code, I'm spending almost 35% of my time in this code, so any improvement would be beneficial.

    Code:
       Public Function FindAuthors(ByVal file As String) As String
            '   Getting the same name twice (or more?) possibly if the name appears in 
            '   different forms on the same title? -- Fix 1
            If file = "" Then Return ""
            If file.Substring(0, 1) = "*" Then Return file  ' if the author is an "*" then it's a special case possibly no author, or a magazine ...
            Dim loc As Integer
            Dim strTemp As String = CleanInput(file)  ' regex remove all except alphanumeric and white space
            Dim strTempAuthor As String = Nothing
            Dim myOptions As StringComparison = StringComparison.CurrentCultureIgnoreCase
            ' TODO: need to be able to eliminate names which are partials, eve adam <-> steve adams
    
            Dim strAuthorsOut As String = Nothing
            For Each kvp As KeyValuePair(Of String, Integer) In AuthorSearchDict
                 If kvp.Key <> "" Then
                    loc = strTemp.IndexOf(kvp.Key)
                    If loc > -1 Then
                        strTempAuthor = IIf(cbFNF.Checked, AuthorDict(AuthorSearchDict(kvp.Key)).FNL, AuthorDict(AuthorSearchDict(kvp.Key)).LNF).ToString
                        If strAuthorsOut = Nothing Then
                            strAuthorsOut &= strTempAuthor & "; "
                        ElseIf strAuthorsOut.IndexOf(strTempAuthor, myOptions) = -1 Then      'fix 1
                            strAuthorsOut &= strTempAuthor & "; "
                        End If
                    End If
                End If
            Next
            If Len(strAuthorsOut) > 0 Then strAuthorsOut = Microsoft.VisualBasic.Left(strAuthorsOut, Len(strAuthorsOut) - 2)
            Return strAuthorsOut
        End Function
    Any ideas, short of continuing with brute force?

  2. #2
    Join Date
    Mar 2013
    Location
    Terlingua, Texas, USA
    Posts
    11

    Re: Quickest way to determine if an array element exists in a string

    I should have noted that this is being done in VB 2005.
    The dictionary (not an array, though I treat it pretty much as such) is searched, and kvp.Value is the index to the desired name in the master array (another dictionary).

  3. #3
    Join Date
    Jan 2006
    Location
    Chicago, IL
    Posts
    14,873

    Re: Quickest way to determine if an array element exists in a string

    I'd use SQL Server to store and search the data.
    David

    CodeGuru Article: Bound Controls are Evil-VB6
    2013 Samples: MS CODE Samples

    CodeGuru Reviewer
    2006 Dell CSP
    2006, 2007 & 2008 MVP Visual Basic
    If your question has been answered satisfactorily, and it has been helpful, then, please, Rate this Post!

  4. #4
    Join Date
    Mar 2013
    Location
    Terlingua, Texas, USA
    Posts
    11

    Re: Quickest way to determine if an array element exists in a string

    Remember, David, I'm looking to see whether any (one or more) element in the dictionary is in the string. I've had a fellow who uses SQL on a daily basis say the same thing; then, when I passed him the data, he couldn't do it.
    My last database work was back on an Hp300, and I haven't had the opportunity to learn or use SQL. If you feel it could be done, can you offer some pointers as to how?

  5. #5
    Join Date
    Jan 2006
    Location
    Chicago, IL
    Posts
    14,873

    Re: Quickest way to determine if an array element exists in a string

    or break names down into firstname/lastname as separate lists, and compare once.
    David

  6. #6
    Join Date
    Mar 2013
    Location
    Terlingua, Texas, USA
    Posts
    11

    Re: Quickest way to determine if an array element exists in a string

    Quote: Originally posted by dglienna
    or break names down into firstname/lastname as separate lists, and compare once.
    Respectfully, David, either you don't understand what I'm trying to do or I'm too dense to understand what you're suggesting. Your one-line comments are worth nothing as I see it. //al

  7. #7
    Join Date
    Aug 2009
    Location
    NW USA
    Posts
    173

    Re: Quickest way to determine if an array element exists in a string

    What are the minimum and maximum numbers of "words" in the AuthorSearchDict keys (i.e. FirstName, MiddleName, LastName, etc.)? Maybe 2, 3 or 4? Take your input string and create sequential "names" from the string. Say the minimum is two words (at least a first and last name) and the maximum is four. Then "Call of the Wild by Jack London" becomes these possible "names":

    Call of
    of the
    the Wild
    Wild by
    by Jack
    Jack London
    Call of the
    of the Wild
    the Wild by
    Wild by Jack
    by Jack London
    Call of the Wild
    of the Wild by
    the Wild by Jack
    Wild by Jack London

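    Then do a Dictionary.ContainsKey on each of these possibilities. I don't know if it's faster, but it is different. This would also fix your To-Do, since whole-word candidates cannot match a partial name such as "eve adam" inside "steve adams".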
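
    The idea in this post can be sketched roughly as follows. This is only an illustration, not code from the thread: the module name, BuildCandidates, and the small searchDict (standing in for AuthorSearchDict, assumed here to hold lower-cased keys) are all hypothetical. With a minimum of two and a maximum of four words, the seven-word example title yields the 15 candidates listed above.

    ```vbnet
    Imports System
    Imports System.Collections.Generic

    Module NameCandidateSketch
        ' Hypothetical helper: returns every run of minWords..maxWords
        ' consecutive words from the input, lower-cased for lookup.
        Function BuildCandidates(ByVal title As String, ByVal minWords As Integer, ByVal maxWords As Integer) As List(Of String)
            ' An empty separator array makes Split break on any white space.
            Dim words As String() = title.Split(New Char() {}, StringSplitOptions.RemoveEmptyEntries)
            Dim result As New List(Of String)
            For size As Integer = minWords To Math.Min(maxWords, words.Length)
                For start As Integer = 0 To words.Length - size
                    result.Add(String.Join(" ", words, start, size).ToLower())
                Next
            Next
            Return result
        End Function

        Sub Main()
            ' Assumed stand-in for AuthorSearchDict: lower-cased name -> author index.
            Dim searchDict As New Dictionary(Of String, Integer)
            searchDict.Add("jack london", 0)
            searchDict.Add("london jack", 0)

            For Each candidate As String In BuildCandidates("Call of the Wild by Jack London", 2, 4)
                Dim index As Integer
                ' One hash lookup per candidate instead of one IndexOf per author.
                If searchDict.TryGetValue(candidate, index) Then
                    Console.WriteLine(candidate & " -> author #" & index.ToString())
                End If
            Next
        End Sub
    End Module
    ```

    The number of lookups depends only on the number of words in the title, not on the number of entries in the dictionary, which is why this can beat scanning all of the names for every title.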

  8. #8
    Join Date
    Mar 2013
    Location
    Terlingua, Texas, USA
    Posts
    11

    Re: Quickest way to determine if an array element exists in a string

    Mur16, now that's an interesting thought. Hmmm, he says ... this sequential search is becoming entirely too long.
    The minimum is two, but there are some that are considerably longer. Most of the book titles, even assuming a long title, could be broken down easily, and ContainsKey has got to be faster than my sequential search ... rambling ...
    That is a very interesting thought, appreciate it much ... as he wanders off to consider ramifications ...

    Just to explain the odd logic already in place ... Authors.txt looks like:
    <author lnf>, <author fnl>, <author name variations> for example:
    le Carre, John, John le Carre, Carre, John le, John leCarre
    These are extracted into a two-dimensional Author array of:
    <author lnf>, <author fnl>
    and a dict AuthorSearch of:
    <name>, <Author Index>
    Last edited by AlJones; March 22nd, 2013 at 05:44 PM.
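
    The layout described in this post (a table of canonical names plus a search dictionary mapping every name variant to an index) could be sketched as below. Everything here is an assumption made for illustration: AuthorEntry, AddAuthor and CanonicalNames are hypothetical names, the keys are assumed to be lower-cased with punctuation already removed, and parsing of Authors.txt itself is left out. Collecting matches by index means an author found under two different spellings is listed only once.

    ```vbnet
    Imports System
    Imports System.Collections.Generic

    Module AuthorLookupSketch
        ' Hypothetical record holding both display forms of one author.
        Class AuthorEntry
            Public LNF As String   ' last name first
            Public FNL As String   ' first name last
            Sub New(ByVal lnfName As String, ByVal fnlName As String)
                LNF = lnfName
                FNL = fnlName
            End Sub
        End Class

        Private authors As New List(Of AuthorEntry)
        Private searchDict As New Dictionary(Of String, Integer)

        ' Registers one author; every search key maps to the same index.
        Sub AddAuthor(ByVal lnfName As String, ByVal fnlName As String, ByVal ParamArray searchKeys() As String)
            Dim index As Integer = authors.Count
            authors.Add(New AuthorEntry(lnfName, fnlName))
            For Each name As String In searchKeys
                searchDict(name.ToLower()) = index
            Next
        End Sub

        ' Maps matched names back to canonical names, listing each author once.
        Function CanonicalNames(ByVal matchedNames As String(), ByVal firstNameFirst As Boolean) As String
            Dim seen As New Dictionary(Of Integer, Boolean)
            Dim output As New List(Of String)
            For Each name As String In matchedNames
                Dim index As Integer
                If searchDict.TryGetValue(name.ToLower(), index) AndAlso Not seen.ContainsKey(index) Then
                    seen.Add(index, True)
                    If firstNameFirst Then
                        output.Add(authors(index).FNL)
                    Else
                        output.Add(authors(index).LNF)
                    End If
                End If
            Next
            Return String.Join("; ", output.ToArray())
        End Function

        Sub Main()
            AddAuthor("le Carre, John", "John le Carre", "john le carre", "carre john le", "john lecarre")
            ' Two spellings of the same author were matched in one title.
            Dim matches As String() = New String() {"John le Carre", "john lecarre"}
            Console.WriteLine(CanonicalNames(matches, True))   ' prints "John le Carre"
        End Sub
    End Module
    ```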

  9. #9
    Join Date
    Jan 2006
    Location
    Chicago, IL
    Posts
    14,873

    Re: Quickest way to determine if an array element exists in a string

    XML to SQL isn't that hard. You are searching for a one-to-many relationship, which is common database terminology.
    David

  10. #10
    Join Date
    Mar 2013
    Location
    Terlingua, Texas, USA
    Posts
    11

    Re: Quickest way to determine if an array element exists in a string

    I've got to do a little house cleaning, but this works like a champ - loading one of the other tables, where I do an author look-up, went so fast I thought something was wrong!
    Code:
            If file = "" Then Return ""
            If file.Substring(0, 1) = "*" Then Return file

            ' Split the cleaned input (rather than the raw string) into words;
            ' an empty separator array makes Split break on any white space.
            Dim strTemp As String = CleanInput(file)
            Dim words As String() = strTemp.Split(New Char() {}, StringSplitOptions.RemoveEmptyEntries)
            Dim names As New List(Of String)

            ' Try every run of j consecutive words (j = 2 up to all words) as a candidate name.
            For j As Integer = 2 To words.Length
                For i As Integer = 0 To words.Length - j
                    Dim strTempAuthor As String = String.Join(" ", words, i, j)
                    If AuthorSearchDict.ContainsKey(strTempAuthor.ToLower) Then names.Add(strTempAuthor)
                Next
            Next

            ' Join the matches with "; " (no matches yields an empty string).
            Return String.Join("; ", names.ToArray())
    I'd be more than glad to take any other constructive suggestions you might have!
    Last edited by AlJones; March 24th, 2013 at 09:31 AM.
