CodeGuru Home VC++ / MFC / C++ .NET / C# Visual Basic VB Forums Developer.com
Results 1 to 2 of 2
  1. #1
    Join Date
    Jan 2013
    Posts
    1

    Finding deleted items in an ordered list?

    Hello,

    is there a good algorithm for finding deleted items in an ordered list?
    The ids are autoincremented such that a deleted id will never be used (un-deleted) again.

    Example:

    t=1
    source list: 1,2,3,4,5,6,7,8,9
    target list: 1,2,3,4,5,6,7,8,9
    first sync: source is equal target

    t=2
    source list: 1,2,3,5,6,8,9 (4 and 7 were deleted)
    target list: 1,2,3,4,5,6,7,8,9
    challenge: find the deleted items in the source and delete them in the target


    Current algorithm is very simple but slow:

    1. save the start time of the iteration
    2. iterate through all items of the source list and update the last check timestamp in the target
    3. consider all items with last check timestamp older than the start time from 1 as deleted
    4. double check that the items do not exist and delete them in the target

    In the real application the length of the list is round about 10^6 items. Iterating through all items takes some time and produces much traffic.

    Is there an algorithm that could do it faster?
    I am thinking about some recursion with divide and conquer approach...

  2. #2
    Join Date
    Feb 2011
    Location
    United States
    Posts
    1,016

    Re: Finding deleted items in an ordered list?

    Nothing seems fundamentally wrong with your algorithm. Execution time is O(n). I don't think you can improve that by divide and conquer.

    I think the timestamping is a little unusual though. Instead, I think something like this will work:

    Supposing two lists of numbers (src and tgt) and two indexes into those lists (srcIndex, tgtIndex):

    Code:
    srcIndex = 0;
    tgtIndex = 0;
    
    loop:
        if target[tgtIndex] < src[srcIndex]:
            delete target[tgtIndex]
            tgtIndex = tgtIndex + 1
    
        else if srcIndex[srcIndex] < tgt[tgtIndex]:
            srcIndex++
    
        else: //Equal
            tgtIndex++
            srcIndex++
    Pretty much you step through the lists. On each iteration, if the value you're looking at in the source list (src[srcIndex]) is bigger than the value you're looking at in the target list tgt[tgtIndex], you know you've already "passed" where it should be in the source list, and you can delete it from target.

    On the other hand if the source list value is smaller, just increment the value you're looking at until you either (a) exactly match the target value or (b) pass the target value. On the next iteration: if (a): don't delete, because we found the target value in the source list; if (b) do delete, because we passed it.

    I think there is no logic bug there. You might just try to implement it and check. Nuzzle, look OK to you?
    Best Regards,

    BioPhysEngr
    http://blog.biophysengr.net
    --
    All advice is offered in good faith only. You are ultimately responsible for effects of your programs and the integrity of the machines they run on.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  





Click Here to Expand Forum to Full Width

Featured