  1. #1
    Join Date
    Nov 2004
    Posts
    6

    How to compare two files?

    Hi! I just want to ask whether you have any ideas for identifying the differences between two files.
    I have 2 huge files (of names) and I want to find out whether there are duplicates, and then merge them.

    Thank you in advance.

    benjz

  2. #2
    Join Date
    Jul 2003
    Location
    Florida
    Posts
    651

    Re: how to compare two files?

    There's a program called CSDiff (free download) that you can use. It takes command-line parameters, which you can easily pass to it from VB code using Shell (or something similar).
    I'd rather be wakeboarding...

  3. #3
    Join Date
    Sep 2002
    Location
    England
    Posts
    530

    Re: how to compare two files?

    Hi

    If SourceSafe is installed on the target PC (or maybe even on a remote PC you can connect to, though I'm not too sure about that), you can run ss commands from a command prompt (shell it) to identify differences and merge them, if there are any. Alternatively, you could set a reference to the Microsoft SourceSafe 6.0 type library in your VB project. I haven't tried that myself, but I'm fairly sure it'll work.

  4. #4
    Join Date
    Nov 2004
    Posts
    6

    Re: how to compare two files?

    Actually, what I want to do is identify the duplicates across two files/tables, then merge them into 1 file/table. My apologies if my scenario wasn't clear. I hope you can give some helpful tips. Thank you very much! =)

  5. #5
    Join Date
    Oct 2003
    Location
    .NET2.0 / VS2005 Developer
    Posts
    7,104

    Re: how to compare two files?

    Well now, you're either talking about files or you're talking about tables. You're going to have to be more specific.

    File comparison is usually done at the binary level, whereby two files are scanned sequentially from offset zero (the start). As soon as two bytes are encountered that differ, you have a decision to make as to what to do.

    Most file comparators use a simple difference engine that compares bytes at the same absolute offset, meaning that the following files (read as a stream, left to right) are approximately 50% different:

    abcdefA
    abcAdef


    d <> A, and every byte after that differs as well: e <> d, f <> e ...
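
    A minimal sketch of that offset-by-offset comparison, in Python rather than VB. The function name positional_difference is made up for illustration, and counting a missing trailing byte as a difference is an assumption, not something a particular tool is known to do:

```python
from itertools import zip_longest


def positional_difference(a: bytes, b: bytes) -> float:
    """Return the fraction of offsets at which the two byte strings differ."""
    if not a and not b:
        return 0.0
    # zip_longest pads the shorter input with None, so any trailing
    # bytes in the longer input are counted as differences (assumption).
    mismatches = sum(1 for x, y in zip_longest(a, b) if x != y)
    return mismatches / max(len(a), len(b))


ratio = positional_difference(b"abcdefA", b"abcAdef")
print(f"{ratio:.0%} of offsets differ")  # prints "57% of offsets differ"
```

    Offsets 3 to 6 differ, so 4 of the 7 bytes count as different, which is the "approximately 50%" figure.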

    Other file comparators work on a line-by-line basis (source code control tools, for example).
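
    For a feel of what a line-by-line comparison reports, here is a small sketch using Python's standard difflib module. The sample names and the a.txt / b.txt labels are invented for illustration; this is not how any particular source control tool is implemented:

```python
import difflib

# Two small "files" as lists of lines (made-up sample data).
old = ["alice\n", "bob\n", "carol\n"]
new = ["alice\n", "bobby\n", "carol\n"]


def line_diff(a_lines, b_lines):
    """Return unified-diff lines describing how a_lines differs from b_lines."""
    return list(
        difflib.unified_diff(a_lines, b_lines, fromfile="a.txt", tofile="b.txt")
    )


diff = line_diff(old, new)
# Removed lines are prefixed with "-", added lines with "+".
print("".join(diff), end="")
```

    Identical inputs give an empty diff, so an empty result is a quick "the files match" check.
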
    Some smart comparators use a gapping technique: upon finding a difference, each file is searched from that point for the byte found in the other file. Whichever search succeeds first determines where the comparison continues from. So, for our files above:


    abcdefA
    abcAdef

    d <> A, so the algorithm looks for A in the first file and for d in the second file. It finds d in the second file after searching 1 byte ahead (A is 3 bytes ahead in the first file), so the comparison continues with the second file offset by +1 byte, and the first file is not offset.

    So the first 3 bytes, abc, match in both files.
    Then d <> A, and since d is found in the second file at +1 byte, a gap of 1 byte is "inserted" into the first file:

    abc_defA
    abcAdef


    As you can see, the difference engine will then find another match, "def", in both files, and it would come to the conclusion that the files are 80-90% identical.
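
    The gapping idea can be sketched roughly as follows. This is a naive Python illustration under my own assumptions, not the algorithm of any particular diff tool: the name gap_compare is hypothetical, ties go to the second file, a byte found in neither file is treated as a plain substitution, and leftover bytes at the end are padded with gaps (so the second file gets a trailing "_" here):

```python
def gap_compare(a: bytes, b: bytes):
    """Align two byte strings by inserting gaps ("_") at mismatches.

    Returns (aligned_a, aligned_b, matches).
    """
    i = j = matches = 0
    out_a, out_b = [], []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out_a.append(chr(a[i]))
            out_b.append(chr(b[j]))
            matches += 1
            i += 1
            j += 1
            continue
        # Search ahead in each file for the byte seen in the other file.
        pos_a = a.find(b[j:j + 1], i)  # where b's byte shows up in a, or -1
        pos_b = b.find(a[i:i + 1], j)  # where a's byte shows up in b, or -1
        if pos_a == -1 and pos_b == -1:
            # Neither byte occurs later: treat as a substitution (assumption).
            out_a.append(chr(a[i]))
            out_b.append(chr(b[j]))
            i += 1
            j += 1
        elif pos_b != -1 and (pos_a == -1 or pos_b - j <= pos_a - i):
            # a's byte is found sooner in b: skip ahead in b, gap in a.
            out_a.append("_" * (pos_b - j))
            out_b.append(b[j:pos_b].decode("latin-1"))
            j = pos_b
        else:
            # b's byte is found sooner in a: skip ahead in a, gap in b.
            out_a.append(a[i:pos_a].decode("latin-1"))
            out_b.append("_" * (pos_a - i))
            i = pos_a
    # At most one file has bytes left; pad the other with gaps.
    out_a.append(a[i:].decode("latin-1") + "_" * (len(b) - j))
    out_b.append("_" * (len(a) - i) + b[j:].decode("latin-1"))
    return "".join(out_a), "".join(out_b), matches


aligned_a, aligned_b, matches = gap_compare(b"abcdefA", b"abcAdef")
print(aligned_a)  # abc_defA
print(aligned_b)  # abcAdef_
print(f"{matches} of 7 bytes match")  # 6 of 7 bytes match
```

    6 matching bytes out of 7 is about 86%, which lands in the 80-90% range.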



    --

    Comparing tables in a database is another matter entirely. You can use joins to tell you how identical two tables are, but you need to clarify your question.
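
    If the data does turn out to live in tables, the join approach might look something like this. It is only a sketch using Python's built-in sqlite3 module with an in-memory database; the table names names_a and names_b, the single name column, and the sample rows are all made up, and the real schema will differ:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE names_a (name TEXT);
    CREATE TABLE names_b (name TEXT);
    INSERT INTO names_a VALUES ('alice'), ('bob'), ('carol');
    INSERT INTO names_b VALUES ('bob'), ('carol'), ('dave');
    """
)

# An inner join keeps only the names present in both tables: the duplicates.
duplicates = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT a.name FROM names_a a "
        "JOIN names_b b ON a.name = b.name ORDER BY a.name"
    )
]

# UNION (without ALL) drops repeated rows, giving one merged, de-duplicated list.
merged = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM names_a UNION SELECT name FROM names_b ORDER BY name"
    )
]

print(duplicates)  # ['bob', 'carol']
print(merged)      # ['alice', 'bob', 'carol', 'dave']
```

    Comparing the duplicate count against each table's row count gives a rough measure of how identical the two tables are.
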
    "it's a fax from your dog, Mr Dansworth. It looks like your cat" - Gary Larson
