Merge file sort C++ realization.
CodeGuru Home VC++ / MFC / C++ .NET / C# Visual Basic VB Forums Developer.com
Results 1 to 10 of 10

Thread: Merge file sort C++ realization.

  1. #1
    Join Date
    Jun 2009
    Posts
    8

    Merge file sort C++ realization.

    Hello,

    For example, I have files with sorted numbers. One way to merge them and save a sorted order is to use a merge sort algorithm. Where a C++ realization of the algorithm (for files, not for memory) can be found ?

  2. #2
    Join Date
    May 2009
    Posts
    23

    Re: Merge file sort C++ realization.

    This was like my assignment long time ago when I was studying.

    Here the algorithm, u can adjust it as needed.

    1. Create 27 integer random numbers, and real numbers, and store them to a new file
    merge.dat
    2. Read the first three integers & reals from merge.dat, sort and store them to a
    new file merge.1, then close the merge.1
    3. Read the second three integers & reals from merge.dat, sort and store them to a
    new file merge.2, then close the merge.2
    4. Keep going until all numbers have been read (the last file is merge.9).
    5. Read the first integer from merge.1, merge.2, and merge.3, sort them, and store them to
    a new file merge.10.
    6. Keep going untill all numbers from each file have been read. (the last file is merge,13).

    In my assignment I was created 27 numbers only.

  3. #3
    Join Date
    Jan 2006
    Location
    Singapore
    Posts
    6,250

    Re: Merge file sort C++ realization.

    I think binyo66 suggested a basic, workable, idea. If it is feasible to sort the contents of each file in memory (e.g., by reading into a std::vector and using std::sort or std::stable_sort), then you might consider doing so and then write the contents to file, upon which you can apply std::merge to (repeatedly) merge the contents of the various files to new destination files.
    C + C++ Compiler: MinGW port of GCC
    Build + Version Control System: SCons + Bazaar

    Look up a C/C++ Reference and learn How To Ask Questions The Smart Way
    Kindly rate my posts if you found them useful

  4. #4
    Join Date
    Jun 2009
    Posts
    8

    Re: Merge file sort C++ realization.

    Quote Originally Posted by laserlight View Post
    I think binyo66 suggested a basic, workable, idea. If it is feasible to sort the contents of each file in memory (e.g., by reading into a std::vector and using std::sort or std::stable_sort), then you might consider doing so and then write the contents to file, upon which you can apply std::merge to (repeatedly) merge the contents of the various files to new destination files.
    But... is it possible to merge files using std::merge?

  5. #5
    Join Date
    Jun 2009
    Posts
    19

    Re: Merge file sort C++ realization.

    But the original poster said he has those sorted files already... so it's more feasible, if the files are large, not to store the files in memory, but just read in the integers one by one, like so:
    for each file as F: heap_insert(readnext(F),F);
    while heap_pop_min as (M,X): output(M); heap_insert(readnext(X),X);

    This would be O(N log M) where N is the total number of elements, M is the number of files.
    Last edited by prime1999; July 11th, 2009 at 12:25 AM. Reason: correcting a mistake

  6. #6
    Join Date
    Jan 2006
    Location
    Singapore
    Posts
    6,250

    Re: Merge file sort C++ realization.

    Quote Originally Posted by prime1999
    But the original poster said he has those sorted files already... so it's more feasible, if the files are large, not to store the files in memory, but just read in the integers one by one
    Good point, in which case std::merge can be used immediately. The problem is no longer really one of sorting, but one of merging.

    Quote Originally Posted by alikTi
    But... is it possible to merge files using std::merge?
    Yes, e.g., given two open input streams in1 and in2, and an open output stream out:
    Code:
    std::merge(std::istream_iterator<int>(in1), std::istream_iterator<int>(),
        std::istream_iterator<int>(in2), std::istream_iterator<int>(),
        std::ostream_iterator<int>(out, " "));
    where the contents of the files represented by in1 and in2 are sorted integers separated by whitespace.
    C + C++ Compiler: MinGW port of GCC
    Build + Version Control System: SCons + Bazaar

    Look up a C/C++ Reference and learn How To Ask Questions The Smart Way
    Kindly rate my posts if you found them useful

  7. #7
    Join Date
    Jun 2009
    Posts
    19

    Re: Merge file sort C++ realization.

    But I doubt std::merge can merge more than two files efficiently, so if you want efficiency, you have to use the heap method.

  8. #8
    Join Date
    Jun 2009
    Posts
    8

    Re: Merge file sort C++ realization.

    Is it possible to read the data and place it to a file on the right place immediately (i.e. don't disturb the order of sorting?)

  9. #9
    Join Date
    Jun 2009
    Posts
    19

    Re: Merge file sort C++ realization.

    Yes... the heap method does that.

  10. #10
    Join Date
    Jul 2009
    Posts
    7

    Re: Merge file sort C++ realization.

    If you have a fixed number of files with already sorted data, and you want to write a new file with the sorted data, why not just do like this (sorry for the Python-like pseudocode, I'm just sketching here)

    Code:
    vals = heap()
    while true:
      for f in inFiles:
        if eof(f): continue
        vals.add(f.read())
      if empty(vals): break
      outFile.write(vals.pop())
    The point is, you don't have to store the whole result in memory, as you know that the lowest entry so far is the globally lowest not yet written to file.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  


Azure Activities Information Page

Windows Mobile Development Center


Click Here to Expand Forum to Full Width

This is a CodeGuru survey question.


Featured


HTML5 Development Center