-
July 1st, 2009, 02:01 AM
#1
Merge file sort C++ realization.
Hello,
For example, I have files with sorted numbers. One way to merge them and save a sorted order is to use a merge sort algorithm. Where a C++ realization of the algorithm (for files, not for memory) can be found ?
-
July 1st, 2009, 03:35 AM
#2
Re: Merge file sort C++ realization.
This was like my assignment long time ago when I was studying.
Here the algorithm, u can adjust it as needed.
1. Create 27 integer random numbers, and real numbers, and store them to a new file
merge.dat
2. Read the first three integers & reals from merge.dat, sort and store them to a
new file merge.1, then close the merge.1
3. Read the second three integers & reals from merge.dat, sort and store them to a
new file merge.2, then close the merge.2
4. Keep going until all numbers have been read (the last file is merge.9).
5. Read the first integer from merge.1, merge.2, and merge.3, sort them, and store them to
a new file merge.10.
6. Keep going untill all numbers from each file have been read. (the last file is merge,13).
In my assignment I was created 27 numbers only.
-
July 2nd, 2009, 03:50 AM
#3
Re: Merge file sort C++ realization.
I think binyo66 suggested a basic, workable, idea. If it is feasible to sort the contents of each file in memory (e.g., by reading into a std::vector and using std::sort or std::stable_sort), then you might consider doing so and then write the contents to file, upon which you can apply std::merge to (repeatedly) merge the contents of the various files to new destination files.
-
July 2nd, 2009, 04:10 AM
#4
Re: Merge file sort C++ realization.
Originally Posted by laserlight
I think binyo66 suggested a basic, workable, idea. If it is feasible to sort the contents of each file in memory (e.g., by reading into a std::vector and using std::sort or std::stable_sort), then you might consider doing so and then write the contents to file, upon which you can apply std::merge to (repeatedly) merge the contents of the various files to new destination files.
But... is it possible to merge files using std::merge?
-
July 2nd, 2009, 07:33 AM
#5
Re: Merge file sort C++ realization.
But the original poster said he has those sorted files already... so it's more feasible, if the files are large, not to store the files in memory, but just read in the integers one by one, like so:
for each file as F: heap_insert(readnext(F),F);
while heap_pop_min as (M,X): output(M); heap_insert(readnext(X),X);
This would be O(N log M) where N is the total number of elements, M is the number of files.
Last edited by prime1999; July 11th, 2009 at 12:25 AM.
Reason: correcting a mistake
-
July 2nd, 2009, 01:46 PM
#6
Re: Merge file sort C++ realization.
Originally Posted by prime1999
But the original poster said he has those sorted files already... so it's more feasible, if the files are large, not to store the files in memory, but just read in the integers one by one
Good point, in which case std::merge can be used immediately. The problem is no longer really one of sorting, but one of merging.
Originally Posted by alikTi
But... is it possible to merge files using std::merge?
Yes, e.g., given two open input streams in1 and in2, and an open output stream out:
Code:
std::merge(std::istream_iterator<int>(in1), std::istream_iterator<int>(),
std::istream_iterator<int>(in2), std::istream_iterator<int>(),
std::ostream_iterator<int>(out, " "));
where the contents of the files represented by in1 and in2 are sorted integers separated by whitespace.
-
July 4th, 2009, 08:56 AM
#7
Re: Merge file sort C++ realization.
But I doubt std::merge can merge more than two files efficiently, so if you want efficiency, you have to use the heap method.
-
July 7th, 2009, 03:47 AM
#8
Re: Merge file sort C++ realization.
Is it possible to read the data and place it to a file on the right place immediately (i.e. don't disturb the order of sorting?)
-
July 11th, 2009, 12:26 AM
#9
Re: Merge file sort C++ realization.
Yes... the heap method does that.
-
July 21st, 2009, 12:50 PM
#10
Re: Merge file sort C++ realization.
If you have a fixed number of files with already sorted data, and you want to write a new file with the sorted data, why not just do like this (sorry for the Python-like pseudocode, I'm just sketching here)
Code:
vals = heap()
while true:
for f in inFiles:
if eof(f): continue
vals.add(f.read())
if empty(vals): break
outFile.write(vals.pop())
The point is, you don't have to store the whole result in memory, as you know that the lowest entry so far is the globally lowest not yet written to file.
Posting Permissions
- You may not post new threads
- You may not post replies
- You may not post attachments
- You may not edit your posts
-
Forum Rules
|
Click Here to Expand Forum to Full Width
|