-
October 2nd, 2009, 04:21 PM
#1
Reading large text files
Hi all,
Just wondering if anyone has any advice on dealing with very large text files in C++? Basically I'm working on an app that has to scan through and pull out values from a text file which can potentially be over 20MB in size. Is ifstream good enough to load into a buffer, or is there a better way?
Thanks,
Payne747
-
October 2nd, 2009, 04:23 PM
#2
Re: Reading large text files
20MB? An ifstream should be fine.
Once you get into the 100MB+ range, you might consider memory-mapping the file instead.
-
October 2nd, 2009, 04:52 PM
#3
Re: Reading large text files
Thanks, I've just been doing some tests and it's working fine so far. I've read the file into a dynamically allocated char array, then copied that into a string (easier to manipulate):
char * buffer;
buffer = new char[fileSize];
fileIn.read(buffer, fileSize);
fileIn.close();
// convert to string then delete buffer
string sysinfo = buffer;
delete buffer;
Is there anything dangerously wrong with this?
Thanks,
Payne747
-
October 2nd, 2009, 05:18 PM
#4
Re: Reading large text files
Well, you should be using delete[] rather than delete. Personally I'd use a vector<char> for that purpose.
How are you getting the file size? Seek to the end and tellg()?
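For what it's worth, here is a rough sketch of the vector<char> approach combined with the seek-to-end/tellg() size check. The function name readWholeFile and the error handling are my own choices for illustration, not anything from the original code:

```cpp
#include <cstddef>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: reads an entire file into a vector<char>.
// The vector releases its memory automatically, so there is no delete[] to forget.
std::vector<char> readWholeFile(const std::string& path)
{
    std::ifstream fileIn(path.c_str(), std::ios::in | std::ios::binary);
    if (!fileIn)
        throw std::runtime_error("cannot open " + path);

    // Seek to the end and ask for the position to get the size in bytes.
    fileIn.seekg(0, std::ios::end);
    const std::streamoff fileSize = fileIn.tellg();
    if (fileSize < 0)
        throw std::runtime_error("cannot determine size of " + path);
    fileIn.seekg(0, std::ios::beg);  // rewind before reading

    std::vector<char> buffer(static_cast<std::size_t>(fileSize));
    if (!buffer.empty())
        fileIn.read(&buffer[0], fileSize);

    // Keep only the bytes that were actually read.
    buffer.resize(static_cast<std::size_t>(fileIn.gcount()));
    return buffer;
}
```

If a string is still wanted afterwards, std::string(buffer.begin(), buffer.end()) copies the bytes across without relying on a terminating null.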
-
October 2nd, 2009, 05:49 PM
#5
Re: Reading large text files
 Originally Posted by Payne747
// convert to string then delete buffer
string sysinfo = buffer;
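This keeps copying characters into sysinfo until it reaches a
null character ('\0'). Since read() does not null-terminate the buffer,
the copy can stop early at an embedded null or run past the end of the
buffer. You should use the constructor that takes a length instead:
Code:
fileIn.read(buffer, fileSize);
string sysinfo(buffer, fileSize);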
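You do not show your open statement, but you should open the
file in binary mode on Windows; otherwise newline translation can make
the number of characters read smaller than the size reported by tellg().
It does not hurt to open in binary mode on Linux systems either, so I
would open in binary mode.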
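Putting the points above together, a corrected version of the raw-buffer approach might look something like this. It is only a sketch: the function name readFileToString and the exceptions are assumptions of mine, and the rest follows the code posted earlier in the thread.

```cpp
#include <cstddef>
#include <fstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: reads a whole file into a std::string via a raw buffer.
std::string readFileToString(const std::string& path)
{
    // Binary mode: no newline translation, so the byte count matches tellg().
    std::ifstream fileIn(path.c_str(), std::ios::in | std::ios::binary);
    if (!fileIn)
        throw std::runtime_error("cannot open " + path);

    fileIn.seekg(0, std::ios::end);
    const std::streamoff fileSize = fileIn.tellg();
    if (fileSize < 0)
        throw std::runtime_error("cannot determine size of " + path);
    fileIn.seekg(0, std::ios::beg);

    char* buffer = new char[static_cast<std::size_t>(fileSize)];
    fileIn.read(buffer, fileSize);
    const std::streamsize got = fileIn.gcount();  // bytes actually read

    // Pointer-plus-length constructor: copies exactly 'got' bytes and
    // neither stops at nor goes looking for a '\0'.
    std::string sysinfo(buffer, static_cast<std::size_t>(got));

    delete[] buffer;  // array form, matching new[]
    return sysinfo;
}
```

Note that if the string constructor throws, the buffer leaks, which is one more reason to prefer a vector<char> over new[]/delete[].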
-
October 3rd, 2009, 01:57 AM
#6
Re: Reading large text files
 Originally Posted by Payne747
, or is there a better way?
If processing the file content doesn't require the whole file to be in memory, I think you should read the text file line by line.
It's not any harder, and the maximum file size your program can handle no longer depends on the memory available.
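As a minimal sketch of the line-by-line approach, something like the following holds only one line in memory at a time. The function name countMatchingLines and the idea of counting lines that contain a search string are just an illustration I made up, not the original poster's actual processing:

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical example: counts the lines containing 'needle'.
// Returns 0 if the file cannot be opened, because the loop never runs.
std::size_t countMatchingLines(const std::string& path, const std::string& needle)
{
    std::ifstream fileIn(path.c_str());
    std::string line;
    std::size_t matches = 0;

    // getline() returns the stream, which tests false at end of file or on error.
    while (std::getline(fileIn, line)) {
        if (line.find(needle) != std::string::npos)
            ++matches;
    }
    return matches;
}
```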
-
October 3rd, 2009, 03:14 AM
#7
Re: Reading large text files
Stream the file's rdbuf() into a stringstream.
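A small sketch of what that could look like, assuming the goal is to end up with the whole file in a std::string (the function name slurp is made up for the example):

```cpp
#include <fstream>
#include <sstream>
#include <string>

// Hypothetical helper: copies a file's contents into a string
// by inserting the file's stream buffer into an ostringstream.
std::string slurp(const std::string& path)
{
    std::ifstream fileIn(path.c_str(), std::ios::in | std::ios::binary);
    std::ostringstream contents;
    contents << fileIn.rdbuf();  // copies everything up to end of file
    return contents.str();       // empty if the file was empty or not opened
}
```

This avoids working out the file size by hand, at the cost of an extra copy when str() is called.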
Thanks for your help.
-
October 3rd, 2009, 09:27 AM
#8
Re: Reading large text files
Thanks for the answers. I thought about reading line by line, but I need to scan the file for values many times, and the values may not be in the same logical order every time, so I think it's better to have it all in memory.
File size is obtained with:
long begin,end,fileSize;
begin = fileIn.tellg();
fileIn.seekg(0, ios::end);
end = fileIn.tellg();
fileSize = end-begin;
This seems to work nicely. I've been looking into using vectors, though vector<char> is new to me, so this could take some time :P
Thanks for the help.
-
October 3rd, 2009, 02:43 PM
#9
Re: Reading large text files
I do recommend reading the contents line by line for text-based files, or in blocks if the file is binary. For binary files, you can choose some reasonable block size (such as 1024 bytes or 8K), and then append each block you read to whichever memory buffer you choose. You need to take care when reading the last block, since it may be smaller than the block size you have chosen.
You may also like to retrieve the default allocation unit size of the drive on which the file resides, and read that number of bytes per block.
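Here is a rough sketch of block-wise reading, using gcount() to handle the short final block. The function name readInBlocks and the 8192-byte default are assumptions for the example; querying the drive's allocation unit size is OS-specific and is not shown.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: reads a file in fixed-size blocks, appending each
// block to 'out'. Returns the number of blocks read (the last may be short).
std::size_t readInBlocks(const std::string& path, std::string& out,
                         std::size_t blockSize = 8192)
{
    if (blockSize == 0)
        return 0;  // nothing sensible to do with an empty block

    std::ifstream fileIn(path.c_str(), std::ios::in | std::ios::binary);
    std::vector<char> block(blockSize);
    std::size_t blocks = 0;

    // read() reports failure on a short final block, so also check gcount()
    // to pick up whatever was read before end of file.
    while (fileIn.read(&block[0], static_cast<std::streamsize>(block.size()))
           || fileIn.gcount() > 0) {
        out.append(&block[0], static_cast<std::size_t>(fileIn.gcount()));
        ++blocks;
    }
    return blocks;
}
```

In this sketch every block is appended to a string, so the whole file still ends up in memory; a program that only needs to scan the data could process each block in place instead.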
-
October 5th, 2009, 07:38 AM
#10
Re: Reading large text files
I'd suggest you just glance through this tutorial on text editors...
http://www.catch22.net/tuts/neatpad/1