  1. #1
    Join Date
    Sep 2009
    Posts
    28

    Question Reading large text files

    Hi all,

    Just wondering if anyone has any advice on dealing with very large text files in C++? Basically I'm working on an app that has to scan through and pull out values from a text file which can potentially be over 20MB in size. Is ifstream good enough to load into a buffer, or is there a better way?

    Thanks,

    Payne747

  2. #2
    Lindley
    Join Date
    Oct 2007
    Location
    Seattle, WA
    Posts
    10,895

    Re: Reading large text files

    20MB? An ifstream should be fine.

    Once you get into the 100MB+ range, you might consider memory-mapping the file instead.

  3. #3
    Join Date
    Sep 2009
    Posts
    28

    Re: Reading large text files

    Thanks, I've just been doing some tests and it's working fine so far. I've read the file into a dynamically allocated char buffer, then copied it to a string (easier to manipulate):

    char * buffer;
    buffer = new char[fileSize];
    fileIn.read(buffer, fileSize);
    fileIn.close();

    // convert to string then delete buffer
    string sysinfo = buffer;
    delete buffer;


    Is there anything dangerously wrong with this?

    Thanks,
    Payne747

  4. #4
    Lindley
    Join Date
    Oct 2007
    Location
    Seattle, WA
    Posts
    10,895

    Re: Reading large text files

    Well, you should be using delete[] rather than delete. Personally I'd use a vector<char> for that purpose.

    How are you getting the file size? Seek to the end and tellg()?
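
A minimal sketch of the vector<char> approach, assuming the size comes from seeking to the end and calling tellg() as asked above. The function name `readWholeFile` is made up for illustration. The vector frees its memory automatically, so the delete vs. delete[] question goes away.

```cpp
#include <cstddef>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: reads an entire file into a vector<char>.
std::vector<char> readWholeFile(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) throw std::runtime_error("cannot open " + path);

    // Seek to the end and ask for the position to get the size in bytes.
    in.seekg(0, std::ios::end);
    const std::streamoff size = in.tellg();
    if (size < 0) throw std::runtime_error("cannot determine size of " + path);
    in.seekg(0, std::ios::beg);

    std::vector<char> buffer(static_cast<std::size_t>(size));
    if (size > 0 &&
        !in.read(buffer.data(), static_cast<std::streamsize>(size))) {
        throw std::runtime_error("short read on " + path);
    }
    return buffer;  // no manual delete[] needed
}
```

If a string is still wanted afterwards, `std::string text(buffer.begin(), buffer.end());` copies the exact byte range.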

  5. #5
    Join Date
    Aug 2000
    Location
    West Virginia
    Posts
    7,725

    Re: Reading large text files

    Quote Originally Posted by Payne747
    // convert to string then delete buffer
    string sysinfo = buffer;
    This copies characters into sysinfo until it reaches a null character, so it
    stops early if the data contains one and reads past the end of the buffer if
    it contains none. You should use the constructor that takes a pointer and a
    length instead.

    Code:
    fileIn.read(buffer, fileSize);
    You do not show your open statement, but you should open the
    file in binary mode on Windows, and it will not hurt to open
    in binary mode on Linux systems. So I would open in binary mode.
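
To illustrate the difference between the two string constructors discussed above, here is a small sketch. The function names are made up for this example; only the `std::string` constructors themselves are standard.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: copies exactly `size` bytes, embedded nulls included.
std::string toStringWithLength(const char* buffer, std::size_t size) {
    return std::string(buffer, size);
}

// For comparison: stops at the first null character. If the buffer holds no
// null character at all, this reads past its end (undefined behaviour).
std::string toStringUntilNull(const char* buffer) {
    return std::string(buffer);
}
```

On the binary-mode point: in text mode on Windows each "\r\n" pair is translated to "\n", so read() can deliver fewer characters than the size reported by tellg(). Opening with `std::ios::binary` keeps the two in step.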

  6. #6
    Join Date
    May 2009
    Posts
    2,413

    Re: Reading large text files

    Quote Originally Posted by Payne747
    , or is there a better way?
    If processing the file content doesn't require the whole file to be in memory, I think you should read the text file line by line.

    It's not any harder, and the maximum file size your program can handle no longer depends on the memory available.
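
A rough sketch of the line-by-line approach using std::getline. The "key=value" line format and the function name `findValue` are assumptions made for the example; the thread does not say what the file actually looks like.

```cpp
#include <fstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: returns the text after "key=" on the first line that
// starts with it, or an empty string if no line matches.
std::string findValue(const std::string& path, const std::string& key) {
    std::ifstream in(path);
    if (!in) throw std::runtime_error("cannot open " + path);

    const std::string prefix = key + "=";
    std::string line;
    while (std::getline(in, line)) {  // only one line is held in memory
        if (line.compare(0, prefix.size(), prefix) == 0) {
            return line.substr(prefix.size());
        }
    }
    return std::string();
}
```

Each call rescans the file from the start, so this suits occasional lookups better than many repeated scans.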

  7. #7
    Join Date
    Apr 2007
    Location
    Mars NASA Station
    Posts
    1,436

    Re: Reading large text files

    Stream the file's rdbuf() into a stringstream.
    Thanks for your help.
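
The rdbuf() suggestion above might be written like this. It is a minimal sketch; the function name `slurp` is made up, and opening in binary mode is an assumption carried over from the earlier advice in the thread.

```cpp
#include <fstream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: copies the whole file into a string via its stream buffer.
std::string slurp(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) throw std::runtime_error("cannot open " + path);

    std::ostringstream contents;
    contents << in.rdbuf();  // transfers everything up to end of file
    return contents.str();
}
```

This needs no explicit size calculation and no manual buffer management.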

  8. #8
    Join Date
    Sep 2009
    Posts
    28

    Re: Reading large text files

    Thanks for the answers. I thought about reading line by line, but I need to scan the file for values many times, and the values may not be in the same logical order every time, so I think it's better to have the whole file in memory.

    File size is obtained with:

    long begin,end,fileSize;
    begin = fileIn.tellg();
    fileIn.seekg(0, ios::end);
    end = fileIn.tellg();
    fileSize = end-begin;

    which seems to work nicely. I've been looking into the use of vectors, but vector<char> is new to me, so this could take some time :P

    Thanks for the help.

  9. #9
    Join Date
    Mar 2003
    Location
    India {Mumbai};
    Posts
    3,871

    Re: Reading large text files

    I do recommend reading the contents line by line for text-based files, or in blocks (of a few bytes) if the file is binary. For binary files, you may choose some reasonable block size (like 1024 bytes or 8K), and then append each block you read to whichever memory buffer you choose. You need to perform some calculation when reading the last block, since it may be smaller than the block size you have chosen.

    You may like to retrieve the default allocation unit size of the drive on which the file resides, and read that number of bytes per block.
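
Block-wise reading as described above could be sketched as follows. The function name `readInBlocks` and the 8192-byte default are illustrative choices, not values taken from the thread. The short last block is handled by asking the stream how many bytes it actually delivered (gcount()).

```cpp
#include <cstddef>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: reads a file in fixed-size blocks and appends each
// block to the result.
std::vector<char> readInBlocks(const std::string& path,
                               std::size_t blockSize = 8192) {
    if (blockSize == 0) throw std::invalid_argument("blockSize must be > 0");
    std::ifstream in(path, std::ios::binary);
    if (!in) throw std::runtime_error("cannot open " + path);

    std::vector<char> block(blockSize);
    std::vector<char> result;
    while (in) {
        in.read(block.data(), static_cast<std::streamsize>(block.size()));
        // gcount() reports the bytes actually read, which is less than
        // blockSize for the final block.
        const std::streamsize got = in.gcount();
        result.insert(result.end(), block.begin(), block.begin() + got);
    }
    return result;
}
```

In a real program each block would usually be processed and discarded rather than accumulated, which is what keeps memory use bounded.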

  10. #10
    Join Date
    Apr 2005
    Posts
    107

    Re: Reading large text files

    I'd suggest you just glance through this tutorial on text editors...
    http://www.catch22.net/tuts/neatpad/1
