  1. #1
    Join Date
    Aug 2007
    Posts
    63

    File stream and size

    Hello,
    I use the following function to read a file and put all of its characters into a string:

    Code:
    int read_file_into_string(string dataFile, string &readData)
    {
        ifstream ifile(dataFile.c_str());
        if (!ifile) // check if there is any error
        {
            cout << "Error: impossible to open the file \"" << dataFile << "\"" << endl;
            return -1;
        }
        else
        {
            char ch;
            while (ifile.get(ch))
            {
                readData.append(1, ch); // push_back(ch);
            }
        }
        return 0;
    }
    What I noticed is that the result is correct for about the first 2100 characters, but my file contains many more!
    The string ends at around 2100 characters and the remaining ones are missing. My question is: why?

  2. #2
    Join Date
    Aug 2000
    Location
    West Virginia
    Posts
    7,721

    Re: File stream and size

    You should open the file in binary mode

    Code:
    ifstream ifile(dataFile.c_str() , ios::binary);
    There are more efficient ways of doing that. But first verify
    that the open mode is the problem.
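
    A likely explanation, assuming a Windows C runtime such as MSVC's: in text mode a Ctrl-Z byte (0x1A) in the file is treated as end of file and "\r\n" pairs are translated to "\n", so the character count can differ from the file size. The sketch below is only an illustration; write_bytes and count_chars are hypothetical helper names, not part of the original function. It writes a small file containing a 0x1A byte and counts how many characters get() delivers for a given open mode.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical helper: writes `bytes` to `path` verbatim (binary mode, no translation).
void write_bytes(const std::string& path, const std::string& bytes)
{
    std::ofstream out(path.c_str(), std::ios::binary | std::ios::trunc);
    assert(out && "could not create the demo file");
    out.write(bytes.data(), static_cast<std::streamsize>(bytes.size()));
}

// Hypothetical helper: counts the characters get() delivers when the file
// is opened with `mode` (std::ios::in for text, plus std::ios::binary for binary).
std::size_t count_chars(const std::string& path, std::ios::openmode mode)
{
    std::ifstream in(path.c_str(), mode);
    std::size_t n = 0;
    char ch;
    while (in.get(ch))
        ++n;
    return n;
}
```

    With std::ios::in | std::ios::binary the count always equals the number of bytes written. With plain std::ios::in the count may be smaller on platforms that translate text-mode input, which would match the truncation described in the first post.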

  3. #3
    Join Date
    Aug 2007
    Posts
    63

    Re: File stream and size

    In reply to Philip Nicoletti (post #2):
    You are absolutely right! It works. I guessed there were better ways, since I am a beginner. Which ones?

  4. #4
    Join Date
    Aug 2000
    Location
    West Virginia
    Posts
    7,721

    Re: File stream and size

    The basic problem is with appending a single character to the
    string the way you currently are doing it. For large files, that
    will require the string to do re-allocations and copies, slowing
    the process down.

    One way around this is to get the size of the file and resize the
    string to that size once.

    Code:
    // requires <algorithm>, <fstream>, <iostream>, <iterator> and <string>
    int read_file_into_string(const std::string &dataFile, std::string &readData)
    {
        std::ifstream ifile(dataFile.c_str(), std::ios::binary);

        if (!ifile) // check if there is any error
        {
            std::cout << "Error: impossible to open the file \"" << dataFile << "\"" << std::endl;
            return -1;
        }

        // find the file size by seeking to the end
        ifile.seekg(0, std::ios::end);
        const std::streamoff length = ifile.tellg();
        ifile.seekg(0, std::ios::beg);
        if (length < 0) // tellg() reports failure as -1
        {
            return -1;
        }

        // allocate once instead of growing one character at a time
        readData.resize(static_cast<std::size_t>(length), 0);

        std::copy(std::istreambuf_iterator<char>(ifile),
                  std::istreambuf_iterator<char>(),
                  readData.begin());
        return 0;
    }
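
    A possible variation on the same idea, offered only as a sketch: once the string has been resized, the whole file can be fetched with a single read() call instead of copying through iterators. The name slurp_file is hypothetical, and writing through &out[0] assumes the string's storage is contiguous (guaranteed since C++11, and true in practice for older standard libraries).

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical helper: returns true on success and fills `out` with the file's bytes.
bool slurp_file(const std::string& path, std::string& out)
{
    std::ifstream in(path.c_str(), std::ios::binary);
    if (!in)
        return false;

    // Determine the size by seeking to the end, then rewind.
    in.seekg(0, std::ios::end);
    const std::streamoff length = in.tellg();
    if (length < 0)
        return false;                       // tellg() failed
    in.seekg(0, std::ios::beg);

    // One allocation, then one bulk read into the string's buffer.
    out.resize(static_cast<std::size_t>(length));
    if (length > 0) {
        if (!in.read(&out[0], static_cast<std::streamsize>(length)))
            return false;                   // short read or I/O error
        assert(in.gcount() == static_cast<std::streamsize>(length));
    }
    return true;
}
```

    Returning bool rather than printing inside the helper leaves the error reporting to the caller; that is a design choice of this sketch, not something the original function does.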

  5. #5
    Join Date
    Aug 2007
    Posts
    63

    Re: File stream and size

    In reply to Philip Nicoletti (post #4):
    Wow, I tried that on a 3 MB file and you are absolutely right. So the string behaves like a vector, resizing itself every time I append something. Nice to know. Thank you!

  6. #6
    Join Date
    Aug 2007
    Posts
    63

    Re: File stream and size

    Well,
    now I am really working with vectors and I have a runtime error:

    Code:
    First-chance exception in My_Program.exe: 0xC0000005: Access Violation.
    I guess it's a memory access problem. I want to use some vectors, and since I store elements in them many times, I guess they keep resizing themselves, which costs time and memory. I don't know the final size up front, so the vectors may reallocate whenever I push_back an element. Is there a way to avoid or reduce this problem? I guess resize() or reserve() should help. Which one should I use, and why?

    Thanks for the answers.

  7. #7
    Join Date
    Mar 2009
    Location
    Bulgaria
    Posts
    63

    Re: File stream and size

    If you don't know the maximum expected size of the vector, nothing else can know it for you, so neither reserve() nor resize() will eliminate reallocations completely in this case. When you push_back, the vector may reallocate; that's how it grows, and that's why you're using a vector, right?
    If you don't know the maximum size, the best I can think of is to reserve some "expected" size and reduce reallocations that way. You know best how large "expected" should be: not too large, because that wastes memory and time, and not too small, because you want to reduce reallocations. So it's all up to you.
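
    To get a feel for the effect of an "expected" size, a rough sketch like the following can count how often a vector's capacity changes while elements are appended. The function name count_reallocations and the element counts are made up for illustration, and the exact numbers without reserve() depend on the standard library's growth policy.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: appends `count` ints and reports how often the capacity
// changed, i.e. how many times the vector reallocated its storage.
// Pass reserved == 0 to skip the reserve() call.
std::size_t count_reallocations(std::size_t count, std::size_t reserved)
{
    std::vector<int> v;
    if (reserved > 0)
        v.reserve(reserved);

    std::size_t reallocations = 0;
    std::size_t last_capacity = v.capacity();
    for (std::size_t i = 0; i < count; ++i) {
        v.push_back(static_cast<int>(i));
        if (v.capacity() != last_capacity) {
            ++reallocations;
            last_capacity = v.capacity();
        }
    }
    assert(v.size() == count);
    return reallocations;
}
```

    Calling count_reallocations(1000, 1000) reports no reallocations, because the reserved capacity is never exceeded, while count_reallocations(1000, 0) reports several.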

    About the runtime error: I suppose you use operator[] to access memory that isn't actually allocated for you. For example, you have four elements in the vector and you try to index the ninth. This usually happens when you try to "push back" an element with something like v[8] = 5; without the vector being resized first.

    It will probably be best to first reserve() the "expected" size and then add elements using push_back(). You must not index with operator[] unless you're sure the vector is large enough; if you do index, make sure the index is less than size().
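
    One way to track down this kind of bug, sketched here under the assumption that the crash really comes from an out-of-range index, is to use at() instead of operator[] while debugging. at() checks the index and throws std::out_of_range rather than touching memory outside the vector. The wrapper name checked_store is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: stores `value` at `index` and returns true,
// or returns false if `index` is not a valid position in `v`.
bool checked_store(std::vector<int>& v, std::size_t index, int value)
{
    try {
        v.at(index) = value;        // throws std::out_of_range when index >= v.size()
        assert(index < v.size());   // only reached for a valid index
        return true;
    } catch (const std::out_of_range&) {
        return false;               // the vector is left unchanged
    }
}
```

    For a vector holding four elements, checked_store(v, 8, 5) returns false instead of corrupting memory, while checked_store(v, 3, 5) succeeds.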

    A few words about reserve() and resize(). Calling reserve() only allocates memory to avoid later reallocations. It doesn't change the size of the vector and it doesn't construct any elements. resize(), on the other hand, changes the size of the vector, allocating memory if needed and constructing the new elements (or destroying the surplus ones when shrinking). If you want to set elements by index, as in v[8] = 5;, you should use resize(). If you plan to add elements using push_back() and you want to reduce reallocations, then use reserve().
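
    The difference can be sketched side by side. The two function names below are hypothetical and exist only for this illustration; both produce the same contents, one through push_back() after reserve(), the other through indexing after resize().

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// reserve(): capacity grows but size stays 0, so elements are added with push_back().
std::vector<int> fill_with_reserve(std::size_t n)
{
    std::vector<int> v;
    v.reserve(n);
    assert(v.size() == 0 && v.capacity() >= n);
    for (std::size_t i = 0; i < n; ++i)
        v.push_back(static_cast<int>(i));
    return v;
}

// resize(): size becomes n (elements value-initialized to 0), so v[i] is valid for i < n.
std::vector<int> fill_with_resize(std::size_t n)
{
    std::vector<int> v;
    v.resize(n);
    assert(v.size() == n);
    for (std::size_t i = 0; i < n; ++i)
        v[i] = static_cast<int>(i);
    return v;
}
```

    Writing v[i] in the first function before the matching push_back() would be exactly the out-of-range access described above, because reserve() leaves size() at 0.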
    Last edited by yzaykov; March 24th, 2009 at 11:07 AM.

  8. #8
    Join Date
    Aug 2007
    Posts
    63

    Re: File stream and size

    In reply to yzaykov (post #7):
    OK, in the end I used reserve(). I also realized I don't need to keep all the data, which saved a lot of memory and avoided the runtime error. For now I will use reserve() with some "good" size; once I know the average size of the vectors, and whether they are really needed (that depends on many variables), I will correct the code.
    Anyway, thank you again.
