-
March 11th, 2009, 08:04 AM
#1
File stream and size
Hello,
I use the following function to read a file and put all of its characters into a string:
Code:
int read_file_into_string(string dataFile, string &readData)
{
    ifstream ifile(dataFile.c_str());
    if (!ifile) // check if there is any error
    {
        cout << "Error: impossible to open the file \"" << dataFile << "\"" << endl;
        return -1;
    }
    else
    {
        char ch;
        while (ifile.get(ch))
        {
            readData.append(1, ch); // push_back(ch);
        }
    }
    return 0;
}
What I noticed is that the result is correct for about the first 2100 characters, but my file contains many more.
The string ends at around 2100 characters and I cannot see the remaining ones. My question is: why?
-
March 11th, 2009, 08:09 AM
#2
Re: File stream and size
You should open the file in binary mode:
Code:
ifstream ifile(dataFile.c_str() , ios::binary);
There are more efficient ways of doing that. But first verify
that the open mode is the problem.
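One likely explanation, assuming a Windows toolchain: a text-mode stream there treats the byte 0x1A (Ctrl-Z) as end of file and translates "\r\n" pairs, so a file containing that byte appears to stop early. The sketch below is only an illustration. The helper names `write_bytes` and `count_chars` are hypothetical and not part of the code posted in this thread.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical helper: writes raw bytes to a file without any translation.
bool write_bytes(const std::string& path, const std::string& bytes)
{
    std::ofstream out(path.c_str(), std::ios::binary | std::ios::trunc);
    out.write(bytes.data(), static_cast<std::streamsize>(bytes.size()));
    return out.good();
}

// Hypothetical helper: counts the characters that get() delivers
// when the file is opened with the given mode.
std::size_t count_chars(const std::string& path, std::ios::openmode mode)
{
    std::ifstream in(path.c_str(), mode);
    std::size_t n = 0;
    char ch;
    while (in.get(ch))
        ++n;
    return n;
}
```

Calling `count_chars(path, std::ios::binary)` on a file written by `write_bytes` should report every byte. Calling it with `std::ios::in` alone may report fewer on platforms that translate text-mode input.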
-
March 11th, 2009, 08:36 AM
#3
Re: File stream and size
You are absolutely right! It works. I guessed there were better ways, since I am a beginner. Which ones?
-
March 11th, 2009, 08:52 AM
#4
Re: File stream and size
The basic problem is with appending a single character to the
string the way you currently are doing it. For large files, that
will require the string to do re-allocations and copies, slowing
the process down.
One way around this is to get the size of the file and resize the
string to that size once.
Code:
#include <algorithm>
#include <fstream>
#include <iostream>
#include <iterator>
#include <string>

int read_file_into_string(const std::string& dataFile, std::string& readData)
{
    std::ifstream ifile(dataFile.c_str(), std::ios::binary);
    if (!ifile) // check whether the file could be opened
    {
        std::cout << "Error: impossible to open the file \"" << dataFile << "\"" << std::endl;
        return -1;
    }

    // Find the file size by seeking to the end, then rewind.
    ifile.seekg(0, std::ios::end);
    const std::streamoff length = ifile.tellg();
    if (length < 0) // tellg() reports failure as -1
        return -1;
    ifile.seekg(0, std::ios::beg);

    // Size the string once so that no reallocation happens while copying.
    readData.resize(static_cast<std::string::size_type>(length), 0);
    std::copy(std::istreambuf_iterator<char>(ifile),
              std::istreambuf_iterator<char>(),
              readData.begin());
    return 0;
}
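The same idea (size the string once, then fill it) can also be written with a single block read instead of an iterator copy. This is a minimal sketch, assuming the file is a regular, seekable file that does not change while it is being read. The name `slurp_file` is hypothetical.

```cpp
#include <fstream>
#include <string>

// Hypothetical variant: returns true on success and fills `out`
// with the bytes of the file.
bool slurp_file(const std::string& path, std::string& out)
{
    std::ifstream in(path.c_str(), std::ios::binary);
    if (!in)
        return false;

    in.seekg(0, std::ios::end);
    const std::streamoff length = in.tellg(); // -1 if the position is unavailable
    if (length < 0)
        return false;
    in.seekg(0, std::ios::beg);

    out.resize(static_cast<std::string::size_type>(length));
    if (length == 0)
        return true; // empty file, nothing to read

    // One bulk read into the already-sized buffer.
    in.read(&out[0], static_cast<std::streamsize>(length));
    return in.gcount() == static_cast<std::streamsize>(length);
}
```

Returning `bool` rather than printing inside the function is a design choice of this sketch. It leaves the error message to the caller.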
-
March 11th, 2009, 09:08 AM
#5
Re: File stream and size
Wow, I tried that on a 3 MB file and you are absolutely right. So the string works like a vector, resizing itself every time I append something. Nice to know! Thank you.
-
March 24th, 2009, 10:22 AM
#6
Re: File stream and size
Well,
now I am really working with vectors and I get a runtime error:
Code:
First-chance exception in My_Program.exe: 0xC0000005: Access Violation.
I guess it's a memory access problem. I want to use some vectors, and every time I store something (which happens many times) I guess they resize themselves, consuming time and memory. I don't know the real size from the start, so the vectors resize themselves whenever I push_back an element. Is there a way to avoid or reduce this problem? I guess the resize or the reserve function should work. Which one should I use, and why?
Thanks for the answers.
-
March 24th, 2009, 10:43 AM
#7
Re: File stream and size
As long as you don't know the maximum expected size of the vector, it's very unlikely that the compiler does. Neither reserve(), nor resize() will save you in this very case. So when you push_back, it may reallocate, that's it. And that's why you're using vector, right?
If you don't know the maximum expected size, the best I can think of is to reserve some "expected" size and reduce reallocations that way. You know best how large "expected" should be: it shouldn't be too large, because then you waste memory, and it shouldn't be too small, because you want to reduce reallocations. So it's all up to you.
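To see the effect of reserving an "expected" size, one can count how often the capacity changes while elements are appended. This is a rough sketch. The helper name `count_reallocations` is hypothetical, and the number reported without a reservation depends on the growth policy of the standard library in use.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: appends n elements with push_back and counts how
// often the capacity changes, after reserving room for `reserved` elements.
std::size_t count_reallocations(std::size_t n, std::size_t reserved)
{
    std::vector<int> v;
    v.reserve(reserved); // reserving 0 leaves the vector untouched

    std::size_t reallocations = 0;
    std::size_t last_capacity = v.capacity();
    for (std::size_t i = 0; i < n; ++i)
    {
        v.push_back(static_cast<int>(i));
        if (v.capacity() != last_capacity) // the storage was reallocated
        {
            ++reallocations;
            last_capacity = v.capacity();
        }
    }
    return reallocations;
}
```

With `reserved >= n` the function returns 0, because push_back does not reallocate while the size stays within the capacity.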
About the runtime error: I suppose you use operator[] to access memory that is not actually allocated for you. For example, you have four elements in the vector and you try to index the ninth. This usually happens when you try to "push back" an element this way: v[8] = 5; without having resized the vector properly.
It will probably be best if you first .reserve() the "expected" size and then add elements using push_back(). You must not index using operator[] when you're not sure that the vector is properly resized. If you do want to index, make sure that the index is less than .size().
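While hunting for such an out-of-range access, the checked accessor at() can help: it throws std::out_of_range instead of touching memory that is not yours. This is a small sketch, and the helper name `try_store` is hypothetical.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: stores `value` at `index` only if that element exists.
bool try_store(std::vector<int>& v, std::size_t index, int value)
{
    try
    {
        v.at(index) = value; // at() throws std::out_of_range when index >= size()
        return true;
    }
    catch (const std::out_of_range&)
    {
        return false; // the vector is left unchanged
    }
}
```

For a vector with four elements, `try_store(v, 8, 5)` returns false, whereas `v[8] = 5;` would be undefined behavior.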
A few words about reserve() and resize(): calling reserve() only allocates memory to avoid further reallocations. It doesn't change the size of the vector and it doesn't construct any new elements. resize(), on the other hand, changes the size of the vector, allocates memory if needed, and constructs the new elements. If you plan to resize the vector so that you can then assign by index, like this: v[8] = 5; you should use resize(). If you plan to add elements using push_back() and you want to reduce reallocations, use reserve().
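The two usage patterns can be put side by side as follows. This is only a sketch, and the function names `fill_with_reserve` and `fill_with_resize` are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// reserve() + push_back(): capacity is set up front, size grows per element.
std::vector<int> fill_with_reserve(std::size_t expected)
{
    std::vector<int> v;
    v.reserve(expected); // size() is still 0 here, so v[0] would be invalid
    for (std::size_t i = 0; i < expected; ++i)
        v.push_back(static_cast<int>(i));
    return v;
}

// resize() + operator[]: the elements exist first, then they are assigned.
std::vector<int> fill_with_resize(std::size_t count)
{
    std::vector<int> v;
    v.resize(count); // size() is now count, every element is 0
    for (std::size_t i = 0; i < count; ++i)
        v[i] = static_cast<int>(i);
    return v;
}
```

Both functions produce the same contents. The difference is only in when the elements come into existence.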
Last edited by yzaykov; March 24th, 2009 at 11:07 AM.
-
March 24th, 2009, 01:12 PM
#8
Re: File stream and size
OK, in the end I used reserve, and it also turned out that I don't need to keep all the data. The latter saved a lot of memory and avoided the runtime error. For the moment I will use reserve with some "good" size. Once I am sure of the average dimensions of the vectors, and whether these vectors are really useful (it depends on many variables), I will correct the code.
Anyway, thank you again.