I've spent most of the afternoon puzzling this one out, and now that I've found the solution I thought I'd share it.
The background:
I am reading a data file directly into a set of structures. One of the fields in the structure is of variable length, and to accommodate this, the length of the field is stored in the file along with the rest of the serialised data.
The structures all have overloaded istream >> operators.
The operator that reads the variable length field first reads the length and then loops the required number of times to read in the rest of the data.
To read the data into the vector I call 'assign', passing it a pair of istream_iterators.
This did not work as planned.
What was occurring was that the initialisation of the istream_iterator would read the first structure from the file. (I didn't expect that!)
The 'assign' call would then repeat this action, passing a reference to the same element in the vector as the first time. But now the file pointer had passed the end of the file and failed to read anything.
The length variable was declared as a local variable in the >> operator and so this time was uninitialised, thus causing a problem.
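To make the sequence of reads visible, here is a minimal sketch that counts how often operator >> is called. 'Record', 'g_extractCalls', 'ReadStats' and 'countExtractions' are names I have made up for the illustration, and it reads from a string rather than a file. Note that older C++ standards allowed the first read to be deferred until the first dereference, so the count taken straight after construction may depend on the library; the implementations I know of read in the constructor.

```cpp
#include <cstddef>
#include <istream>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record type, used only for this illustration.
struct Record
{
    int value;
};

// Counts how many times the extraction operator below is invoked.
static int g_extractCalls = 0;

std::istream& operator>>(std::istream& is, Record& record)
{
    ++g_extractCalls;
    return is >> record.value;
}

struct ReadStats
{
    int callsAfterConstruction; // calls made once the iterator exists
    int callsInTotal;           // calls made once assign() has finished
    std::size_t recordsStored;  // elements that ended up in the vector
};

ReadStats countExtractions(const std::string& text)
{
    g_extractCalls = 0;
    std::istringstream input(text);

    std::istream_iterator<Record> first(input); // may already read a Record
    std::istream_iterator<Record> last;

    ReadStats stats;
    stats.callsAfterConstruction = g_extractCalls;

    std::vector<Record> data;
    data.assign(first, last);

    stats.callsInTotal = g_extractCalls;
    stats.recordsStored = data.size();
    return stats;
}
```

Calling countExtractions("1 2 3") stores three records but reports one more call to operator >> than that: the extra call is the one that runs into the end of the input and fails, which is exactly the call that tripped me up.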
The solution required an EOF test after reading the length, to leave the element passed to it undisturbed.
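A sketch of what that check might look like is below. It tests the stream state after reading the length, which covers EOF as well as malformed input. The 'kMaxCoords' constant, the range check on the length and the 'readTests' helper are my own additions for the example, and the input layout (a length followed by x y pairs) is an assumption about the file format.

```cpp
#include <istream>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

struct Coordinate
{
    int x;
    int y;
};

const int kMaxCoords = 2; // capacity of Test::coords (added for the example)

struct Test
{
    Coordinate coords[kMaxCoords];
};

std::istream& operator>>(std::istream& is, Coordinate& coordinate)
{
    return is >> coordinate.x >> coordinate.y;
}

std::istream& operator>>(std::istream& is, Test& test)
{
    int length = 0;
    if (!(is >> length))
    {
        // Nothing could be read (EOF or bad data): leave 'test' untouched.
        return is;
    }
    if (length < 0 || length > kMaxCoords)
    {
        // A length that would overrun coords is treated as a read failure.
        is.setstate(std::ios_base::failbit);
        return is;
    }
    for (int i = 0; i < length; ++i)
    {
        is >> test.coords[i];
    }
    return is;
}

// Hypothetical helper: reads every Test from a string instead of a file.
std::vector<Test> readTests(const std::string& text)
{
    std::istringstream input(text);
    std::istream_iterator<Test> first(input);
    std::istream_iterator<Test> last;

    std::vector<Test> data;
    data.assign(first, last);
    return data;
}
```

With this in place, readTests("2 0 1 2 3") yields a single Test, and the failed read at the end of the input no longer touches the loop at all.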
This was not how I expected istream_iterator to function and maybe it could catch others out, so I thought it might be instructive to mention it here. This was tested on Visual Studio 2005, 2008 and 2010.
This problem only occurs if the >> operator reads the length into a local variable.
To try it out for yourself, I've posted demo code below and attached a test file.
Code:
#include <fstream>
#include <iterator>
#include <vector>
using namespace std;

// Coordinate
struct Coordinate
{
    int x;
    int y;
};

// Test
struct Test
{
    Coordinate coords[2];
};

// How to read Coordinate from a stream.
istream& operator >>(istream& is, Coordinate& coordinate)
{
    is >> coordinate.x >> coordinate.y;
    return is;
}

// How to read Test from a stream.
istream& operator >>(istream& is, Test& test)
{
    int length;
    is >> length;
    // A check for EOF is required here!
    for (int i = 0; i < length; ++i)
    {
        is >> test.coords[i];
    }
    return is;
}

// Main.
int main()
{
    vector<Test> data;
    ifstream file("Test.txt");
    istream_iterator<Test> first(file); // Reads the first 'Test' from the file!
    istream_iterator<Test> last;
    data.assign(first, last);
    return 0;
}
"It doesn't matter how beautiful your theory is, it doesn't matter how smart you are. If it doesn't agree with experiment, it's wrong."
Richard P. Feynman
Hi John, I must admit, I don't think I understand what you are saying. The code you posted (without adding an eof check) behaves how I would expect on VS2008... I end up with a vector that contains a single Test structure, which itself has two coordinates ( [0,1] and [2,3] ).
If I add an eof check where you specify, then the behavior does not change. Have I missed something?
EDIT:
The only thing I would add is that failing to initialise length to zero inside your operator >> could cause some undefined behavior the second time round.
Last edited by PredicateNormative; January 10th, 2012 at 06:55 AM.
Quote: "The only thing I would add is that failing to initialise length to zero inside your operator >> could cause some undefined behavior the second time round."
Yes, that also cures it.
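For anyone following along, a minimal sketch of that variant: the length starts at zero, so when the read at the end of the input fails, the loop body never runs. The 'kMaxCoords' bound in the loop condition and the 'readTests' helper are my own additions for the example, and the input layout (a length followed by x y pairs) is an assumption.

```cpp
#include <istream>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

struct Coordinate
{
    int x;
    int y;
};

const int kMaxCoords = 2; // capacity of Test::coords (added for the example)

struct Test
{
    Coordinate coords[kMaxCoords];
};

std::istream& operator>>(std::istream& is, Coordinate& coordinate)
{
    return is >> coordinate.x >> coordinate.y;
}

std::istream& operator>>(std::istream& is, Test& test)
{
    int length = 0; // still 0 when the read below fails at end of input
    is >> length;
    // The extra bound keeps an unexpected length from overrunning coords.
    for (int i = 0; i < length && i < kMaxCoords; ++i)
    {
        is >> test.coords[i];
    }
    return is;
}

// Hypothetical helper: reads every Test from a string instead of a file.
std::vector<Test> readTests(const std::string& text)
{
    std::istringstream input(text);
    std::istream_iterator<Test> first(input);
    std::istream_iterator<Test> last;

    std::vector<Test> data;
    data.assign(first, last);
    return data;
}
```

Compared with testing the stream explicitly, this relies on the loop simply doing nothing, so it is shorter but says less about why the read stopped.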
The mistake I had made was that I was not expecting the istream_iterator to read the file at all, and so didn't expect that there would be a 'second time around'.
It's probably just my lack of experience with istream_iterator, but I initially found the 'bug' counterintuitive.