  1. #1
    Join Date
    Aug 2009
    Posts
    25

    saving an unknown object of POD in binary format

    I have an unknown object which is guaranteed to contain only PODs as
    its member data.
    Do the template functions below correctly save its data to, and retrieve it from,
    a binary file?
    Code:
    template< class T >
    inline void
    write_binary_data( const T& _t,  std::ofstream& out )
    {
    	out.write( (char*)(&_t), sizeof(T) );
    }
    
    template< class T >
    inline void
    read_binary_data( T& _t,  std::ifstream& in )
    {
    	in.read( (char*)(&_t), sizeof(T) );
    }

  2. #2
    Join Date
    Jun 2009
    Location
    France
    Posts
    2,513

    Re: saving an unknown object of POD in binary format

    That NEVER works correctly, regardless of object type. At least, not in a portable fashion.

    Just serialize your objects.
    Is your question related to IO?
    Read this C++ FAQ article at parashift by Marshall Cline. In particular points 1-6.
    It will explain how to correctly deal with IO, how to validate input, and why you shouldn't count on "while(!in.eof())". And it always makes for excellent reading.

  3. #3
    Join Date
    Jul 2005
    Location
    Netherlands
    Posts
    2,042

    Re: saving an unknown object of POD in binary format

    To be exact, when write and read are called from the same compiled executable/library on the same or a similar machine this should ALWAYS work.
    Things start to go wrong when the functions are compiled with different alignment settings or with a different size of the built-in types. This could already happen within the same program if you use multiple libraries. E.g. if something is written from one dll and then read in another that was compiled with different alignment settings, things may go very wrong.

    I'm not sure how endianness plays a role here (I know it does, but not exactly how). I think your program would have to be recompiled for it to function on a machine with different endianness. Someone please correct me if I'm wrong.
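On the endianness question: a raw write copies the bytes of an integer in whatever order the host stores them, so a file written on a little-endian machine holds the bytes in the opposite order to what a big-endian machine expects. One common way around this is to pick a byte order for the file and build the bytes with shifts. The sketch below assumes a little-endian file format; the names write_u32_le and read_u32_le are made up for illustration and are not from the original post.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Write a 32-bit unsigned value as four bytes, least significant byte first,
// regardless of the byte order of the host machine.
inline void write_u32_le(std::ostream& out, std::uint32_t value)
{
    char bytes[4];
    for (int i = 0; i < 4; ++i)
        bytes[i] = static_cast<char>((value >> (8 * i)) & 0xFFu);
    out.write(bytes, 4);
}

// Read four bytes produced by write_u32_le and rebuild the value.
// Returns false if the stream could not deliver four bytes.
inline bool read_u32_le(std::istream& in, std::uint32_t& value)
{
    char bytes[4];
    if (!in.read(bytes, 4))
        return false;
    value = 0;
    for (int i = 0; i < 4; ++i)
        value |= static_cast<std::uint32_t>(static_cast<unsigned char>(bytes[i])) << (8 * i);
    return true;
}

// Round-trips one value through an in-memory stream and checks the byte order.
inline bool endian_round_trip_demo()
{
    std::stringstream buffer(std::ios::in | std::ios::out | std::ios::binary);
    write_u32_le(buffer, 0x11223344u);
    const std::string raw = buffer.str();
    std::uint32_t back = 0;
    return raw.size() == 4 && raw[0] == 0x44 && raw[3] == 0x11
        && read_u32_le(buffer, back) && back == 0x11223344u;
}
```

Because the bytes are computed from the value rather than copied from memory, the same code produces the same file on either kind of machine, without recompiling with special settings.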

    So whether this works or not depends on your use case. However, don't count on this code for portability.
    Cheers, D Drmmr

  4. #4
    Lindley is offline Elite Member Power Poster
    Join Date
    Oct 2007
    Location
    Seattle, WA
    Posts
    10,895

    Re: saving an unknown object of POD in binary format

    The correct solution is to require Ts to implement some form of serialization/deserialization routines. For human-readable output, often this simply means overloading operator>> and operator<<. Many times this is enough; but if you want to also support a more compact binary format (with all the endianness pitfalls you need to watch out for), then requiring it to have methods read() and write() (or serialize() and deserialize() if you prefer) is also an option.

    You may if you wish specify the requirement for these functions via an interface (abstract base class), but if your goal is only to use the methods within template functions, then this isn't necessary.

    It is also possible to use a library like Boost.Serialization, which handles much of the boilerplate automatically.
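A minimal sketch of the "methods used within template functions" idea, under the assumption that each type offers write() and read() members taking a stream. The Point type and the save/load helper names are hypothetical, chosen only for this example; a text format is used to keep it short.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>

// Hypothetical example type that provides its own write()/read() members.
struct Point
{
    int x;
    int y;

    void write(std::ostream& out) const { out << x << ' ' << y << ' '; }
    void read(std::istream& in) { in >> x >> y; }
};

// Generic helpers: they compile for any T that has matching write()/read()
// members, so no common abstract base class is required.
template <class T>
void save(const T& value, std::ostream& out)
{
    value.write(out);
}

template <class T>
bool load(T& value, std::istream& in)
{
    value.read(in);
    return static_cast<bool>(in);  // false if the input could not be parsed
}

// Saves a Point to an in-memory stream and loads it back.
inline bool member_serialize_demo()
{
    std::stringstream buffer;
    const Point original = {3, -7};
    save(original, buffer);
    Point restored = {0, 0};
    return load(restored, buffer) && restored.x == 3 && restored.y == -7;
}
```

A type without those members simply fails to compile when passed to save() or load(), which is the requirement being expressed without an interface class.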

  5. #5
    Join Date
    Aug 2009
    Posts
    25

    Re: saving an unknown object of POD in binary format

    The binary file is a temporary file that is used only while the program
    is running. It does not even exist after execution ends.
    So I think the problems with portability can be neglected.

    But this raises another question: if we have an object ob, is it guaranteed
    that &ob is the address of its beginning, or is that a compiler implementation issue?

  6. #6
    Lindley is offline Elite Member Power Poster
    Join Date
    Oct 2007
    Location
    Seattle, WA
    Posts
    10,895

    Re: saving an unknown object of POD in binary format

    Quote Originally Posted by ar115 View Post
    The binary file used in program is a temporary file that is used during the
    runtime of program. it does not even exist after the end of execution.
    So I think the problems with portability can be neglected.
    Well then, it should work for any type not containing owned pointers. If there are pointers in the type, obviously saving and restoring their values makes no sense; you want to save/restore what they point to instead. However, this may not be an issue in your case? That's your call.

    But another question raised, if we have an object ob is it guaranteed
    that &ob is the address of beginning of it? or it is a compiler implementation issue?
    The only time I'd doubt that is if multiple inheritance were involved. Otherwise, yes, that should be the case.
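For what it's worth, a small check of the simple case, assuming a C++11 or later compiler. For a standard-layout type the address of the object and the address of its first member coincide, which can be tested directly. The Sample type is hypothetical and exists only for this sketch.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical POD-like type used only for this check.
struct Sample
{
    int first;
    double second;
};

// The equality tested below is only promised for standard-layout types.
static_assert(std::is_standard_layout<Sample>::value,
              "Sample must be standard-layout for this check");

// Returns true when &s and &s.first refer to the same address.
inline bool address_matches_first_member()
{
    Sample s = {1, 2.0};
    const void* object_address = &s;
    const void* first_member_address = &s.first;
    return object_address == first_member_address;
}
```

With multiple inheritance the thing to watch is a conversion to a base-class pointer, which may yield an adjusted address; such types are not POD in any case, so they fall outside the original question.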

  7. #7
    Join Date
    Aug 2005
    Location
    San Diego, CA
    Posts
    1,054

    Re: saving an unknown object of POD in binary format

    My only comment is that if you are dealing with POD types there is no reason to use template functions. The template function doesn't buy you anything in this case and does not provide any additional portability or type safety. Nothing in that template prevents it from being used with non-POD types. If you call the function with a non-POD type, the compiler will happily generate the function for you, and it may or may not work depending on how trivial the object type is.

    I don't see the benefit of the template functions in this case. It is just an unnecessary indirection in my mind, since you can easily use reinterpret_cast to convert a pointer to any POD object into the const char* needed by the stream functions. The C++ standard guarantees that the bytes of a POD object can be copied to and from an array of char.

    Perhaps you could explain to us in your own words what you think the template function would do for you in this case.
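One thing a template can do here, assuming C++11 or later is available, is reject unsuitable types at compile time with static_assert and std::is_trivially_copyable. The sketch below is a variation on the functions from the first post, not the original code: it takes std::ostream and std::istream instead of the file stream types so that it can be tried on an in-memory stream, and the Record type is hypothetical.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <type_traits>

template <class T>
void write_binary_data(const T& value, std::ostream& out)
{
    // Refuse types whose bytes cannot safely be copied as a block.
    static_assert(std::is_trivially_copyable<T>::value,
                  "write_binary_data requires a trivially copyable type");
    out.write(reinterpret_cast<const char*>(&value), sizeof(T));
}

template <class T>
bool read_binary_data(T& value, std::istream& in)
{
    static_assert(std::is_trivially_copyable<T>::value,
                  "read_binary_data requires a trivially copyable type");
    in.read(reinterpret_cast<char*>(&value), sizeof(T));
    return static_cast<bool>(in);  // false on a short or failed read
}

// Hypothetical POD type used for the round trip.
struct Record
{
    int id;
    double weight;
};

inline bool checked_round_trip_demo()
{
    std::stringstream buffer(std::ios::in | std::ios::out | std::ios::binary);
    const Record original = {42, 1.5};
    write_binary_data(original, buffer);
    Record restored = {0, 0.0};
    return read_binary_data(restored, buffer)
        && restored.id == 42 && restored.weight == 1.5;
}

// write_binary_data(std::string("x"), buffer);  // would fail to compile
```

This still says nothing about portability between builds; it only stops the function from being instantiated with types such as std::string.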

  8. #8
    Join Date
    Aug 2005
    Location
    San Diego, CA
    Posts
    1,054

    Re: saving an unknown object of POD in binary format

    Quote Originally Posted by Lindley View Post
    The correct solution is to require Ts to implement some form of serialization/deserialization routines. For human-readable output, often this simply means overloading operator>> and operator<<. Many times this is enough; but if you want to also support a more compact binary format (with all the endianness pitfalls you need to watch out for), then requiring it to have methods read() and write() (or serialize() and deserialize() if you prefer) is also an option.

    You may if you wish specify the requirement for these functions via an interface (abstract base class), but if your goal is only to use the methods within template functions, then this isn't necessary.

    It is also possible to use a library like Boost.Serialization, which handles much of the boilerplate automatically.
    I'm not so sure about that since the OP did say that he was dealing with POD objects. Perhaps we are making the solution more complex than necessary? Why is the original template function needed since we are only dealing with POD objects? Why not just cast the pointer to the object into const char* and use the stream read/write functions directly? It isn't much different than copying the POD object to and from a character array using memcpy is it?

  9. #9
    Lindley is offline Elite Member Power Poster
    Join Date
    Oct 2007
    Location
    Seattle, WA
    Posts
    10,895

    Re: saving an unknown object of POD in binary format

    Quote Originally Posted by kempofighter View Post
    I'm not so sure about that since the OP did say that he was dealing with POD objects. Perhaps we are making the solution more complex than necessary? Why is the original template function needed since we are only dealing with POD objects? Why not just cast the pointer to the object into const char* and use the stream read/write functions directly? It isn't much different than copying the POD object to and from a character array using memcpy is it?
    Assuming no pointer members as I said above, then yes, it's exactly the same, and probably good enough so long as the file only persists for the length of a single program run.

    However, that solution becomes insufficient if you want a file which can persist between program runs, or more generally, between machines/compilers/options which may be used to build the programs.

  10. #10
    Join Date
    Aug 2005
    Location
    San Diego, CA
    Posts
    1,054

    Re: saving an unknown object of POD in binary format

    Quote Originally Posted by Lindley View Post
    Assuming no pointer members as I said above, then yes, it's exactly the same, and probably good enough so long as the file only persists for the length of a single program run.

    However, that solution becomes insufficient if you want a file which can persist between program runs, or more generally, between machines/compilers/options which may be used to build the programs.
    I don't think that I agree with that, even in the case where one program writes to and reads from a file across different program executions. If the type changes and the program is recompiled, then it makes no difference whether you have a serialization routine. If you have changed compiler options such as structure alignment settings, then the existing files can't be successfully read regardless of whether you have a serialization routine. In other words, if the program is not yet stable there needs to be some mechanism for determining whether the input file can even be read. Perhaps one would need to check the file size against the type size within the program or something. How would a serialization function help with that problem? I would think that serialization would only be useful for non-POD user-defined types. In that case you have no choice.
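One possible form of such a mechanism is a small header written in front of the data, holding a marker, a format version and the size of the record type, which the reader checks before trusting the payload. This is only a sketch: FileHeader, kMagic, kVersion and the two helper functions are hypothetical names, and the header itself is written as raw bytes, so it shares the portability limits discussed in this thread.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <istream>
#include <ostream>
#include <sstream>
#include <type_traits>

// Hypothetical header placed in front of the raw record.
struct FileHeader
{
    char magic[4];
    std::uint32_t version;
    std::uint32_t record_size;
};

const char kMagic[4] = {'P', 'O', 'D', '1'};
const std::uint32_t kVersion = 1;

template <class T>
void write_with_header(const T& value, std::ostream& out)
{
    static_assert(std::is_trivially_copyable<T>::value, "T must be trivially copyable");
    FileHeader header;
    std::memcpy(header.magic, kMagic, sizeof(kMagic));
    header.version = kVersion;
    header.record_size = static_cast<std::uint32_t>(sizeof(T));
    out.write(reinterpret_cast<const char*>(&header), sizeof(header));
    out.write(reinterpret_cast<const char*>(&value), sizeof(T));
}

// Returns false if the header is missing or does not match this build's T.
template <class T>
bool read_with_header(T& value, std::istream& in)
{
    static_assert(std::is_trivially_copyable<T>::value, "T must be trivially copyable");
    FileHeader header;
    if (!in.read(reinterpret_cast<char*>(&header), sizeof(header)))
        return false;
    if (std::memcmp(header.magic, kMagic, sizeof(kMagic)) != 0)
        return false;
    if (header.version != kVersion || header.record_size != sizeof(T))
        return false;
    return static_cast<bool>(in.read(reinterpret_cast<char*>(&value), sizeof(T)));
}

struct Small { int a; };
struct Large { int a; int b; double c; };

inline bool header_check_demo()
{
    const Large original = {1, 2, 3.0};

    std::stringstream good(std::ios::in | std::ios::out | std::ios::binary);
    write_with_header(original, good);
    Large restored = {0, 0, 0.0};
    const bool accepted = read_with_header(restored, good) && restored.b == 2;

    // Reading the same data as a type of a different size is refused.
    std::stringstream mismatched(std::ios::in | std::ios::out | std::ios::binary);
    write_with_header(original, mismatched);
    Small wrong = {0};
    const bool rejected = !read_with_header(wrong, mismatched);

    return accepted && rejected;
}
```

A size check catches changes that alter sizeof(T), but not a layout change that keeps the size the same; bumping the version number by hand whenever the type changes covers that case.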

  11. #11
    Lindley is offline Elite Member Power Poster
    Join Date
    Oct 2007
    Location
    Seattle, WA
    Posts
    10,895

    Re: saving an unknown object of POD in binary format

    Quote Originally Posted by kempofighter View Post
    I don't think that I agree with that even in the case where one program writes to and from a file across different program executions. if the type changes and the program is recompiled then it makes no difference whether you have a serialization routine. If you have changed compiler options such as structure alignment settings or something like that then the existing files can't be successfully read regardless of whether you have a serialization routine. In other words if the program is not yet stable there needs to be some mechanism for determining whether the input file can even be read. Perhaps one would need to check the file size against the type size within the program or something. How would a serialization function help with that problem? I would think that a serialization would only be useful for non-POD user defined types. In that case you have no choice.
    Consider the struct:
    Code:
    struct Type
    {
        char c;
        double d;
        char c2;
        int i;
    };
    One compiler builds this to pack all elements tightly, while another packs them on 4-byte boundaries. Clearly, this won't result in a file that can be written by one and successfully read by the other:
    Code:
    Type t;
    fwrite(&t, sizeof(t), 1, out);
    But if you write:
    Code:
    void serialize(const Type &t, FILE *out)
    {
        // Write each member separately so no padding bytes reach the file.
        fwrite(&t.c, sizeof(t.c), 1, out);
        fwrite(&t.d, sizeof(t.d), 1, out);
        fwrite(&t.c2, sizeof(t.c2), 1, out);
        fwrite(&t.i, sizeof(t.i), 1, out);
    }
    then assuming you also have a matching deserialize() routine it will work fine, because you're imposing the tightly packed layout on the data in the file, rather than just assuming it.
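A matching deserialize() could look roughly like the sketch below. It is an assumption about what "matching" means here: the members are read back in the same order and with the same sizes as they were written. Unlike the version above, both functions return bool so that failures can be reported, and the round trip uses std::tmpfile() purely for demonstration.

```cpp
#include <cassert>
#include <cstdio>

struct Type
{
    char c;
    double d;
    char c2;
    int i;
};

// Member-by-member writer; returns false if any write fails.
inline bool serialize(const Type& t, std::FILE* out)
{
    return std::fwrite(&t.c, sizeof(t.c), 1, out) == 1
        && std::fwrite(&t.d, sizeof(t.d), 1, out) == 1
        && std::fwrite(&t.c2, sizeof(t.c2), 1, out) == 1
        && std::fwrite(&t.i, sizeof(t.i), 1, out) == 1;
}

// Reads the members back in exactly the order serialize() wrote them.
inline bool deserialize(Type& t, std::FILE* in)
{
    return std::fread(&t.c, sizeof(t.c), 1, in) == 1
        && std::fread(&t.d, sizeof(t.d), 1, in) == 1
        && std::fread(&t.c2, sizeof(t.c2), 1, in) == 1
        && std::fread(&t.i, sizeof(t.i), 1, in) == 1;
}

// Writes one Type to a temporary file, reads it back and compares.
inline bool packed_round_trip_demo()
{
    std::FILE* file = std::tmpfile();
    if (!file)
        return false;
    const Type original = {'a', 3.25, 'b', 7};
    bool ok = serialize(original, file);
    const long bytes_written = std::ftell(file);
    std::rewind(file);
    Type restored = {0, 0.0, 0, 0};
    ok = ok && deserialize(restored, file);
    std::fclose(file);
    // The file holds only the members, with no padding between them.
    const long expected = static_cast<long>(sizeof(char) + sizeof(double)
                                            + sizeof(char) + sizeof(int));
    return ok && bytes_written == expected
        && restored.c == 'a' && restored.d == 3.25
        && restored.c2 == 'b' && restored.i == 7;
}
```

The file size equals the sum of the member sizes, which is typically smaller than sizeof(Type) because the in-memory padding is never written.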

    Of course, the above won't work if the two compilers assume different sizes for the members; say, one uses 4-byte ints and the other uses 2-byte ints. It also won't properly handle changes in endianness. But it's a start.

    The easiest way to handle those other issues would be to use an ASCII-based format such as
    Code:
    void serialize(const Type &t, FILE *out)
    {
        fprintf(out, "%d %.18g %d %d ", t.c, t.d, t.c2, t.i);
    }
    Here, we're writing c and c2 as ints simply to avoid trouble if they happen to correspond to whitespace characters. Again, of course, a matching deserialize() would be needed. Note I require we write 18 digits for the double; 17 significant digits are already enough to reconstruct the exact bit pattern of an IEEE 754 double, so 18 is safely sufficient.
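A sketch of the text-based pair, assuming the same Type struct and the same format string as above. The names serialize_text and deserialize_text are made up to keep them apart from the binary versions, and the chars are read back through ints to mirror how they were written.

```cpp
#include <cassert>
#include <cstdio>

struct Type
{
    char c;
    double d;
    char c2;
    int i;
};

// Writes all members as whitespace-separated text.
inline bool serialize_text(const Type& t, std::FILE* out)
{
    return std::fprintf(out, "%d %.18g %d %d ", t.c, t.d, t.c2, t.i) > 0;
}

// Parses the four fields back; returns false unless all four were read.
inline bool deserialize_text(Type& t, std::FILE* in)
{
    int c = 0, c2 = 0, i = 0;
    double d = 0.0;
    if (std::fscanf(in, "%d %lf %d %d", &c, &d, &c2, &i) != 4)
        return false;
    t.c = static_cast<char>(c);
    t.d = d;
    t.c2 = static_cast<char>(c2);
    t.i = i;
    return true;
}

// Round-trips a value whose char members are whitespace characters.
inline bool text_round_trip_demo()
{
    std::FILE* file = std::tmpfile();
    if (!file)
        return false;
    const Type original = {' ', 0.1, '\n', -5};
    bool ok = serialize_text(original, file);
    std::rewind(file);
    Type restored = {0, 0.0, 0, 0};
    ok = ok && deserialize_text(restored, file);
    std::fclose(file);
    return ok && restored.c == ' ' && restored.d == 0.1
        && restored.c2 == '\n' && restored.i == -5;
}
```

Since the values travel as decimal text, differences in int size or byte order between the writer and the reader matter only if a value does not fit the reader's type.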
