serialization

**mop65715** · April 6th, 2006, 04:47 PM

I'm trying to work my way through serialization which is quite interesting to say the least. So now consider

Code:

   struct test { int idx; };

Let's assume I did:

Code:

   test tp;
   tp.idx = 45;

I'm using a serializer which essentially turns 4 bytes into 8 bytes (4 bytes for the digit 4 and 4 bytes for the digit 5). To do this the data is serialized via std::string, and a stringstream is used like so:

Code:

   serializer  << tp;  // std::string stuff...
   std::stringstream s_str;
   s_str << tp;
   socket s;
   s.send ( s_str.str().c_str(), s_str.str().size() );  // assume char* and int len for arguments.

So that works. I have two questions, though.
1. How do I know how many bytes to expect on the receiving end?

For instance: Today I do something akin to:

Code:

   char buffer [ 0x1000 ];
   int len =  s.receive ( buffer, 0x1000 );
   if ( len > 0 ) 
   {}

In effect, I allocate a buffer that is large enough. The trouble is, what if I'm receiving partial packets? What if sizeof (test) were some very big number and I received test in 'two transmittals', if you will? You see, with serialization I can't do things like:

Code:

    int len = s.receive ( buffer, sizeof ( test ) );

As a result I'm confused on what to do in that scenario.

2. How do floats and doubles get serialized? For ints I could see:

Code:

  struct test { int jdx; };
   test p; 
   p.jdx = 555; // this equates to 12 bytes
   p.jdx = 44444; // this equates to 24 bytes

etc.

Now given

Code:

 struct testt { float jdx;  }; struct testd { double jdx; };

 testt f;
 f.jdx = 3.14;

This amounts to 9 bytes.

For double

Code:

  testd d;
  d.jdx = 3.1415927; // amounts to 16 bytes.

Puzzling, since I'm not sure how. The values 12, 24, 9 and 16 respectively are the numbers seen during debug.
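For reference, here is a minimal sketch of text serialization for numeric types, assuming a plain stringstream-based approach. The names to_text and from_text are made up for illustration and are not part of the serializer in the post. It shows why the serialized size follows the number of characters in the value rather than sizeof. It does not explain the 12, 24, 9 and 16 seen in the debugger, since those depend on the serializer actually in use. For IEEE floats, 9 significant digits are enough to read the same float back; for doubles, 17.

```cpp
#include <cassert>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: write a value as decimal text.
// 'digits' is the number of significant digits used for floating-point types
// (it has no effect on integers).
template <typename T>
std::string to_text(const T& value, int digits)
{
    assert(digits > 0);
    std::ostringstream os;
    os << std::setprecision(digits) << value;
    return os.str();
}

// Hypothetical helper: parse the text produced by to_text back into a value.
template <typename T>
T from_text(const std::string& text)
{
    std::istringstream is(text);
    T value = T();
    is >> value;
    assert(!is.fail());
    return value;
}
```

With this approach an int holding 555 serializes to three characters and one holding 44444 to five, so the receiver cannot rely on sizeof (test) to know how much to read.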

**DragForce** · April 6th, 2006, 05:50 PM

You are only at the very beginning of your troubles. Have you considered serialisation of pointers to objects, containers, collections of polymorphic objects? What will happen if you modify a class and change the data which is to be serialised?

These are not all the questions you may soon be asking yourself or the forum. If you wish to avoid these kinds of problems you may use a dedicated library, such as boost::serialization.

The simplest way to resolve your problem is to serialise everything you wish into a single string and find its length. Then send an initial packet with information about the size of the following transmission. After that the server will know that a string is coming and it knows about its size, so it can accept that string. Afterwards it can be deserialised into your objects.
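A minimal sketch of that idea, assuming the size is sent as a fixed 4-byte big-endian header in front of the serialized string. The header layout and the names frame_message and read_length are assumptions made for illustration; nothing in the thread prescribes them.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: prepend a 4-byte big-endian length to a serialized payload.
std::string frame_message(const std::string& payload)
{
    assert(payload.size() <= 0xFFFFFFFFul);  // must fit in the 4-byte header
    unsigned long n = static_cast<unsigned long>(payload.size());
    std::string out;
    out += static_cast<char>((n >> 24) & 0xFF);
    out += static_cast<char>((n >> 16) & 0xFF);
    out += static_cast<char>((n >> 8) & 0xFF);
    out += static_cast<char>(n & 0xFF);
    out += payload;
    return out;
}

// Hypothetical helper: read the payload length back from the first four bytes.
unsigned long read_length(const std::string& header)
{
    assert(header.size() >= 4);
    unsigned long n = 0;
    for (int i = 0; i < 4; ++i)
        n = (n << 8) | static_cast<unsigned char>(header[i]);
    return n;
}
```

The sender passes frame_message(s_str.str()) to the socket. The receiver first collects four bytes, calls read_length, and then knows exactly how many more bytes belong to the message. A fixed byte order in the header avoids endian differences between the two machines.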

**mop65715** · April 6th, 2006, 06:16 PM

DragForce
You are only at the very beginning of your troubles. Have you considered serialisation of pointers to objects, containers, collections of polymorphic objects? What will happen if you modify a class and change the data which is to be serialised?

Uhmnn .. Yeah .. serialization of polymorphic containers. Oh my. That just sounds like trouble to me

DragForce
After that the server will know that a string is coming and it knows about its size, so it can accept that string.

Here's how I interpret this. Correct me if I'm wrong. Assume I'm a server. Assume you're a client.
1. You just sent me a composite type called "stuff".
2. Within stuff you have a member - call it size.
3. Upon receipt of the 'initial' packet, I would hope/pray I receive 'size'.
4. So assume size is 4K bytes. Assume I receive the first 1K.
5. I would 'deserialize' the initial 1K. Pull out 'size'. Store the initial data in a 'temporary' buffer.
6. Await the next 3K.
7. Upon receipt of said 3K append it to this 'temporary' buffer.
8. Deserialize the whole thing now.

Voila, I'm done... the only issue here is that size (item 2) needs to be at the front end.

**DragForce** · April 7th, 2006, 01:25 AM

1. The first packet of data sent to the server should contain only auxiliary information, e.g. the number of following packets and their sizes.
2. After receiving this packet the server should send a confirmation to the client that it is ready to accept everything else.
3. As soon as the client knows that the server is ready, it sends its packets.
4. The server accumulates the transmitted data until all packets of the transmission are successfully received. Only after that does it deserialise.

**mop65715** · April 7th, 2006, 09:46 AM

Originally Posted by DragForce

1. The first packet of data sent to the server should contain only auxiliary information, e.g. the number of following packets and their sizes.
2. After receiving this packet the server should send a confirmation to the client that it is ready to accept everything else.
3. As soon as the client knows that the server is ready, it sends its packets.
4. The server accumulates the transmitted data until all packets of the transmission are successfully received. Only after that does it deserialise.

I'm having a little bit of a disconnect. I'm told to de-serialize, and de-serialize only once, in item 4. Upon receipt of the first packet, how do I _know_ about the auxiliary information _without_ doing some sort of de-serializing on the first packet?

**DragForce** · April 7th, 2006, 10:10 AM

You are mixing two different things: serialisation and transmission. In your program you need to clearly distinguish these two abstractions.

Serialisation must not care about packets, sizes, synchronisation, etc. It does only two things: serialises everything you need into a string and restores objects from a string. That is it.

The transmission module must not know about the kind of information it is transmitting. It must have a simple interface: pass a string to the server (get a string from a client). Internally it may implement some kind of protocol, split data into packets, etc.

Both abstractions have to be implemented separately and independently.
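One way to picture the split, as a rough sketch only: the serialization functions know about the type but nothing about sockets, and the transport interface moves opaque strings without knowing what is inside them. The names TestMsg (standing in for struct test from the first post), Transport and LoopbackTransport are assumptions made for illustration.

```cpp
#include <cassert>
#include <sstream>
#include <string>

struct TestMsg { int idx; };  // stand-in for 'struct test' from the first post

// Serialization layer: object <-> string, nothing about sockets or packets.
std::string serialize(const TestMsg& t)
{
    std::ostringstream os;
    os << t.idx;
    return os.str();
}

TestMsg deserialize(const std::string& s)
{
    std::istringstream is(s);
    TestMsg t;
    t.idx = 0;
    is >> t.idx;
    assert(!is.fail());
    return t;
}

// Transmission layer: moves opaque strings, knows nothing about TestMsg.
class Transport
{
public:
    virtual ~Transport() {}
    virtual void send_string(const std::string& s) = 0;
    virtual std::string receive_string() = 0;
};

// In-memory stand-in for a socket-based transport, for illustration only.
// A real implementation would hide the framing and the receive loop in here.
class LoopbackTransport : public Transport
{
public:
    virtual void send_string(const std::string& s) { pending_ = s; }
    virtual std::string receive_string() { return pending_; }
private:
    std::string pending_;
};
```

Because the two layers only meet at std::string, either one can be replaced (a different wire protocol, a different serializer) without touching the other.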

**NMTop40** · April 7th, 2006, 10:36 AM

you have 2 choices:

- read a header and then know in advance how many bytes to read.
- read until you get a "terminator".

In the case of a nul-terminated string, if you send the nul character as well you will know where the terminator is.

In the case of XML, you will know by parsing where the close XML tag is (assuming you are guaranteed to get valid XML).

Note that with TCP/IP there is no guarantee that you will get a whole message at a time. Order is guaranteed though.
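Since a single receive call may return only part of a message, the usual pattern with a known length is to loop until the expected number of bytes has arrived. Below is a hedged sketch of such a loop. recv_fn stands in for a call such as ::recv or the CommunicatingSocket::recv wrapper, and ChunkedSource is a made-up test double that imitates a socket delivering data a few bytes at a time.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <string>

// Keep calling recv_fn until exactly 'wanted' bytes are in 'buffer'.
// recv_fn(char*, int) is assumed to behave like ::recv: it returns the number
// of bytes written, 0 when the peer closed the connection, negative on error.
template <typename RecvFn>
void read_exact(RecvFn& recv_fn, char* buffer, std::size_t wanted)
{
    assert(buffer != 0 || wanted == 0);
    std::size_t got = 0;
    while (got < wanted)
    {
        int n = recv_fn(buffer + got, static_cast<int>(wanted - got));
        if (n < 0)
            throw std::runtime_error("receive failed");
        if (n == 0)
            throw std::runtime_error("connection closed before message was complete");
        got += static_cast<std::size_t>(n);
    }
}

// Test double: hands out at most 'chunk' bytes per call, like a socket
// delivering partial data. Returns 0 once the data is exhausted.
class ChunkedSource
{
public:
    ChunkedSource(const std::string& data, std::size_t chunk)
        : data_(data), pos_(0), chunk_(chunk) {}

    int operator()(char* out, int len)
    {
        assert(len >= 0);
        std::size_t n = std::min(chunk_, data_.size() - pos_);
        n = std::min(n, static_cast<std::size_t>(len));
        std::memcpy(out, data_.data() + pos_, n);
        pos_ += n;
        return static_cast<int>(n);
    }
private:
    std::string data_;
    std::size_t pos_;
    std::size_t chunk_;
};
```

With a header-based protocol the receiver would call read_exact once for the fixed-size header and once more for the number of bytes the header announces.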

**mop65715** · April 7th, 2006, 12:14 PM

Originally Posted by NMTop40

- read a header and then know in advance how many bytes to read.
- read until you get a "terminator".

That's what I thought.

Originally Posted by NMTop40

Note that with TCP/IP there is no guarantee that you will get a whole message at a time.

That's what I was told. Having said that, the code below is a recipe for disaster.

Code:

int CommunicatingSocket::recv(void *buffer, int bufferLen)
   throw(SocketException) {
 int rtn;
 if ((rtn = ::recv(sockDesc, (raw_type *) buffer, bufferLen, 0)) < 0) {
   throw SocketException("Received failed (recv())", true);
 }
 return rtn;
}

Meaning, I should have a while loop:

Code:

    while ( rtn < bufferLen )   // fewer bytes than expected so far
    {
        // call ::recv again and add what it returns to rtn
    }

So that's when I realized I have a problem, which brought about this question. It's silly because I'm making assumptions about bufferLen, which is not guaranteed to be the same using serialization or otherwise. So I'm experimenting with a 'terminator' with the intent to leave the 'recv' function as is:

Code:

      unsigned int const BUFFER_LEN ( 0x10000 );
      char buffer [ BUFFER_LEN ];
      int  const how_much = sock->recv ( buffer,  BUFFER_LEN ) ;
      if ( how_much > 0 ) 
      {
          // now check to see if I received everything... via terminator or size
          // If I didn't,
          // look at the difference between how much
          // and the anticipated amount
          // queue up the old stuff .. loop and wait for the rest..
      }

Makes sense.
Thanks
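The "queue up the old stuff and wait for the rest" step could look roughly like the sketch below, assuming a single-character terminator that never occurs inside a serialized message (true for the digits-only text discussed so far, not true in general). MessageAssembler and its member names are invented for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Accumulates raw bytes from successive receive calls and hands back
// complete messages, each ending in 'terminator'.
class MessageAssembler
{
public:
    explicit MessageAssembler(char terminator = '\0') : terminator_(terminator) {}

    // Append whatever one receive call returned (possibly a partial message,
    // possibly the end of one message plus the start of the next).
    void feed(const char* data, std::size_t len)
    {
        assert(data != 0 || len == 0);
        if (len != 0)
            pending_.append(data, len);
    }

    // If a complete message is queued, store it in 'out' (without the
    // terminator), remove it from the queue, and return true.
    bool next(std::string& out)
    {
        std::string::size_type end = pending_.find(terminator_);
        if (end == std::string::npos)
            return false;  // still waiting for the rest
        out.assign(pending_, 0, end);
        pending_.erase(0, end + 1);
        return true;
    }
private:
    char terminator_;
    std::string pending_;
};
```

The receive loop then becomes: call recv once, feed the bytes in, and drain next() until it returns false. The recv function itself stays as it is.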

**Tradone** · April 7th, 2006, 04:32 PM

the boost library has an excellent serialization library.

**mop65715** · April 7th, 2006, 06:03 PM

Originally Posted by Tradone

the boost library has an excellent serialization library.

The one uncertainty I have about boost is the ability to pull out that piece and that piece alone. Do you know if that's possible to pull out the serialization piece and the serialization piece only?

**DragForce** · April 8th, 2006, 09:20 AM

Originally Posted by mop65715

The one uncertainty I have about boost is the ability to pull out that piece and that piece alone. Do you know if that's possible to pull out the serialization piece and the serialization piece only?

This is a rather strange request. Why do you need to "pull out" anything?

If you are to use boost::serialization, then just compile release and debug versions of this library (you will get two lib files). Then refer to them in the project settings and include the required .h files in the source code as described in the documentation. The linker will automatically add the required functions from the serialization lib file into your executable ("pull out that piece and that piece alone").

**mop65715** · April 8th, 2006, 11:29 AM

Originally Posted by DragForce

This is a rather strange request. Why do you need to "pull out" anything?

If you are to use boost::serialization, then just compile release and debug versions of this library (you will get two lib files). Then refer to them in the project settings and include the required .h files in the source code as described in the documentation. The linker will automatically add the required functions from the serialization lib file into your executable ("pull out that piece and that piece alone").

I know. That thing is huge. Serialization alone - if memory serves - is like 7 MiB. I was trying to determine if there's a way to customize it. I realize I'm asking for too much. This is not like buying a new car or ... where I could customize it.

**Paul McKenzie** · April 8th, 2006, 03:10 PM

Originally Posted by mop65715

I know. That thing is huge. Serialization alone - if memory serves - is like 7 MiB. I was trying to determine if there's a way to customize it.

How can you customize something when you don't even know if it will fit your needs, or how it works? The last thing I would be thinking of is how to customize it. I would get to work using it first.

Also, that is the size of the library, that is not the size of the final executable.

Regards,

Paul McKenzie

**mop65715** · April 8th, 2006, 06:57 PM

Originally Posted by Paul McKenzie

How can you customize something when you don't even know if it will fit your needs, or how it works? The last thing I would be thinking of is how to customize it. I would get to work using it first.

Also, that is the size of the library, that is not the size of the final executable.

Howdy Paul.
I'm not sure where you got that impression but I know how _it_ works and I know it will fit my needs ( have used it before ). At issue though is the fact that I'm required to use gcc version 3 at a minimum - for boost stuff. For this application I'm dealing with, I'm (as much as it pains me) _stuck_ with gcc 2.96 so I ended up rolling my own serialization. Hate to reinvent the wheel but I didn't feel like I wanted to fuss with endian.

Yes, I also know that's not the size of the final executable. I was just hoping there was a way to shoehorn it into gcc 2.96, but the simple solution is to upgrade the _old_and_antiquated_ compiler. Trouble is, the vendor IDE ships with this _old_and_antiquated_ compiler. As a result, if I were to upgrade to their latest IDE (which uses gcc 3.3.2 - not the best but better) it'll cost money. The advisor - I don't think - will be too pleased with that.

**DragForce** · April 9th, 2006, 03:31 AM

See other Implementations
