structure alignment on x86_64

Thread: structure alignment on x86_64

  1. #1
    Join Date
    Jun 2009
    Posts
    118

    Question structure alignment on x86_64

    Hi All,

    I want to define a C-struct as shown below, which will be written to a message queue:

    Code:
    struct msg
    {
        struct in_addr addr1, addr2;  // 4-byte
        u_int32_t s;             //4-byte
        u_int32_t a;             //4-byte
        u_char f;              //1-byte
        u_short sp;             //2-byte
        u_short dp;             //2-byte
        u_short w;               //2-byte
        u_char myArray[1500]; 
    };
    Based on what I read about structure alignment, such a definition may lead to "wrong" reads of messages on the other side of the message queue, due to the structure alignment carried out by the compiler (I use gcc, by the way).

    Following the guidelines below:

    Code:
        *  Single byte numbers can be aligned at any address
        * Two byte numbers should be aligned to a two byte boundary
        * Four byte numbers should be aligned to a four byte boundary
        * Structures between 1 and 4 bytes of data should be padded so that the total structure is 4 bytes.
        * Structures between 5 and 8 bytes of data should be padded so that the total structure is 8 bytes.
        * Structures between 9 and 16 bytes of data should be padded so that the total structure is 16 bytes.
        * Structures greater than 16 bytes should be padded to 16 byte boundary.
    I came up with the following alternative definition:

    Code:
    struct msg
    {
        struct in_addr addr1, addr2;  // 4-byte
        u_int32_t s;             //4-byte
        u_int32_t a;             //4-byte
        u_char f;              //1-byte
        u_char pad1;      //1-byte padding
        u_short sp;             //2-byte
        u_short dp;             //2-byte
        u_short w;               //2-byte
        u_char pad2[8];     //8-byte padding
        u_char myArray[1520]; //additional 20 byte for 16-byte alignment 
    };
    On x86_64, is the structure definition above correct ?

    Am I doing things right ?

    Thanks.

  2. #2
    Join Date
    Jun 2009
    Location
    France
    Posts
    2,292

    Re: structure alignment on x86_64

    Well, for starters, you need to guarantee that things like u_short are indeed 2 bytes long. I don't know what your u_short is, but if it is just an alias for the type unsigned short, it could be 4 bytes long on some platforms.

    An option would be to use the low level C bit fields:

    Code:
    struct msg
    {
         //struct in_addr addr1, addr2;  // 4-byte
        u_int32_t s : 32;             //4-byte
        u_int32_t a : 32;             //4-byte
        u_char f : 8;              //1-byte
        u_int32_t : 8;      //1-byte padding //make anonymous, no name
        //u_char : 0; //This is equivalent, and tells the compiler to pad to the next alignment. This might be better, as your compiler knows if 4 byte or 8 byte is better. However, you might lose the portability you are looking for.
        u_short sp : 16;             //2-byte
        u_short dp : 16;             //2-byte
        u_short w : 16;               //2-byte
        u_int64_t : 64;     //8-byte padding //again, anonymous. A 64-bit type is needed here: a bit-field cannot be wider than its declared type.
        u_char myArray[1500]; //Cross your fingers this is aligned
        u_int32_t : 0; //Final padding. This is always implicit, but might as well write it, for the future.
    };
    The above method should GUARANTEE that things are where they belong, as long as the "receiving" architecture uses 8-bit bytes (almost guaranteed, unless you are on an embedded platform). If it doesn't, you are screwed anyway.

    I'm not sure if you can use this on the in_addr struct. Probably can, but I'm not sure, and I would not know their size, so I will let you work things out.

    If you need more information on this, look up "C bit field".
    Another thing you want to look up is the more general "C++ binary serialization".

    That is all I can help you with, I'm a bit knowledgeable on this stuff, but not that knowledgeable.

  3. #3
    Join Date
    Jan 2009
    Posts
    1,689

    Re: structure alignment on x86_64

    Forgive me if I'm wrong, but won't:

    Code:
    extern "C"{
    struct msg
    {
        struct in_addr addr1, addr2;  // 4-byte
        u_int32_t s;             //4-byte
        u_int32_t a;             //4-byte
        u_char f;              //1-byte
        u_char pad1;      //1-byte padding
        u_short sp;             //2-byte
        u_short dp;             //2-byte
        u_short w;               //2-byte
        u_char pad2[8];     //8-byte padding
        u_char myArray[1520]; //additional 20 byte for 16-byte alignment 
    };
    }
    force it not to do alignment optimization? (Assuming the primitives are the sizes that you say they are.)

  4. #4
    Join Date
    Jun 2009
    Posts
    118

    Re: structure alignment on x86_64

    Quote Originally Posted by monarch_dodra
    Well, for starters, you need to guarantee that things like u_short are indeed 2 bytes long. I don't know what your u_short is, but if it is just an alias for the type unsigned short, it could be 4 bytes long on some platforms.

    All the sizes for the data types that I listed in my posting have been verified to be correct by using sizeof() statements.

  5. #5
    Join Date
    Jun 2009
    Posts
    118

    Re: structure alignment on x86_64

    Let me clarify the reason why I posted this thread.

    Assume we have the following 23 bytes of "u_char" binary data (call it myData), shown here in hex:

    Code:
    55 11 93 2b 55 6a 56 5c 60 2d 98 ce 6c 8c 10 4f 10 00 50 11 f7 1a 38
    After casting myData into following C-structure:

    Code:
    struct msg
    {
        struct in_addr ipSrc, ipDst;
        u_int32_t thSeq;
        u_int32_t thAck;
        u_char thFlags;
    
        u_short thSport;
        u_short thDport;
    
        u_short thWin;
    
        u_char data[1500];
    };
    
    struct msg* myMsg = (struct msg*)myData;
    I CANNOT read the values of thSport, thDport and thWin back from myMsg. myMsg->thSport, myMsg->thDport and myMsg->thWin simply show "meaningless" values instead of 00 50, 11 f7 and 1a 38 respectively.

    However, I can read ipSrc (55 11 93 2b), ipDst (55 6a 56 5c), thSeq (60 2d 98 ce), thAck (6c 8c 10 4f) and thFlags (10) successfully from myMsg.

    To me, this seems to be a structure alignment problem.

    Any ideas ?

  6. #6
    Join Date
    Jul 2002
    Location
    Portsmouth. United Kingdom
    Posts
    2,722

    Re: structure alignment on x86_64

    Casting pointers to data into structures is always fraught with peril, being so dependent on compiler settings, version, vendor and endianness. The only really safe way is to write serialisation functions that convert to and from a known data format.
    "It doesn't matter how beautiful your theory is, it doesn't matter how smart you are. If it doesn't agree with experiment, it's wrong."
    Richard P. Feynman

  7. #7
    Join Date
    Jun 2009
    Location
    France
    Posts
    2,292

    Re: structure alignment on x86_64

    Quote Originally Posted by aryan1
    To me, this seems to be a structure alignment problem.
    The problem is that you made an assumption on how your compiler aligned your data, and you were wrong.

    There is 1 byte of padding after thFlags and before thSport. You made the mistake of supposing it was after thWin.

    If you went on to read your array, you would have had more garbage in your data struct too (actually, an offset, but wrong is wrong).

    You can either try to "outguess" your compiler, or take my advice and be explicit about your struct.

    Code:
    #include <iostream>
    
    struct in_addr
    {
        int addr;
    };
    
    struct msg
    {
        in_addr ipSrc;
        unsigned int : 0; //next align
    
        in_addr ipDst;    //8
        unsigned int : 0; //next align
    
        unsigned int thSeq : 32;    //4
        unsigned int thAck : 32;    //4
        unsigned char thFlags : 8; //1
        unsigned int : 8; //1
    
        unsigned short thSport : 16;  //2
        unsigned short  thDport : 16; //2
    
        unsigned short  thWin : 16;   //2
    
        unsigned int : 0; //next align
    
        unsigned char data[1500]; //1500
    };
    
    int main()
    {
        unsigned char values[1524] = {
            55, 0, 0, 0,  //ipSrc, next value is aligned
            56, 0, 0, 0,  //ipDst, next value is aligned
            57, 0, 0, 0,  //thSeq
            58, 0, 0, 0,  //thAck
            'a',             //thFlags
                0,          //This is what you missed. I know it is there, and further more, I forced my compiler to put it there.
                   05, 0, //thSport
            05, 0,        //thDport
                   05, 0, //thWin, next value is aligned
            'a',  'b',  'c', 'd'  //data
        };
    
        msg& aMsg = reinterpret_cast<msg&>(values);
    
        std::cout << aMsg.ipSrc.addr << std::endl;
        std::cout << aMsg.ipDst.addr << std::endl;
        std::cout << aMsg.thSeq << std::endl;
        std::cout << aMsg.thAck << std::endl;
        std::cout << aMsg.thFlags << std::endl;
        std::cout << aMsg.thSport << std::endl;
        std::cout << aMsg.thDport << std::endl;
        std::cout << aMsg.thWin << std::endl;
        std::cout << aMsg.data[0] << std::endl;
        std::cout << aMsg.data[1] << std::endl;
        std::cout << aMsg.data[2] << std::endl;
        std::cout << aMsg.data[3] << std::endl;
    }
    In the above example, you will notice that if you remove all the bitfields, it works too. But if you put them in, it is guaranteed to work.

    Don't try to guess. Guarantee it.

    EDIT: On a side note, you should give up on data alignment and take Paul's (and my) advice of explicitly serializing the data into a known format.
    Last edited by monarch_dodra; April 7th, 2010 at 09:58 AM.

  8. #8
    Join Date
    Apr 1999
    Posts
    27,427

    Re: structure alignment on x86_64

    Quote Originally Posted by monarch_dodra
    EDIT: On a side note, you should give up on data alignment and take Paul's (and my) advice of explicitly serializing the data into a known format.
    Nope, that wasn't me this time.

    But I agree that it would be better if the data were serialized, instead of taking the cheap, error-prone approach of writing from or reading into a struct, where you are at the mercy of the compiler.

    Regards,

    Paul McKenzie

  9. #9
    Join Date
    Apr 1999
    Posts
    27,427

    Re: structure alignment on x86_64

    Quote Originally Posted by aryan1
    On x86_64, is the structure definition above correct ?

    The struct is for the purposes of grouping the data together in a logical fashion -- it shouldn't be used as an argument to fwrite(), fread() or whatever function you used to write/read the data. Instead proper serialization should be used.

    By proper serialization, you take each member of your struct, and write it to the file in a consistent manner. And on the read, you create an empty structure, and fill it in, one member at a time, with the contents of the file.

    Doing a "bulk" write or read of the raw struct is full of danger, even though you see a lot of tutorials showing this being done -- in real practice, don't do it unless you can guarantee you will be using the same compiler, same version of the compiler, same OS, same padding, same compiler options, for the lifetime of your application.

    Regards,

    Paul McKenzie

  10. #10
    Join Date
    Jun 2009
    Location
    France
    Posts
    2,292

    Re: structure alignment on x86_64

    Quote Originally Posted by Paul McKenzie
    Nope, that wasn't me this time.
    Terribly sorry. I must have been reading another thread you had replied to. I'll be more careful next time. It's no big deal here, but kind of inconsiderate. I'll avoid putting words in your mouth again.


  11. #11
    Join Date
    Apr 1999
    Posts
    27,427

    Re: structure alignment on x86_64

    Quote Originally Posted by monarch_dodra
    Terribly sorry. I must have been reading another thread you had replied to. I'll be more careful next time. It's no big deal here, but kind of inconsiderate. I'll avoid putting words in your mouth again.

    No problem. No need to apologize.

    Regards,

    Paul McKenzie

  12. #12
    Join Date
    Jul 2002
    Location
    Portsmouth. United Kingdom
    Posts
    2,722

    Re: structure alignment on x86_64

    Quote Originally Posted by monarch_dodra
    Terribly sorry. I must have been reading another thread you had replied to.
    Or my reply to this thread
    http://www.codeguru.com/forum/showpo...20&postcount=6
    "It doesn't matter how beautiful your theory is, it doesn't matter how smart you are. If it doesn't agree with experiment, it's wrong."
    Richard P. Feynman
