-
April 7th, 2010, 03:51 AM
#1
structure alignment on x86_64
Hi All,
I want to define a C-struct as shown below, which will be written to a message queue:
Code:
struct msg
{
struct in_addr addr1, addr2; // 4-byte
u_int32_t s; //4-byte
u_int32_t a; //4-byte
u_char f; //1-byte
u_short sp; //2-byte
u_short dp; //2-byte
u_short w; //2-byte
u_char myArray[1500];
};
Based on what I have read about structure alignment, such a definition may lead to "wrong" reads of messages on the other side of the message queue, because of the padding the compiler inserts to align members (I use gcc, by the way).
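One way to see what the compiler actually does with this layout is to ask it. A minimal sketch, assuming `uint32_t`, `uint16_t` and `uint8_t` from `<stdint.h>` as stand-ins for `struct in_addr`/`u_int32_t`, `u_short` and `u_char`; the names `msg_probe` and `print_msg_layout` are made up for illustration:

```c
#include <stddef.h>  /* offsetof */
#include <stdint.h>  /* fixed-width integer types */
#include <stdio.h>

/* Hypothetical stand-in for the struct above, using fixed-width types. */
struct msg_probe {
    uint32_t addr1, addr2;   /* stand-ins for struct in_addr (assumed 4 bytes each) */
    uint32_t s;
    uint32_t a;
    uint8_t  f;
    uint16_t sp;
    uint16_t dp;
    uint16_t w;
    uint8_t  myArray[1500];
};

/* Print where the compiler really placed each member. */
void print_msg_layout(void)
{
    printf("f       at %zu\n", offsetof(struct msg_probe, f));
    printf("sp      at %zu\n", offsetof(struct msg_probe, sp));
    printf("dp      at %zu\n", offsetof(struct msg_probe, dp));
    printf("w       at %zu\n", offsetof(struct msg_probe, w));
    printf("myArray at %zu\n", offsetof(struct msg_probe, myArray));
    printf("sizeof     %zu\n", sizeof(struct msg_probe));
}
```

On x86_64 with gcc I would expect `sp` to land at offset 18 rather than 17, because one padding byte is inserted after `f` so that the 2-byte member starts on a 2-byte boundary.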
Following the guidelines below:
Code:
* Single byte numbers can be aligned at any address
* Two byte numbers should be aligned to a two byte boundary
* Four byte numbers should be aligned to a four byte boundary
* Structures between 1 and 4 bytes of data should be padded so that the total structure is 4 bytes.
* Structures between 5 and 8 bytes of data should be padded so that the total structure is 8 bytes.
* Structures between 9 and 16 bytes of data should be padded so that the total structure is 16 bytes.
* Structures greater than 16 bytes should be padded to 16 byte boundary.
I came up with the following alternative definition:
Code:
struct msg
{
struct in_addr addr1, addr2; // 4-byte
u_int32_t s; //4-byte
u_int32_t a; //4-byte
u_char f; //1-byte
u_char pad1; //1-byte padding
u_short sp; //2-byte
u_short dp; //2-byte
u_short w; //2-byte
u_char pad2[8]; //8-byte padding
u_char myArray[1520]; //additional 20 byte for 16-byte alignment
};
On x86_64, is the structure definition above correct ?
Am I doing things right ?
Thanks.
-
April 7th, 2010, 07:04 AM
#2
Re: structure alignment on x86_64
Well, for starters, you need to guarantee that types like u_short are indeed 2 bytes long. I don't know what your u_short is, but if it is just an alias for the type unsigned short, it could be 4 bytes long on some platforms.
An option would be to use the low level C bit fields:
Code:
struct msg
{
//struct in_addr addr1, addr2; // 4-byte
u_int32_t s : 32; //4-byte
u_int32_t a : 32; //4-byte
u_char f : 8; //1-byte
u_int32_t : 8; //1-byte padding //make anonymous, no name
//u_char : 0; //This is equivalent, and tells the compiler to pad to the next alignment. This might be better, as your compiler knows if 4 byte or 8 byte is better. However, you might lose the portability you are looking for.
u_short sp : 16; //2-byte
u_short dp : 16; //2-byte
u_short w : 16; //2-byte
u_int64_t : 64; //8-byte padding //again, anonymous. A 32-bit type cannot hold a 64-bit field, so a 64-bit type is needed here.
u_char myArray[1500]; //Cross your fingers this is aligned
u_int32_t : 0; //Final padding. This is always implicit, but might as well write it, for the future.
};
The above method should GUARANTEE that things are where they belong, as long as the "receiving" architecture uses 8-bit bytes (octets), which is almost guaranteed unless you are on an embedded platform. If it doesn't, you are screwed anyway.
I'm not sure if you can use this on the in_addr struct. Probably can, but I'm not sure, and I would not know their size, so I will let you work things out.
If you need more information on this, lookup "C bit field".
Another thing you want to look up, is the more general "C++ binary serialization".
That is all I can help you with, I'm a bit knowledgeable on this stuff, but not that knowledgeable.
-
April 7th, 2010, 07:27 AM
#3
Re: structure alignment on x86_64
Forgive me if I'm wrong, but won't:
Code:
extern "C"{
struct msg
{
struct in_addr addr1, addr2; // 4-byte
u_int32_t s; //4-byte
u_int32_t a; //4-byte
u_char f; //1-byte
u_char pad1; //1-byte padding
u_short sp; //2-byte
u_short dp; //2-byte
u_short w; //2-byte
u_char pad2[8]; //8-byte padding
u_char myArray[1520]; //additional 20 byte for 16-byte alignment
};
}
Force it to not do alignment optimization? (Assuming the primitives are the size that you say they are)
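As far as I know, `extern "C"` only changes linkage (name mangling) and leaves the struct layout alone. With gcc, the usual way to suppress padding is the `packed` attribute. A minimal sketch, assuming gcc or clang and fixed-width stand-ins for the original types; the names `msg_packed` and `msg_packed_sp_offset` are made up:

```c
#include <stddef.h>  /* offsetof, size_t */
#include <stdint.h>  /* fixed-width integer types */

/* gcc/clang extension: no padding is inserted between or after members. */
struct __attribute__((packed)) msg_packed {
    uint32_t addr1, addr2;   /* stand-ins for struct in_addr (assumed 4 bytes each) */
    uint32_t s;
    uint32_t a;
    uint8_t  f;
    uint16_t sp;             /* directly after f, with no padding byte */
    uint16_t dp;
    uint16_t w;
    uint8_t  myArray[1500];
};

/* Helper returning the offset of sp, so the layout can be checked at run time. */
size_t msg_packed_sp_offset(void)
{
    return offsetof(struct msg_packed, sp);
}
```

Packed members may be misaligned, so handing out pointers to them is risky on some CPUs, and the attribute does nothing about byte order or about other compilers.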
-
April 7th, 2010, 07:37 AM
#4
Re: structure alignment on x86_64
Originally Posted by monarch_dodra
Well, for starters, you need to guarantee that types like u_short are indeed 2 bytes long. I don't know what your u_short is, but if it is just an alias for the type unsigned short, it could be 4 bytes long on some platforms.
All the sizes of the data types listed in my posting have been verified to be correct using sizeof().
-
April 7th, 2010, 07:51 AM
#5
Re: structure alignment on x86_64
Let me clarify the reason why I posted this thread.
Assume we have the following "u_char" binary data (call it myData), 23 bytes long, shown here in hex:
Code:
55 11 93 2b 55 6a 56 5c 60 2d 98 ce 6c 8c 10 4f 10 00 50 11 f7 1a 38
After casting myData into the following C structure:
Code:
struct msg
{
struct in_addr ipSrc, ipDst;
u_int32_t thSeq;
u_int32_t thAck;
u_char thFlags;
u_short thSport;
u_short thDport;
u_short thWin;
u_char data[1500];
};
struct msg* myMsg = (struct msg*)myData;
I CANNOT read the values thSport, thDport and thWin back from myMsg. myMsg->thSport, myMsg->thDport and myMsg->thWin simply show "meaningless" values instead of 00 50, 11 f7 and 1a 38 respectively.
However, I can read ipSrc (55 11 93 2b), ipDst (55 6a 56 5c), thSeq (60 2d 98 ce), thAck (6c 8c 10 4f) and thFlags (10) successfully from myMsg.
To me, this seems to be a structure alignment problem.
Any ideas ?
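For comparison, the same bytes can be picked apart by offset, without any cast. A minimal sketch, assuming the multi-byte fields are stored most-significant byte first (network byte order), which the post does not state; `read_be16`, `read_be32` and `sample_bytes` are names invented here:

```c
#include <stdint.h>  /* fixed-width integer types */

/* Read a 16-bit value stored most-significant byte first. */
uint16_t read_be16(const unsigned char *p)
{
    return (uint16_t)(((unsigned)p[0] << 8) | (unsigned)p[1]);
}

/* Read a 32-bit value stored most-significant byte first. */
uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* The 23 bytes from the post above, annotated with the intended fields. */
const unsigned char sample_bytes[23] = {
    0x55, 0x11, 0x93, 0x2b,  /* ipSrc,   offset 0  */
    0x55, 0x6a, 0x56, 0x5c,  /* ipDst,   offset 4  */
    0x60, 0x2d, 0x98, 0xce,  /* thSeq,   offset 8  */
    0x6c, 0x8c, 0x10, 0x4f,  /* thAck,   offset 12 */
    0x10,                    /* thFlags, offset 16 */
    0x00, 0x50,              /* thSport, offset 17 */
    0x11, 0xf7,              /* thDport, offset 19 */
    0x1a, 0x38               /* thWin,   offset 21 */
};
```

With these helpers, `read_be16(sample_bytes + 17)` yields 0x0050 (80), `read_be16(sample_bytes + 19)` yields 0x11f7 and `read_be16(sample_bytes + 21)` yields 0x1a38. If the compiler places thSport at offset 18, the cast instead picks up the bytes 50 11, which would explain the "meaningless" values.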
-
April 7th, 2010, 08:29 AM
#6
Re: structure alignment on x86_64
Casting pointers to data into structures is always fraught with peril, being so dependent on compiler settings, version, vendor and endianness. The only really safe way is to create serialisation functions that convert to and from a known data format.
"It doesn't matter how beautiful your theory is, it doesn't matter how smart you are. If it doesn't agree with experiment, it's wrong."
Richard P. Feynman
-
April 7th, 2010, 09:24 AM
#7
Re: structure alignment on x86_64
Originally Posted by aryan1
To me, this seems to be a structure alignment problem.
The problem is that you made an assumption on how your compiler aligned your data, and you were wrong.
There is 1 byte of padding after thFlags and before thSport. You made the mistake of supposing it was after thWin.
If you went on to read your array, you would have had more garbage in your data struct too (actually, an offset, but wrong is wrong).
You can either try to "outguess" your compiler, or take my advice and be explicit about your struct.
Code:
#include <iostream>
struct in_addr
{
int addr;
};
struct msg
{
in_addr ipSrc;
unsigned int : 0; //next align
in_addr ipDst; //8
unsigned int : 0; //next align
unsigned int thSeq : 32; //4
unsigned int thAck : 32; //4
unsigned char thFlags : 8; //1
unsigned int : 8; //1
unsigned short thSport : 16; //2
unsigned short thDport : 16; //2
unsigned short thWin : 16; //2
unsigned int : 0; //next align
unsigned char data[1500]; //1500
};
int main()
{
unsigned char values[1524] = {
55, 0, 0, 0, //ipSrc, next value is aligned
56, 0, 0, 0, //ipDst, next value is aligned
57, 0, 0, 0, //thSeq
58, 0, 0, 0, //thAck
'a', //thFlags
0, //This is what you missed. I know it is there, and further more, I forced my compiler to put it there.
05, 0, //thSport
05, 0, //thDport
05, 0, //thWin, next value is aligned
'a', 'b', 'c', 'd' //data
};
msg& aMsg = reinterpret_cast<msg&>(values);
std::cout << aMsg.ipSrc.addr << std::endl;
std::cout << aMsg.ipDst.addr << std::endl;
std::cout << aMsg.thSeq << std::endl;
std::cout << aMsg.thAck << std::endl;
std::cout << aMsg.thFlags << std::endl;
std::cout << aMsg.thSport << std::endl;
std::cout << aMsg.thDport << std::endl;
std::cout << aMsg.thWin << std::endl;
std::cout << aMsg.data[0] << std::endl;
std::cout << aMsg.data[1] << std::endl;
std::cout << aMsg.data[2] << std::endl;
std::cout << aMsg.data[3] << std::endl;
}
In the above example, you will notice that if you remove all the bitfields, it works too. But if you put them in, it is guaranteed to work.
Don't try to guess. Guarantee it.
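One more way to turn the guess into a guarantee is to have the compiler reject the build when the layout is not the expected one. A minimal sketch, assuming a C11 compiler (`_Static_assert`) and fixed-width stand-ins for the types; the name `msg_checked` is made up, and plain members with an explicit padding byte are used here instead of bit fields:

```c
#include <stddef.h>  /* offsetof */
#include <stdint.h>  /* fixed-width integer types */

/* Hypothetical variant of the struct with the padding byte written out. */
struct msg_checked {
    uint32_t ipSrc, ipDst;   /* stand-ins for struct in_addr (assumed 4 bytes each) */
    uint32_t thSeq;
    uint32_t thAck;
    uint8_t  thFlags;
    uint8_t  pad1;           /* explicit padding byte after thFlags */
    uint16_t thSport;
    uint16_t thDport;
    uint16_t thWin;
    uint8_t  data[1500];
};

/* Compilation fails if the compiler lays the struct out differently. */
_Static_assert(offsetof(struct msg_checked, thSport) == 18,
               "thSport must sit at offset 18");
_Static_assert(offsetof(struct msg_checked, data) == 24,
               "data must sit at offset 24");
_Static_assert(sizeof(struct msg_checked) == 1524,
               "no hidden padding expected");
```

The checks cost nothing at run time. They only pin down offsets and size, so byte order still has to be handled separately.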
EDIT: On a side note, you should give up on data alignment and take Paul's (and my) advice of explicitly serializing the data into a known format.
Last edited by monarch_dodra; April 7th, 2010 at 09:58 AM.
-
April 7th, 2010, 03:44 PM
#8
Re: structure alignment on x86_64
Originally Posted by monarch_dodra
EDIT: On a side note, you should give up on data alignment and take Paul's (and my) advice of explicitly serializing the data into a known format.
Nope, that wasn't me this time.
But I agree that it would be better if the data were serialized, instead of taking the cheap, error-prone approach of writing from or reading into a struct, where you are at the mercy of the compiler.
Regards,
Paul McKenzie
-
April 7th, 2010, 03:57 PM
#9
Re: structure alignment on x86_64
Originally Posted by aryan1
On x86_64, is the structure definition above correct ?
The struct is for the purposes of grouping the data together in a logical fashion -- it shouldn't be used as an argument to fwrite(), fread() or whatever function you used to write/read the data. Instead proper serialization should be used.
By proper serialization, you take each member of your struct, and write it to the file in a consistent manner. And on the read, you create an empty structure, and fill it in, one member at a time, with the contents of the file.
Doing a "bulk" write or read of the raw struct is full of danger, even though you see a lot of tutorials showing this being done -- in real practice, don't do it unless you can guarantee you will be using the same compiler, same version of the compiler, same OS, same padding, same compiler options, for the lifetime of your application.
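The member-by-member approach described above could look roughly like this. A minimal sketch, assuming a big-endian wire format with no padding (23 header bytes followed by 1500 data bytes) and fixed-width stand-ins for the types; `msg_fields`, `msg_serialize`, `msg_deserialize` and the `put_`/`get_` helpers are all invented names:

```c
#include <stdint.h>  /* fixed-width integer types */
#include <string.h>  /* memcpy */

enum { MSG_DATA_LEN = 1500, MSG_WIRE_LEN = 23 + MSG_DATA_LEN };

/* In-memory form; its padding no longer matters. */
struct msg_fields {
    uint32_t ipSrc, ipDst, thSeq, thAck;
    uint8_t  thFlags;
    uint16_t thSport, thDport, thWin;
    uint8_t  data[MSG_DATA_LEN];
};

/* Store a 16-bit value most-significant byte first; return the next position. */
static unsigned char *put_u16(unsigned char *p, uint16_t v)
{
    p[0] = (unsigned char)(v >> 8);
    p[1] = (unsigned char)(v & 0xff);
    return p + 2;
}

/* Store a 32-bit value most-significant byte first; return the next position. */
static unsigned char *put_u32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)((v >> 16) & 0xff);
    p[2] = (unsigned char)((v >> 8) & 0xff);
    p[3] = (unsigned char)(v & 0xff);
    return p + 4;
}

/* Load a 16-bit big-endian value; return the next position. */
static const unsigned char *get_u16(const unsigned char *p, uint16_t *v)
{
    *v = (uint16_t)(((unsigned)p[0] << 8) | (unsigned)p[1]);
    return p + 2;
}

/* Load a 32-bit big-endian value; return the next position. */
static const unsigned char *get_u32(const unsigned char *p, uint32_t *v)
{
    *v = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
         ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
    return p + 4;
}

/* Write every member into out[MSG_WIRE_LEN], one at a time, in a fixed order. */
void msg_serialize(const struct msg_fields *m, unsigned char *out)
{
    unsigned char *p = out;
    p = put_u32(p, m->ipSrc);
    p = put_u32(p, m->ipDst);
    p = put_u32(p, m->thSeq);
    p = put_u32(p, m->thAck);
    *p++ = m->thFlags;
    p = put_u16(p, m->thSport);
    p = put_u16(p, m->thDport);
    p = put_u16(p, m->thWin);
    memcpy(p, m->data, MSG_DATA_LEN);
}

/* Fill an empty struct from in[MSG_WIRE_LEN], one member at a time. */
void msg_deserialize(const unsigned char *in, struct msg_fields *m)
{
    const unsigned char *p = in;
    p = get_u32(p, &m->ipSrc);
    p = get_u32(p, &m->ipDst);
    p = get_u32(p, &m->thSeq);
    p = get_u32(p, &m->thAck);
    m->thFlags = *p++;
    p = get_u16(p, &m->thSport);
    p = get_u16(p, &m->thDport);
    p = get_u16(p, &m->thWin);
    memcpy(m->data, p, MSG_DATA_LEN);
}
```

The sender would call `msg_serialize` into a buffer of `MSG_WIRE_LEN` bytes and put that buffer on the queue; the receiver would call `msg_deserialize` on what it takes off. Neither side depends on how its own compiler pads `struct msg_fields`.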
Regards,
Paul McKenzie
-
April 7th, 2010, 04:08 PM
#10
Re: structure alignment on x86_64
Originally Posted by Paul McKenzie
Nope, that wasn't me this time.
Terribly sorry. I must have been reading another thread you had replied to. I'll be more careful next time. It's no big deal here, but kind of inconsiderate. I'll avoid putting words in your mouth again.
-
April 7th, 2010, 04:14 PM
#11
Re: structure alignment on x86_64
Originally Posted by monarch_dodra
Terribly sorry. I must have been reading another thread you had replied to. I'll be more careful next time. It's no big deal here, but kind of inconsiderate. I'll avoid putting words in your mouth again.
No problem. No need to apologize.
Regards,
Paul McKenzie
-
April 8th, 2010, 02:52 AM
#12
Re: structure alignment on x86_64
Originally Posted by monarch_dodra
Terribly sorry. I must have been reading another thread you had replied to.
Or my reply to this thread
http://www.codeguru.com/forum/showpo...20&postcount=6