Thread: sizeof(char)

  1. #1
    Join Date
    Jun 2002
    Location
    Germany
    Posts
    1,557

    sizeof(char)

    Gurus,

    Is sizeof(char) == 1 specified in ANSI C/C++ or can there be compiler-dependent aspects?

    Chris.

    Code:
    char* pc = reinterpret_cast<char*>(&mem_area);
    
    pc += 128;
    You're gonna go blind staring into that box all day.

  2. #2
    Join Date
    May 2000
    Location
    KY, USA
    Posts
    18,652
    They are not specified by ANSI. The sizes of the standard data types depend on the specific machine. Most likely a 'char' will always be one byte nowadays, but it is not guaranteed...

  3. #3
    Join Date
    Oct 2002
    Location
    Florida
    Posts
    33
    sizeof (char) = 1. I've used DSP processors (TMS320C44 et al.) that are strict 32-bit engines (no byte selects in the hardware). For these processors, the sizeof (char) = sizeof (int) = sizeof (long) = 1, because all are implemented in 32-bits.

    We got burned trying to port some algorithms from a DOS (Intel) application because the memcpy routines that were used assumed that a "long" was four times the size of a "char". We rewrote these routines to observe the ratio of

    sizeof (long) / sizeof (char)
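
    A minimal sketch of that kind of rewrite, assuming the original routines hard-coded a factor of 4. The names copy_longs and chars_per_long are hypothetical and are not taken from the ported application:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper, not the routine from the original port.
// The byte count passed to memcpy comes from sizeof(long), so it is
// correct whether a long occupies 4 chars (typical PC) or 1 char
// (a DSP where char and long are both 32 bits wide).
void copy_longs(long* dst, const long* src, std::size_t count)
{
    assert(dst != nullptr && src != nullptr);
    std::memcpy(dst, src, count * sizeof(long));
}

// Number of chars per long on the current platform, computed by the
// compiler instead of being assumed to be 4.
constexpr std::size_t chars_per_long = sizeof(long) / sizeof(char);
```

    Since sizeof(char) is 1, the ratio is simply sizeof(long); writing it out only documents the intent.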

  4. #4
    Join Date
    Mar 2002
    Location
    California
    Posts
    1,582
    Bjarne Stroustrup, The C++ Programming Language, section 4.6:
    Sizes of C++ objects are expressed in terms of multiples of the size of a char, so by definition, the size of a char is 1.
    Jeff

  5. #5
    Join Date
    May 2000
    Location
    KY, USA
    Posts
    18,652
    Originally posted by jfaust
    Bjarne Stroustrup, The C++ Programming Language, section 4.6:
    Sizes of C++ objects are expressed in terms of multiples of the size of a char, so by definition, the size of a char is 1.
    Jeff
    Well...yes and no. In the same chapter some sentences later...

    "Additionally it will be guaranteed that a 'char' has at least 8 bits, a 'short' at least 16 bits..."

    There also exist machines where a 'char' consists of 32 bits, as he also mentions later...
    Last edited by Andreas Masur; October 17th, 2002 at 04:25 PM.

  6. #6
    Join Date
    Mar 2002
    Location
    California
    Posts
    1,582
    A char can be of different sizes, as far as bits are concerned, but in all cases, sizeof(char) will equal 1. This may be what you were saying, in which case I'm merely expanding on your point.

    Jeff
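
    The distinction can be sketched in code: sizeof counts in units of char, while CHAR_BIT from <climits> reports how many bits that unit holds. The helper name bits_in is hypothetical:

```cpp
#include <climits>   // CHAR_BIT
#include <cstddef>

// Guaranteed by the language: sizeof is measured in units of char.
static_assert(sizeof(char) == 1, "sizeof(char) is 1 by definition");
static_assert(sizeof(signed char) == 1 && sizeof(unsigned char) == 1,
              "all three character types have size 1");

// What varies between platforms is the number of bits in that unit.
static_assert(CHAR_BIT >= 8, "a char has at least 8 bits");

// Hypothetical helper: storage width of a type in bits on this platform.
template <typename T>
constexpr std::size_t bits_in()
{
    return sizeof(T) * CHAR_BIT;
}
```

    On a typical PC bits_in<char>() is 8; on a machine with 32-bit chars it would be 32, while sizeof(char) stays 1 on both.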

  7. #7
    Join Date
    Jun 2002
    Posts
    1,417
    Originally posted by BaroloMan
    For these processors, the sizeof (char) = sizeof (int) = sizeof (long) = 1, because all are implemented in 32-bits.
    That is interesting -- Is that correct -- for a 32-bit char, the sizeof operator returns 1?? I'll bet that would break a whole lot of programs written for PC and ported to that operating system.

  8. #8
    Join Date
    Mar 2002
    Location
    California
    Posts
    1,582
    The size of a 32-bit char is one. There's nothing else to return.

    Some more from Stroustrup:

    This is what is guaranteed about sizes of fundamental types:
    1 = sizeof(char) <= sizeof(short) <= sizeof(int) <= sizeof(long)
    1 <= sizeof(bool) <= sizeof(long)
    sizeof(char) <= sizeof(wchar_t) <= sizeof(long)
    sizeof(float) <= sizeof(double) <= sizeof(long double)
    sizeof(N) = sizeof(signed N) = sizeof(unsigned N)
    So, having char, int, and long all the same size is perfectly legal. Any code that breaks due to this is in error, since it is not standard-compliant.

    Jeff
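
    As a rough sketch, the relations quoted from Stroustrup can be written as compile-time checks, so that a platform violating one of them fails to build. The function name sizes_are_ordered is hypothetical:

```cpp
// Compile-time restatement of the size relations listed above.
// Equality between neighbours is allowed, so a platform where
// char, int and long all have size 1 passes every check.
constexpr bool sizes_are_ordered()
{
    return sizeof(char) == 1
        && sizeof(char) <= sizeof(short)
        && sizeof(short) <= sizeof(int)
        && sizeof(int) <= sizeof(long)
        && 1 <= sizeof(bool) && sizeof(bool) <= sizeof(long)
        && sizeof(char) <= sizeof(wchar_t) && sizeof(wchar_t) <= sizeof(long)
        && sizeof(float) <= sizeof(double)
        && sizeof(double) <= sizeof(long double)
        && sizeof(int) == sizeof(signed int)
        && sizeof(int) == sizeof(unsigned int);
}

static_assert(sizes_are_ordered(),
              "fundamental type sizes violate the listed relations");
```

    Nothing here says how large any type is in absolute terms, which is why code that assumes sizeof(long) == 4 is not covered by these guarantees.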

  9. #9
    Join Date
    May 2000
    Location
    KY, USA
    Posts
    18,652
    Originally posted by jfaust
    This may be what you were saying, in which case I'm merely expanding on your point.

    Jeff
    Errrm....yes that is what I wanted to say basically. I mixed up a little bit between the actual size and what 'sizeof()' is supposed to return...it looks like I should stop writing posts after 10 pm....

    Thank you for paying attention...
    Last edited by Andreas Masur; October 18th, 2002 at 12:51 AM.

  10. #10
    Join Date
    Jun 2002
    Posts
    1,417
    Doesn't that make it very difficult, if not impossible, to transfer data from one OS to another? If, in a socket program, the OS on one end sends an 8-bit character how would that be received by the program running on the other end that expects a 32-bit character? And how about data files, are they affected by this too?

  11. #11
    Join Date
    Oct 2002
    Location
    Florida
    Posts
    33
    To stober,

    The sizeof operator deals with the storage size for a given type. When one operating system sends a packet to another operating system, each data character is defined as eight bits, according to the protocol specifications. It just turns out that the system that declares a char to be 32 bits "wastes" 24 bits of storage for each character it internally manages, because the compiler design properly determined that processor utilization was more important than memory utilization.
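
    One way to picture this is a pair of helpers that turn a 32-bit value into four protocol octets and back. This is only a sketch: the names put_u32 and get_u32 are hypothetical, and most-significant-octet-first ordering is an assumption made here, not something stated in the thread:

```cpp
// Hypothetical helper: split a 32-bit value into four 8-bit protocol
// octets, most significant first (an assumed wire order).
// Each octet lives in its own unsigned char. Masking with 0xFF keeps
// only the low 8 bits, so the result is the same whether the local
// char is 8 bits or 32 bits wide.
void put_u32(unsigned long value, unsigned char out[4])
{
    out[0] = static_cast<unsigned char>((value >> 24) & 0xFFu);
    out[1] = static_cast<unsigned char>((value >> 16) & 0xFFu);
    out[2] = static_cast<unsigned char>((value >> 8) & 0xFFu);
    out[3] = static_cast<unsigned char>(value & 0xFFu);
}

// Hypothetical inverse: rebuild the value from four received octets.
unsigned long get_u32(const unsigned char in[4])
{
    return (static_cast<unsigned long>(in[0] & 0xFFu) << 24)
         | (static_cast<unsigned long>(in[1] & 0xFFu) << 16)
         | (static_cast<unsigned long>(in[2] & 0xFFu) << 8)
         |  static_cast<unsigned long>(in[3] & 0xFFu);
}
```

    Because the octets are produced by shifting and masking rather than by reinterpreting memory, the wire format does not depend on sizeof(long) or on the local byte order.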

  12. #12
    Join Date
    Jun 2002
    Location
    Germany
    Posts
    1,557

    The size of a man...

    Gurus,

    I studied up on this one a bit.

    ISO/IEC 9899 "Programming Languages -- C" specifies that the minimum size of char is 8 bits. The internal storage size of a character may be larger.

    I have concluded that sizeof(char) is compiler-dependent. Fortunately, I think most common compilers for most common platforms store chars in single bytes.

    The interesting topics relating to communications software must certainly be handled within the protocol definitions and implementations (interesting comments from BaroloMan, stober).

    The sizeof(char) topic always seems to arise when one wants to access the single bytes of some larger data type as individual, adjacent characters. Looking at the following code sequence, it would seem that the 4 bytes of the float will be properly stored in the character array. However, this code is not proper, since its proper function hinges on the assumption that characters are 1 byte in size.

    I have never found a platform-independent implementation for this kind of function. One could reinterpret_cast the address to a DWORD beforehand, then increment by one, etc. However, this relies on the fact that addresses are no wider than 32 bits. I think this is one of the several dark alleys of C/C++.

    Chris.

    Code:
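    int main(int argc, char* argv[])
    {
      float the_float = static_cast<float>(1.23456789);
    
      // Read the float's storage one char at a time.
      // The hard-coded 4 assumes sizeof(float) == 4, which holds
      // only when a float spans exactly four chars.
      char float_data[4] =
      {
        *(reinterpret_cast<char*>(&the_float) + 0),
        *(reinterpret_cast<char*>(&the_float) + 1),
        *(reinterpret_cast<char*>(&the_float) + 2),
        *(reinterpret_cast<char*>(&the_float) + 3)
      };
    
      return 0;
    }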
    You're gonna go blind staring into that box all day.
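
    A possible platform-independent variant of the float example, offered as a sketch rather than a definitive answer: size the array with sizeof(float) instead of 4 and copy the storage with std::memcpy. The function names float_bytes and float_from_bytes are hypothetical:

```cpp
#include <array>
#include <cstring>

// Hypothetical helper: copy the storage of a float into an array of
// unsigned char. The array length is sizeof(float), which is 4 on a
// typical PC and 1 on a platform where char and float are both 32 bits.
std::array<unsigned char, sizeof(float)> float_bytes(float value)
{
    std::array<unsigned char, sizeof(float)> bytes{};
    std::memcpy(bytes.data(), &value, sizeof(float));
    return bytes;
}

// Hypothetical inverse: rebuild the float from its stored chars.
float float_from_bytes(const std::array<unsigned char, sizeof(float)>& bytes)
{
    float value = 0.0f;
    std::memcpy(&value, bytes.data(), sizeof(float));
    return value;
}
```

    This makes no assumption about how many chars a float occupies. The order of the chars in the array still depends on the platform, so the result is suitable for a local round trip but not, as it stands, for exchange between machines.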

  13. #13
    Join Date
    Oct 2002
    Location
    Florida
    Posts
    33
    dude_1967:

    The only correct conclusion is sizeof(char) = 1. Please review all comments by jfaust.

    The confusion may stem from the fact that the overwhelming majority of processors support byte accesses of memory, so the overwhelming majority of compilers implement one char per byte. However, the sizeof operator does not literally refer to the number of physical bytes.

    If a compiler does not evaluate sizeof(char) to 1, it is non-compliant.

    About your example of looking at components of a float via multiple chars, let's talk about little endian/big endian.
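
    A small sketch of the byte-order issue, assuming a platform where an unsigned long spans more than one char. The names lowest_addressed_byte and is_little_endian are hypothetical:

```cpp
#include <cstring>

// Hypothetical helper: return the char stored at the lowest address
// of an unsigned long value.
unsigned char lowest_addressed_byte(unsigned long value)
{
    unsigned char first = 0;
    std::memcpy(&first, &value, 1);
    return first;
}

// Hypothetical helper: true when the least significant char is stored
// first (little endian), false otherwise. On a platform where
// sizeof(unsigned long) == 1 there is no byte order to detect and
// this returns true.
bool is_little_endian()
{
    return lowest_addressed_byte(1UL) == 1;
}
```

    The same effect applies to the float example: element 0 of the char array holds a different part of the value on a little-endian machine than on a big-endian one, so code that inspects individual chars has to account for the order as well as the count.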
