Thoughts on C++ integer types

January 8th, 2012, 03:51 PM
Bssldr

Thoughts on C++ integer types

The size of C++ integer types is implementation specific and one should use the types provided by the <cstdint> header for the code to be portable (std::uint32_t etc).

So, when you want to write safe code you're going to use the typedef'd types - this means that everyone who doesn't want their code to explode would use these types. What use do I have for the 'pure types' then? Might as well remove them and make the typedef'd types the 'pure types'. And since it's a bit annoying to write the names of the typedef'd types, I might as well rename them to the previous 'pure types'.

I don't see how it would hurt if the standard said how big the types should be. Right now you can't make assumptions about the size, if you could it would only be better. This situation is incredibly stupid, can't believe they didn't fix this for the C++11 standard.
January 8th, 2012, 04:12 PM
D_Drmmr

Re: Thoughts on C++ integer types

Quote:

Originally Posted by Bssldr

I don't see how it would hurt if the standard said how big the types should be.

What if the platform you want to build an application for doesn't support certain sizes? If the standard demanded that types of certain sizes be available, it would be impossible to develop a compliant C++ compiler for that platform.

Btw, if I'm not mistaken std::uint32_t is an optional typedef. If you want your code to be really portable, you should use std::uint_least32_t.
January 8th, 2012, 04:28 PM
monarch_dodra

Re: Thoughts on C++ integer types

The standard doesn't specify type sizes because C and C++ should be able to run on ANY machine, including machines where bytes are defined as 10 bits, 128 bits, or even some fancy non-binary "tri-bits". As such, the standard defines "int" as the current machine's natural size, and "byte" as its smallest accessible size. From there, it is the developer's (and the compiler's) responsibility to adapt. C and C++ are machine-oriented first, abstract second.

That said, I do want to question your ideal of using only typedefed integers. Overuse of things like int32 is THE guaranteed way to make your code NON-portable.

Sure, when you want to store a number that represents a great physical amount, then you can use an int64... But see, this is an invariant: it is meant to hold a big number on any machine.

But now, say you are writing a "dynamic array class" (vector). You're thinking "better make it portable", so should it be indexed with an int32 or an int64? You have to choose very carefully, because the right answer might differ between machines, and you want to make your code portable, right?

I'd say either choice is actually the wrong choice. Things like indexes, offsets, etc. are specifically machine dependent and, by design, variant: if the machine supports 64 bit addresses, good for it; if not, why burden it with huge integers when a smaller, simpler one is OK? The correct answer (imo) is to purposefully use the abstract types such as "size_t" or "ptrdiff_t", which are set to the current machine's best size. Your code will run great on any machine, and the beautiful part is that it remains portable, yet you never put an ounce of thought into the actual size.

I can give you a real-world example from where I work, where developers prided themselves on their "portable" use of int32 and int64. The result was that when it came time to run the program on a 64 bit machine, it completely failed the portability test. Sure, they had a lot of int64 all over the place, but they had a few key int32 that should not have been int32. If you want your code to be portable, use types that are actually portable. int32 and int64 are just invariant, not necessarily portable.

...but, yeah, that's my personal opinion on the matter. Feel free to disagree.
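
As a rough illustration of the size_t / ptrdiff_t point, here is a minimal sketch. The helper names find_index and distance_between are made up for this example; only the two types come from the standard library.

```cpp
#include <cstddef>

// Hypothetical helper: index of the first element equal to `needle`,
// or `count` if it is absent. std::size_t can index any array the
// platform is able to hold, whatever the address width is.
inline std::size_t find_index(const int* data, std::size_t count, int needle) {
    for (std::size_t i = 0; i < count; ++i) {
        if (data[i] == needle) {
            return i;
        }
    }
    return count;
}

// std::ptrdiff_t is the signed type produced by subtracting two pointers.
inline std::ptrdiff_t distance_between(const int* first, const int* last) {
    return last - first;
}
```

Neither function mentions a bit width, so the same source should build unchanged on a 32 bit or a 64 bit target.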
January 8th, 2012, 04:35 PM
monarch_dodra

Re: Thoughts on C++ integer types

Quote:

Originally Posted by D_Drmmr

Btw, if I'm not mistaken std::uint32_t is an optional typedef. If you want your code to be really portable, you should use std::uint_least32_t.

You are not mistaken. You can also use std::uint_fast32_t, depending on your needs.
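
For anyone who wants to see the "least" and "fast" typedefs in use, here is a small sketch. The function name sum_mod_2_32 is hypothetical; the typedefs themselves are the ones from <cstdint>.

```cpp
#include <climits>
#include <cstddef>
#include <cstdint>

// std::uint_least32_t and std::uint_fast32_t always exist, whereas
// std::uint32_t exists only where an exact 32-bit type is available.
static_assert(sizeof(std::uint_least32_t) * CHAR_BIT >= 32,
              "uint_least32_t holds at least 32 bits");
static_assert(sizeof(std::uint_fast32_t) * CHAR_BIT >= 32,
              "uint_fast32_t holds at least 32 bits");

// Hypothetical helper: sums the values modulo 2^32 without needing an
// exact-width type. The explicit mask keeps the running total within
// 32 bits even on platforms where the chosen types are wider.
inline std::uint_least32_t sum_mod_2_32(const std::uint_least32_t* values,
                                        std::size_t count) {
    std::uint_fast32_t total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        total = (total + values[i]) & 0xFFFFFFFFu;
    }
    return static_cast<std::uint_least32_t>(total);
}
```

The mask is the price of not assuming an exact width: wrap-around has to be spelled out instead of relied upon.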
January 9th, 2012, 01:52 AM
nuzzle

Re: Thoughts on C++ integer types

Quote:

Originally Posted by Bssldr

What use do I have for the 'pure types' then?

How often must an integral type have an exact size?

And how often is a 16 bit integral type not big enough?

I'd say very rarely, and when it happens extreme care should be taken to handle the situation. In 99.9 percent of the cases the natural int is the right choice, but when it is not you have a red alert situation and special precautions are called for: a design change, specialized types, global typedefs, a formula rewrite, or as a minimum a big fat well-documented warning sign.
January 9th, 2012, 01:15 PM
Lindley

Re: Thoughts on C++ integer types

A bigger issue in portability is ensuring that you don't use types in your interfaces which will not be large enough in some cases.

For instance, fseek() uses a long int for its offset parameter. This works great on 32-bit systems, and on 64-bit Linux. On 64-bit Windows, however, the long type is still a 32-bit type, which leaves fseek() unable to address offsets beyond 2 GB.
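
One way to guard against this is to check an offset before handing it to fseek(). The sketch below assumes a hypothetical helper named fits_in_long; where it returns false, a platform-specific call with a wider offset type (for example fseeko on POSIX systems or _fseeki64 in the Microsoft C runtime) would be needed instead.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: true if a 64-bit file offset can be passed
// through fseek's `long` parameter without being truncated.
inline bool fits_in_long(std::int_least64_t offset) {
    return offset >= std::numeric_limits<long>::min() &&
           offset <= std::numeric_limits<long>::max();
}
```

On a platform with a 64-bit long the check always passes for realistic offsets; with a 32-bit long it starts failing at 2^31.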
January 9th, 2012, 03:38 PM
ashikthomas

bool datatype

hi

Is the usage of bool datatype really advantageous? Particularly in the case of memory usage and processing speed? Does it save memory if the variable is declared as "register bool"??

thanks
ashikthomas
January 9th, 2012, 05:25 PM
monarch_dodra

Re: bool datatype

Quote:

Originally Posted by ashikthomas

hi

Is the usage of bool datatype really advantageous? Particularly in the case of memory usage and processing speed? Does it save memory if the variable is declared as "register bool"??

thanks
ashikthomas

The point of bool is not really memory size, but rather C++'s strong static typing, which guarantees that a bool is either true or false (and NEVER* anything else). This means you can compare two "true" bools and they will be equal. On the other hand, comparing two "true" integers is harder, as their values will both be non-0 but not necessarily equal.

A bool's size is implementation-defined (e.g. not necessarily 1)...
...and unless I'm mistaken, it may even be context dependent! E.g. 1 byte when in a table, but 4 when in a struct... Not sure though.

"register" is deprecated in C++. In C, I think most compilers largely ignore the keyword anyway, since they optimize so well.

"register" is only useful on embedded systems where no powerful compilers are available and the ones you have merely do a straight-up C to assembly conversion. The compiler then has to rely on humans for help. Of course, once you reach such low levels of code, even things like "bool" become too abstract!
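
A minimal sketch of that difference (the function names same_int and same_truth are invented for the example):

```cpp
// Two "true" ints need not compare equal: 1 and 2 are both non-zero,
// yet they are different values.
inline bool same_int(int a, int b) {
    return a == b;
}

// Converting each int to bool first collapses every non-zero value to
// true, so the comparison is about truth rather than about the number.
inline bool same_truth(int a, int b) {
    return static_cast<bool>(a) == static_cast<bool>(b);
}
```

With these, same_int(1, 2) is false while same_truth(1, 2) is true.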

As with most things C++, write code for humans to understand, use types that are most adequate for your design, and don't try to optimize what your compiler is already great at doing.

----
*Except after undefined behavior: once you have done something that qualifies as undefined behavior, a bool can actually be neither true nor false, but that is a corner case.
January 10th, 2012, 08:32 AM
ahoodin

Re: Thoughts on C++ integer types

What if the standard says integers will be 64 Bit and new standard processors are 128 Bit? How will I take advantage of the increased efficiency if the standard doesn't permit me to?
January 10th, 2012, 01:05 PM
Bssldr

Re: Thoughts on C++ integer types

Quote:

Originally Posted by monarch_dodra

The standard doesn't specify type sizes because C and C++ should be able to run on ANY machine, including machines where bytes are defined as 10 bits, 128 bits, or even some fancy non-binary "tri-bits". As such, the standard defines "int" as the current machine's natural size, and "byte" as its smallest accessible size. From there, it is the developer's (and the compiler's) responsibility to adapt. C and C++ are machine-oriented first, abstract second.

That said, I do want to question your ideal of using only typedefed integers. Overuse of things like int32 is THE guaranteed way to make your code NON-portable.

Think of a situation like this: you need to use an 8 byte integer and if it's smaller your code doesn't work as expected. I'd rather see the compiler tell me that the target system doesn't support an 8 byte integer than have a smaller one that makes my program function incorrectly.

To overcome the problem of the target architecture not having a type of required size the compiler could output code that emulates the type by using multiple smaller types.
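
To make the emulation idea concrete, here is a rough sketch of a 64-bit unsigned addition built from two 32-bit halves. The EmulatedU64 struct and add_u64 function are hypothetical names, and this is only an illustration of the technique, not what any particular compiler generates.

```cpp
#include <cstdint>

// Hypothetical emulated 64-bit unsigned integer made of two halves,
// each of which is expected to hold a value below 2^32.
struct EmulatedU64 {
    std::uint_least32_t high;
    std::uint_least32_t low;
};

// Adds two emulated values modulo 2^64, carrying from the low half
// into the high half.
inline EmulatedU64 add_u64(EmulatedU64 a, EmulatedU64 b) {
    const std::uint_least32_t mask = 0xFFFFFFFFu;
    EmulatedU64 result;
    result.low = (a.low + b.low) & mask;
    // After masking, the low sum is smaller than an operand exactly
    // when the 32-bit addition wrapped around.
    const std::uint_least32_t carry = (result.low < (a.low & mask)) ? 1u : 0u;
    result.high = (a.high + b.high + carry) & mask;
    return result;
}
```

Adding {0, 0xFFFFFFFF} and {0, 1} this way gives {1, 0}, i.e. 2^32.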

Quote:

Originally Posted by ahoodin

What if the standard says integers will be 64 Bit and new standard processors are 128 Bit? How will I take advantage of the increased efficiency if the standard doesn't permit me to?

How long do you think this kind of thing can continue without introducing new types? One day there'll be 512 bit processors - this means that there'll be additional types of sizes 128, 256, 512 bits. Where are you going to put them? Are you just going to take the current int, long, long long and say their sizes are 128, 256, 512 bits? How are people going to use 32 and 64 bit types then? C++ should have fixed size types and new types should be introduced as the processors evolve.
January 10th, 2012, 01:19 PM
Lindley

Re: Thoughts on C++ integer types

It's going to be a while before we really need more than 64 bit addresses. It may be convenient to be able to store and efficiently process larger integers, but address space is what drives processor design.
January 10th, 2012, 01:56 PM
Eri523

Re: Thoughts on C++ integer types

Quote:

Originally Posted by Bssldr

Think of a situation like this: you need to use an 8 byte integer and if it's smaller your code doesn't work as expected. I'd rather see the compiler tell me that the target system doesn't support an 8 byte integer than have a smaller one that makes my program function incorrectly.

That looks like a scenario where one would use a static assert on the size of the type in question. Of course, once that actually gets triggered, your only options are to modify your code or forget about that target system...
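
Such a check could look roughly like the sketch below. The function seconds_to_nanoseconds is a made-up example of code that needs the wide type. Note that C++11 already guarantees long long is at least 64 bits wide, so for this particular type the assertion mostly documents the requirement; the same pattern applies to any other type the code depends on.

```cpp
#include <climits>

// Compilation stops here if the type cannot hold 64 bits.
static_assert(sizeof(long long) * CHAR_BIT >= 64,
              "this code needs an integer type of at least 64 bits");

// Hypothetical function that relies on the assumption checked above:
// the result overflows a 32-bit integer after only a few seconds.
inline long long seconds_to_nanoseconds(long long seconds) {
    return seconds * 1000000000LL;
}
```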
January 10th, 2012, 03:19 PM
monarch_dodra

Re: Thoughts on C++ integer types

Quote:

Originally Posted by Bssldr

Think of a situation like this: you need to use an 8 byte integer and if it's smaller your code doesn't work as expected.

In that case, you can use an int64. That said, it is imperative that you distinguish between:
*I want 64 bits because I want to hold a big number <- Legit use case
*I want 64 bits because I want to be portable on a 64 bit machine. <- Not legit, because a 32 bit machine WILL require a 32 bit integer.

Quote:

Originally Posted by Bssldr

How long do you think this kind of thing can continue without introducing new types? One day there'll be 512 bit processors - this means that there'll be additional types of sizes 128, 256, 512 bits. Where are you going to put them?

The day this happens, processors will not be able to access data in blocks smaller than 32 bits, so "byte" will be 32 bits. int will be 128, long will be 256 and long long will be 512. This has happened before: Processors started as 16 bit machines, and today, they are 64 bit. Everything still seems to be working fine.

Ever notice there is no "bit" type in C++? Why do you think that is?
January 10th, 2012, 07:49 PM
Bssldr

Re: Thoughts on C++ integer types

Quote:

Originally Posted by monarch_dodra

The day this happens, processors will not be able to access data in blocks smaller than 32 bits, so "byte" will be 32 bits. int will be 128, long will be 256 and long long will be 512. This has happened before: Processors started as 16 bit machines, and today, they are 64 bit. Everything still seems to be working fine.

Ever notice there is no "bit" type in C++? Why do you think that is?

The current C++ standard says that the size of a char is guaranteed to be 1 byte. The sizes for all other types are defined as the following: char <= short <= int <= long <= long long - they might as well all be 1 byte.

Anyway, I find this whole thing a bit crazy. I was trying to make up some coding standards for myself on how to write correct code, but it would take too much effort to be 100% correct. For example, when converting an integral value into a byte array I've so far assumed that a byte is 8 bits and my shifts are multiples of 8. Now it turns out CHAR_BIT is not fixed either.

I program for Windows only so I'll probably just go with the guarantees that Windows gives me on 32 and 64 bit systems. I hope 128 bit doesn't come too soon.

Does anyone happen to know what languages like Java and C# plan to do when the native types of the target architectures go bigger?
January 10th, 2012, 08:36 PM
laserlight

Re: Thoughts on C++ integer types

Quote:

Originally Posted by Bssldr

The sizes for all other types are defined as the following: char <= short <= int <= long <= long long - they might as well all be 1 byte.

Recall that you can determine the size by using sizeof, and that there are minimum limits for the range of each type.
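
A short sketch of both points: the minimum ranges can be checked at compile time with std::numeric_limits, and the size in bits follows from sizeof and CHAR_BIT. The template name bits_in is hypothetical.

```cpp
#include <climits>
#include <cstddef>
#include <limits>

// Minimum ranges: these hold on every conforming implementation,
// so the types cannot "all be 1 byte" unless that byte is very wide.
static_assert(std::numeric_limits<short>::max() >= 32767,
              "short covers at least 16 bits");
static_assert(std::numeric_limits<int>::max() >= 32767,
              "int covers at least 16 bits");
static_assert(std::numeric_limits<long>::max() >= 2147483647L,
              "long covers at least 32 bits");
static_assert(std::numeric_limits<long long>::max() >= 9223372036854775807LL,
              "long long covers at least 64 bits");

// Hypothetical helper: number of bits in the object representation of T.
template <typename T>
constexpr std::size_t bits_in() {
    return sizeof(T) * CHAR_BIT;
}
```

For example, bits_in<char>() is CHAR_BIT by definition, and bits_in<long long>() is at least 64.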

Quote:

Originally Posted by Bssldr

For example, when converting an integral value into a byte array I've so far assumed that a byte is 8 bits and my shifts are multiples of 8. Now it turns out CHAR_BIT is not fixed either.

Consider: how likely is it for CHAR_BIT != 8? If you can write the code to depend on CHAR_BIT, well and good, but if you must assume CHAR_BIT == 8, then write a static_assert to document this assumption.
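
A minimal sketch of that approach, assuming a hypothetical helper store_le32 that splits a value into four bytes, least significant first:

```cpp
#include <climits>
#include <cstdint>

// Documents the assumption: the build fails on a platform with wider bytes
// instead of silently producing a different byte layout.
static_assert(CHAR_BIT == 8, "this serialization code assumes 8-bit bytes");

// Hypothetical helper: writes the low 32 bits of `value` into out[0..3],
// least significant byte first.
inline void store_le32(std::uint_least32_t value, unsigned char* out) {
    for (int i = 0; i < 4; ++i) {
        out[i] = static_cast<unsigned char>((value >> (8 * i)) & 0xFFu);
    }
}
```

The shifts are still multiples of 8, but now the assumption behind them is stated where the compiler can enforce it.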
January 11th, 2012, 03:10 AM
monarch_dodra

Re: Thoughts on C++ integer types

Quote:

Originally Posted by Bssldr

The current C++ standard says that the size of a char is guaranteed to be 1 byte.

Too bad the standard doesn't define what the size of a byte is though! :lol:

I've worked on a machine where a byte was 16 bits. I know for a fact that there are machines out there that define a byte as being 64 bits long.
January 11th, 2012, 03:30 AM
superbonzo

Re: Thoughts on C++ integer types

Quote:

Originally Posted by laserlight

Consider: how likely is it for CHAR_BIT != 8? If you can write the code to depend on CHAR_BIT, well and good, but if you must assume CHAR_BIT == 8, then write a static_assert to document this assumption.

Yeah, that's the point Bssldr seems to be missing: the standard doesn't force you to write 100% portable code. You have the right to choose how portable your code should be: you can write code that works on one specific hardware platform, or even one OS or compiler, or you can write code that is supposed to work on the full range of hardware supporting a C++ compiler, including exotic ones. Both extremes have their advantages and costs; as always, it's up to you to weigh them and make the right choice. This is freedom and it's a d a m n good thing :) ( EDIT: apparently, we cannot write d-a-m-n without spaces ... oh my ... )

As others said, this choice does not necessarily make your code less robust, as there are many ways of ensuring correctness both at run time (through testing, which should be performed in any case) and at compile time, through static asserts, generic programming techniques (generic code can detect type properties and choose the correct algorithm for a specific (range of) implementation, for example) or even old-style preprocessor-based conditional compilation.
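
As one possible illustration of the generic programming route, the sketch below lets the compiler pick a type from its properties instead of hard-coding one. The names counter_t and to_millis are invented for the example.

```cpp
#include <limits>
#include <type_traits>

// Picks `int` where it offers at least 31 value bits, and falls back
// to `long` (guaranteed to have at least 31) everywhere else.
typedef std::conditional<(std::numeric_limits<int>::digits >= 31),
                         int, long>::type counter_t;

static_assert(std::numeric_limits<counter_t>::digits >= 31,
              "counter_t must cover the 32-bit signed range");

// Hypothetical function that needs more range than a 16-bit int offers.
inline counter_t to_millis(counter_t seconds) {
    return seconds * 1000;
}
```

On a typical desktop compiler counter_t is simply int; on a platform with a 16-bit int the same source selects long without any preprocessor branches.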
January 11th, 2012, 11:58 AM
Lindley

Re: Thoughts on C++ integer types

Quote:

Originally Posted by Bssldr

I hope 128 bit doesn't come too soon.

Unlikely. 64-bit machines can theoretically address up to 16 exabytes of RAM. That's over a billion gigabytes. Limitations in the OS (either incidental or, in the case of some Windows versions, intentional) have capped it much lower for most platforms, but the point is that we won't need more than 64-bit addressing for many, many years. Probably.
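
The arithmetic behind those figures can be checked at compile time. The constant names below are made up, and the units are the binary ones (2^30 bytes per gigabyte, 2^60 per exabyte).

```cpp
// 2^64 addressable bytes divided by 2^60 bytes per exabyte leaves 2^4.
constexpr unsigned long long addressable_exabytes = 1ULL << (64 - 60);
static_assert(addressable_exabytes == 16ULL, "64-bit addressing spans 16 exabytes");

// 2^64 addressable bytes divided by 2^30 bytes per gigabyte leaves 2^34.
constexpr unsigned long long addressable_gigabytes = 1ULL << (64 - 30);
static_assert(addressable_gigabytes == 17179869184ULL,
              "that is roughly 17 billion gigabytes");
```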
January 13th, 2012, 06:57 AM
JohnW@Wessex

Re: Thoughts on C++ integer types

Quote:

Originally Posted by Bssldr

The current C++ standard says that the size of a char is guaranteed to be 1 byte.

Actually, I think it says that sizeof(char) == 1. As monarch_dodra stated, it just doesn't specify what it is 'one' of.

I too have worked on a 16bit DSP where a 'char' was 16 bit. I've got a feeling that 'short' and 'int' were 16 bit too.
January 13th, 2012, 09:43 AM
laserlight

Re: Thoughts on C++ integer types

Quote:

Originally Posted by JohnW@Wessex

Actually, I think it says that sizeof(char) == 1. As monarch_dodra stated, it just doesn't specify what it is 'one' of.

It does, e.g.,

Quote:

Originally Posted by C++11 Clause 5.3.3 Paragraph 1a

The sizeof operator yields the number of bytes in the object representation of its operand.

The catch is that the number of bits in a byte is required to be at least 8, so it could be 16 bits in a byte, etc.
January 13th, 2012, 10:28 AM
Bssldr

Re: Thoughts on C++ integer types

How do you write networking code when one device has 8-bit bytes and the other one has 10-bit bytes? Doesn't this cause all kinds of problems? Or do the lower layers only extract 8 bits from your 'byte'?

If I'm on a machine with 16-bit bytes and write 10 bytes(which all have been checked to be between 0 and 255) to the hard disk, am I going to consume 20 'standard' bytes? Assuming that the bytes of storage devices are 8-bits long.
January 13th, 2012, 11:50 AM
monarch_dodra

Re: Thoughts on C++ integer types

Quote:

Originally Posted by Bssldr

How do you write networking code when one device has 8-bit bytes and the other one has 10-bit bytes? Doesn't this cause all kinds of problems? Or do the lower layers only extract 8 bits from your 'byte'?

If I'm on a machine with 16-bit bytes and write 10 bytes(which all have been checked to be between 0 and 255) to the hard disk, am I going to consume 20 'standard' bytes? Assuming that the bytes of storage devices are 8-bits long.

Those are some very good questions. And it could be (it is) even worse than that: some machines use big endian, whereas others use little endian. Fact: that DOES cause all kinds of problems. However, you have to keep in mind the source of the problem: the hardware. No amount of abstraction layers can really change that.

C++'s approach is your chance of a solution. How could you ever hope to write code on a 10 bit machine when you are using a language that defines that a byte MUST be 8 bits?

I'm not saying it's easy. I'm just saying it is a necessary evil.

----

PS: A "standard byte" is called an "octet" - defined as "8 bits"
A byte, on the other hand, is more abstractly defined as "a unit of digital information".
January 13th, 2012, 02:13 PM
Access_Denied

Re: Thoughts on C++ integer types

Quote:

Originally Posted by Bssldr

How do you write networking code when one device has 8-bit bytes and the other one has 10-bit bytes? Doesn't this cause all kinds of problems? Or do the lower layers only extract 8 bits from your 'byte'?

If I'm on a machine with 16-bit bytes and write 10 bytes(which all have been checked to be between 0 and 255) to the hard disk, am I going to consume 20 'standard' bytes? Assuming that the bytes of storage devices are 8-bits long.

This is indeed a problem, one that's usually solved by the networking libraries. For instance, when programming with Unix sockets, there are functions to convert numbers to a standard network format, and then functions to convert it back to machine specific format. This way, a 10 bit byte little endian machine can communicate with an 8 bit byte big endian machine, and neither will ever be aware of the difference.
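
The conversion functions meant here are presumably the POSIX htonl()/ntohl() family. The sketch below does not use them; it hand-rolls the same idea with hypothetical helpers encode_be32 and decode_be32, so that the wire format is defined in octets and depends on neither the host's byte order nor its byte width.

```cpp
#include <cstdint>

// Hypothetical encoder: splits the low 32 bits of `value` into four
// octets, most significant first (the order network protocols use).
// Each array element carries exactly one octet, however wide a byte is.
inline void encode_be32(std::uint_least32_t value, unsigned char* out) {
    out[0] = static_cast<unsigned char>((value >> 24) & 0xFFu);
    out[1] = static_cast<unsigned char>((value >> 16) & 0xFFu);
    out[2] = static_cast<unsigned char>((value >> 8) & 0xFFu);
    out[3] = static_cast<unsigned char>(value & 0xFFu);
}

// Hypothetical decoder: rebuilds the value from four octets, ignoring
// any bits above the low eight in each element.
inline std::uint_least32_t decode_be32(const unsigned char* in) {
    return (static_cast<std::uint_least32_t>(in[0] & 0xFFu) << 24) |
           (static_cast<std::uint_least32_t>(in[1] & 0xFFu) << 16) |
           (static_cast<std::uint_least32_t>(in[2] & 0xFFu) << 8) |
           static_cast<std::uint_least32_t>(in[3] & 0xFFu);
}
```

Because only shifts and masks are involved, both ends compute the same four octets regardless of how they store integers in memory.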

But a lot of this stuff isn't really THAT big of a deal today. If you're running Windows, 99% chance you're running an x86 processor with 8-bit bytes and a 32-bit processor (or a 64-bit processor that's compatible with 32-bit).
January 16th, 2012, 07:53 AM
ahoodin

Re: Thoughts on C++ integer types

99% if you're in the States or an English-speaking nation. Otherwise you will need Unicode/multibyte chars.

Quote:

Originally Posted by Access_Denied

This is indeed a problem, one that's usually solved by the networking libraries. For instance, when programming with Unix sockets, there are functions to convert numbers to a standard network format, and then functions to convert it back to machine specific format. This way, a 10 bit byte little endian machine can communicate with an 8 bit byte big endian machine, and neither will ever be aware of the difference.

But a lot of this stuff isn't really THAT big of a deal today. If you're running Windows, 99% chance you're running an x86 processor with 8-bit bytes and a 32-bit processor (or a 64-bit processor that's compatible with 32-bit).
January 16th, 2012, 07:54 AM
ahoodin

Re: Thoughts on C++ integer types

Seems like you might be turning the corner from this stance.

Quote:

Originally Posted by Bssldr

Think of a situation like this: you need to use an 8 byte integer and if it's smaller your code doesn't work as expected. I'd rather see the compiler tell me that the target system doesn't support an 8 byte integer than have a smaller one that makes my program function incorrectly.

To overcome the problem of the target architecture not having a type of required size the compiler could output code that emulates the type by using multiple smaller types.

How long do you think this kind of thing can continue without introducing new types? One day there'll be 512 bit processors - this means that there'll be additional types of sizes 128, 256, 512 bits. Where are you going to put them? Are you just going to take the current int, long, long long and say their sizes are 128, 256, 512 bits? How are people going to use 32 and 64 bit types then? C++ should have fixed size types and new types should be introduced as the processors evolve.