One of the first things to realise about C is that a lot of important things are left as implementation details. The C standard doesn't actually define the exact size of an int, a float, or even a char for that matter; it only guarantees minimum ranges. On top of that, the notion of a simple boolean value doesn't really exist as part of its core data types.
So, a header file called stdint.h was introduced in C99, providing typedefs for common integer types with specified underlying sizes. This allows programmers to write code whose data types are independent of the architecture of the target machine, which is especially useful for a language like C that is designed to run on everything from the smallest embedded systems to the fastest supercomputers.
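A quick way to see what your own platform picks is to print the sizes directly. Here is a minimal sketch; the values in the comments are typical for a 64-bit desktop system, not guarantees:

#include <stdio.h>

int main(void) {
    printf("sizeof(char)  = %zu\n", sizeof(char));   // 1 by definition
    printf("sizeof(int)   = %zu\n", sizeof(int));    // usually 4, but implementation-defined
    printf("sizeof(long)  = %zu\n", sizeof(long));   // 4 on 64-bit Windows, 8 on most Unix-likes
    printf("sizeof(float) = %zu\n", sizeof(float));  // usually 4
    return 0;
}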
On most modern 64-bit machines, an int happens to take up 32 bits, but that is a platform convention rather than a guarantee. So, for consistency's sake, instead of using int, use int32_t; instead of char, use uint8_t, which, as the name hints, refers to an unsigned 8-bit integer, sufficient to store any possible ASCII character.
#include <stdint.h>
#include <stdlib.h>

int main(void) {
    int number_1 = 5;          // ambiguous size (depends on platform)
    int32_t number_2 = 10;     // fixed 32-bit size on all platforms
    char character_1 = 'a';    // mostly 8 bits (but on some embedded architecture, you never know!)
    uint8_t character_2 = 'b'; // fixed 8-bit size on all platforms

    size_t length = 5;
    int32_t *array = malloc(sizeof(int32_t) * length);
    if (array == NULL) {
        return 1;              // allocation failed
    }
    for (size_t i = 0; i < length; i++) {
        array[i] = i;          // size_t can hold any valid index, so i cannot go out of range
                               // (note the implicit conversion from size_t to int32_t here)
    }
    free(array);
    return 0;
}
This also extends to using size_t and bool from stddef.h and stdbool.h respectively. The latter defines bool (a macro for the built-in _Bool type) along with true and false, which expand to 1 and 0. This makes it clearer when something is meant to be a boolean, and prevents other values from slipping in by accident the way they can with something less restrictive, like an int.
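A minimal sketch of that restrictiveness in action; assigning any non-zero value to a bool collapses it to 1:

#include <stdbool.h>
#include <stdio.h>

int main(void) {
    bool flag = false;     // bool expands to the built-in _Bool type
    flag = 42;             // any non-zero value is converted to 1
    printf("%d\n", flag);  // prints 1, not 42
    return 0;
}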
size_t is where things get more interesting. Officially defined as an unsigned integer type of at least 16 bits, it is used to represent the sizes of objects. That doesn't seem very platform-independent, does it? Well, it's not supposed to be.
size_t is great at holding object sizes, and of particular note is its use in representing array lengths and, therefore, array indices. Unlike an int, which may be negative, too big, or too small, size_t is the perfect data type to use in such scenarios.
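As a short sketch, consider iterating over a string: strlen returns a size_t, so declaring the index as size_t keeps the comparison between like types, whereas an int index would draw signed/unsigned comparison warnings from most compilers:

#include <stdio.h>
#include <string.h>

int main(void) {
    const char *text = "hello";
    for (size_t i = 0; i < strlen(text); i++) {  // index type matches strlen's return type
        printf("%c", text[i]);
    }
    printf("\n");
    return 0;
}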
Another nice property of size_t is that it is meant to represent the size of any data type by design. Accordingly, the sizeof operator yields a value of type size_t.
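A small sketch of the idiom this enables: computing an element count from sizeof, and printing the results with %zu, the format specifier that size_t calls for:

#include <stdint.h>
#include <stdio.h>

int main(void) {
    int32_t values[8];
    size_t total = sizeof(values);             // size of the whole array in bytes (32 here)
    size_t count = total / sizeof(values[0]);  // element count: 8
    printf("%zu bytes, %zu elements\n", total, count);
    return 0;
}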
Top comments (3)
If a platform has a char that is not 8 bits, then the type int8_t will not exist on that platform. This is why the type int_least8_t (and the other "least" types like it) exists. Also, sizeof(char) is always 1, and the sizeof operator returns size_t exactly.

Thanks for the corrections. Not too sure about a couple of points though. sizeof(char) is always 1 from C99 onwards; I don't think it was standard before that. Plus, that notion of 1 is not necessarily a number of bytes; rather, it is the smallest "memory unit" supported by the compiler. All other results are just sizes measured as a number of chars.

According to The C Programming Language, 1st edition, page 126, sizeof has apparently always been in char-sized units. I'd guess that in most cases sizeof(char) was typically 1; it's just that the standard didn't require it until C99. Some historical supercomputers made all sizes the same for speed at the expense of memory, but sizeof(char) was still 1 even if the underlying addressable unit was 60+ bits.

Yes, using negative indices is rare, but it's possible to do safely. Look at C code generated by yacc sometime. It generates code where yyvsp is a stack pointer, yyvsp[-1] refers to the element just below the top of the stack, yyvsp[-2] to two elements below, and so on. When explaining anything about a programming language, you have to be (1) precise and (2) complete.
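A minimal sketch of that negative-index pattern, using a hypothetical value stack and a yyvsp-style pointer to its top element (the names mirror yacc's, but the code is illustrative, not actual yacc output):

#include <stdio.h>

int main(void) {
    int stack[4] = {10, 20, 30, 40};
    int *yyvsp = &stack[3];      // points at the current top of the stack

    printf("%d\n", yyvsp[0]);    // 40: the top element
    printf("%d\n", yyvsp[-1]);   // 30: one element below the top
    printf("%d\n", yyvsp[-2]);   // 20: two elements below
    return 0;
}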