Introduction
A struct is a composite data type that groups several fields into a single unit. The fields may have the same or different data types. Structs are similar to arrays in some ways: like an array, a struct is stored in a contiguous block of memory.
In C, the name of an array evaluates to the address of its first element in most expressions. Likewise, the address of a struct is the address of its first field. In an array, you can move from one element to the next by adding the size of an element to the address of the current element.
For example, you can get the second element of an integer array by adding 4 (the size of an int in bytes on most platforms) to the memory address of the first element, and you can repeat the same step to reach subsequent elements. Navigating the fields of a struct is similar. In theory, you just add the size of the first field to its memory address to get the memory address of the second field. In practice it is not that simple, because of memory alignment.
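The array case can be sketched in C as follows. This is only an illustration: the helper names `byte_distance` and `element_at` are made up for this example, and the comparison is written against `sizeof(int)` rather than a literal 4, since the size of an int depends on the platform.

```c
#include <stddef.h>

/* Hypothetical helper: number of bytes between two addresses
   that lie inside the same array. */
ptrdiff_t byte_distance(const void *from, const void *to)
{
    return (const char *)to - (const char *)from;
}

/* Fetch element `index` by pointer arithmetic instead of values[index].
   Adding 1 to an `int *` advances the address by sizeof(int) bytes. */
int element_at(const int *values, size_t index)
{
    return *(values + index);
}
```

For an array `int values[3]`, `byte_distance(&values[0], &values[1])` equals `sizeof(int)`, which is the "add the size of an element" step described above.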
Memory alignment
Memory alignment is the practice of placing data in memory at addresses that the processor can access efficiently. Data is aligned by padding it, which means adding unused bytes before or after the data so that it starts at a suitable address. Structs follow a few alignment rules, and those rules are satisfied by structure padding, so alignment and padding go hand in hand.
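As a rough sketch of what "padding to an aligned address" means, the hypothetical helper below rounds an address up to the next multiple of an alignment. The name `align_up` is an assumption made for this example and is not a standard library function.

```c
#include <stddef.h>
#include <stdint.h>

/* Round `address` up to the next multiple of `alignment` (alignment > 0).
   The gap between the input and the result is the padding that would be
   placed in front of the data. */
uintptr_t align_up(uintptr_t address, size_t alignment)
{
    uintptr_t remainder = address % alignment;
    return remainder == 0 ? address : address + (alignment - remainder);
}
```

With an alignment of 4, an address of 13 is rounded up to 16 (3 bytes of padding), while an address of 16 is left unchanged.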
Structure Padding
The memory alignment of a struct depends on its padding. Both change according to the sizes of the struct's fields. Two main rules determine the padding and the alignment of a struct. To explain them, we will use the following struct as an example:
struct example {
    char i;
    int j;
} example1;
We need to know the size of each field of the struct to determine the structure padding. The compiler knows the size of each field from its data type. The field i is 1 byte because it is a char. The field j is 4 bytes because it is an int.
The offset of a field is the number of bytes between the start of the struct (the memory address of its first field) and the memory address of that field. In the struct above, if the fields were stored back to back, the offset of j would be 1, because only the 1 byte occupied by i lies between the start of the struct and j.
The first rule of structure padding requires the offset of a field to be divisible by the size of that field. In the struct above, the offset of j must be divisible by its size (4 bytes). An offset of 1 is not divisible by 4, so padding has to be added between i and j. Adding 3 bytes of padding moves j to offset 4, which is divisible by 4. So there are 3 bytes of padding between i and j.
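The first rule can be checked with a short C sketch. The helper names `padding_before` and `offset_of_j` are hypothetical and exist only for this example. `offsetof` is the standard macro from `<stddef.h>`. The result of 4 for j assumes a 4-byte, 4-byte-aligned int, which is typical but not guaranteed.

```c
#include <stddef.h>  /* offsetof, size_t */

struct example {
    char i;
    int j;
};

/* Padding bytes needed so that `offset` becomes a multiple of `field_size`. */
size_t padding_before(size_t offset, size_t field_size)
{
    size_t remainder = offset % field_size;
    return remainder == 0 ? 0 : field_size - remainder;
}

/* Offset at which the compiler actually placed j inside struct example. */
size_t offset_of_j(void)
{
    return offsetof(struct example, j);
}
```

Here `padding_before(1, 4)` gives 3, the same 3 bytes worked out above, and on a typical platform `offset_of_j()` reports 4.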
The second rule of structure padding requires the size of the struct to be divisible by the size of the largest field of the struct. For example, consider the following struct:
struct dog {
    struct person *owner;
    int age;
} bruno;
The first field of the above struct is a pointer to another struct, so its size is 8 bytes on a 64-bit system. The size of the second field is 4 bytes because it is an integer. No padding is needed between owner and age because the offset of age is 8 bytes, which is divisible by its size of 4.
The second rule requires us to add padding at the end of the struct. The unpadded size of the struct (8 + 4 = 12 bytes) is not divisible by the size of its largest field (8 bytes). To make the size divisible by 8, we need to add 4 bytes of padding at the end. The new size, 16 bytes, is divisible by 8 and therefore satisfies the second rule of structure padding.
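Both rules can be combined into a small size calculator. This is a simplified sketch, not how any particular compiler is implemented: it assumes every field's alignment equals its size, which holds for the scalar fields used in this article, and the function names `round_up` and `padded_struct_size` are made up for illustration.

```c
#include <stddef.h>

/* Round `value` up to the next multiple of `multiple` (multiple > 0). */
size_t round_up(size_t value, size_t multiple)
{
    return (value + multiple - 1) / multiple * multiple;
}

/* Apply the two padding rules to a list of field sizes (in declaration
   order) and return the padded size of the struct. */
size_t padded_struct_size(const size_t *field_sizes, size_t field_count)
{
    size_t offset = 0;
    size_t largest = 1;
    for (size_t k = 0; k < field_count; k++) {
        /* Rule 1: the field's offset must be divisible by its size. */
        offset = round_up(offset, field_sizes[k]);
        offset += field_sizes[k];
        if (field_sizes[k] > largest)
            largest = field_sizes[k];
    }
    /* Rule 2: the total size must be divisible by the largest field. */
    return round_up(offset, largest);
}
```

For field sizes {8, 4}, as in the dog struct, the function returns 16. For {1, 4}, as in the first example struct, it returns 8. The compiler's own answer is available through `sizeof`.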
Aligned structs
Data needs to be aligned properly for the processor to work efficiently. If you have an 8-byte field in a struct, it should be placed at a memory address divisible by 8. If it is not, the data stored in that field is called misaligned. Reading misaligned data can be slower, and some processors do not support misaligned accesses at all, in which case the program may crash.
The first rule of structure padding requires the offset of the 8-byte field to be divisible by 8, and you might need to add padding to achieve that. Suppose there is an integer before the 8-byte field. You would add 4 bytes of padding after the integer to make the offset of the 8-byte field divisible by 8. An offset divisible by 8 only yields an address divisible by 8 if the struct itself starts at such an address, so the integer, as the first field, must also sit at a memory address divisible by 8.
So, in the process of aligning a field at a memory address divisible by its size, you have also aligned the struct: it is stored at a memory address divisible by its alignment (the size of the largest field of the struct). A struct is considered to be stored at a memory address divisible by 8 when its first field is stored at a memory address divisible by 8.
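A minimal sketch of how this could be checked at run time is shown below. The struct `record` and the helpers `is_aligned` and `record_is_aligned` are hypothetical names invented for this example. The checks use `_Alignof` (C11) instead of a literal 8, because the alignment of an 8-byte integer is 8 on most 64-bit platforms but can be smaller elsewhere.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical struct: an int followed by an 8-byte field. */
struct record {
    int count;
    int64_t total;
};

/* Returns 1 when `pointer` holds an address divisible by `alignment`. */
int is_aligned(const void *pointer, size_t alignment)
{
    return (uintptr_t)pointer % alignment == 0;
}

/* Returns 1 when the struct and its 8-byte field both sit at addresses
   that satisfy their alignment requirements. */
int record_is_aligned(const struct record *r)
{
    return is_aligned(r, _Alignof(struct record))
        && is_aligned(&r->total, _Alignof(int64_t));
}
```

For any `struct record` object created by the compiler, `record_is_aligned` is expected to return 1, since the compiler inserts the padding after `count` and places the whole struct at a suitably aligned address.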
Advantages
Memory alignment has several advantages. Most processors work best on aligned data. Alignment in structs is especially useful because it prevents a field, or the whole struct, from being stored across two memory blocks. Fetching data from multiple memory blocks is inefficient for the processor.
Storing data across two memory blocks can also split a value. Suppose you store an array of structs in which each struct consists of a pointer and an integer, and the struct is left unpadded, so it has a size of 12 bytes instead of 16. The elements then start at offsets 0, 12, 24, 36 and so on from the start of the array. Every second element starts at an offset that is not divisible by 8, so its pointer is misaligned and may be split across two memory blocks.
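The effect of an unpadded element size on an array can be sketched with plain arithmetic. The function name `misaligned_elements` is made up for this example, and the sketch only counts offsets. It does not create a packed struct, since doing so relies on compiler-specific extensions.

```c
#include <stddef.h>

/* Count the array elements whose starting offset is not a multiple of
   `alignment`, for `count` elements stored back to back with
   `element_size` bytes each. The array is assumed to start at an aligned
   address, so offsets can stand in for addresses. */
size_t misaligned_elements(size_t element_size, size_t alignment, size_t count)
{
    size_t misaligned = 0;
    for (size_t k = 0; k < count; k++) {
        if ((k * element_size) % alignment != 0)
            misaligned++;
    }
    return misaligned;
}
```

With an element size of 12 and an alignment of 8, two of the first four elements start at a misaligned offset (12 and 36). With the padded size of 16, none do, which is why the second padding rule matters for arrays of structs.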
Bibliography
- A reddit post about structs in r/computerscience
- Special thanks to u/AuntieSauce, u/Poddster and u/JojoModding who helped me out in understanding the internal workings of structs