Vasyl Latushko

Posted on Aug 30

Understanding Arrays: A Look Inside

#datastructures

The array is the most basic data structure and is widely used in programming. It is a data structure that is a basis for many other higher-level ones. For instance, the Stack<T>, Queue<T>, List<T>, StringBuilder classes in .NET are all using an array under the hood. The proof can be found on the .NET standard source code page.

That's why I think it's so important to learn more about the data structure. So, let's dive deep into it and learn its properties:

static size
index-based elements access
contiguous memory storage
homogeneous elements

Static Size

The static size of an array means that you cannot resize it at runtime without re-creating a new array (allocating a new chunk of memory to store more elements). This is why the Array is not as convenient to use as the List<T> in .NET or std::vector<T> in C++.

So, if you run out of space in the array you would need to create a new one with a bigger size and then copy each element from the old array to a new one which is going to be a complex operation. So, please, be mindful when choosing the size of an array or using the higher-level data structure mentioned above.

Index-Based Elements Access

The reason arrays are so fast is that the element location in memory can be calculated at runtime using the formula:

elementLocation = baseIndex + (index * sizeof(type))

In C++ adding 1 to an address adds the amount of bytes a particular type holds. For instance, sizeof(int) equals 4 on my system and therefore adding one to the address adds 4 bytes and results in 0xe6ddffa80 + sizeof(int) = 0xe6ddffa84

Below is a C++ snippet demonstrating it:

#include <iostream>  

using namespace std;  

int main() {
     //initialize a new array
    int array[5]{1, 1, 2, 3, 5};
    int* baseElementAddress = array;

     cout << "sizeof(int): " << sizeof(int) << endl;
     cout << "sizeof(array[0]): " << sizeof(array[0]) << endl << endl;

    for (int i = 0; i < size(array); ++i) {
        //&array[0] = baseElementAddress = array
        cout << array[i] << " at: " << baseElementAddress + (i * 1) << endl;
     }     

    return 0; 
}

Output:

sizeof(int): 4 
sizeof(array[0]): 4

1 at: 0xe6ddffa80
1 at: 0xe6ddffa84 
2 at: 0xe6ddffa88 
3 at: 0xe6ddffa8c 
5 at: 0xe6ddffa90

The code snippet above simply defines an array with 5 integer values and then iterates through it and prints the value stored at a particular address. That is to demonstrate how the data is stored contiguously in memory.

Contiguous Memory Storage

Below is a visual diagram showing the contiguous memory storage of the data in the array from the previous section:

In the diagram above you can see that each element has the:

index (from 0 to 4)
value in each cell
address

At this point, you should be able to visualize how values are stored in an array.

Homogeneous Elements

The Homogeneous property of an array ensures predictability when accessing elements of the array and operating on them. By predictability, I mean that it's easy to calculate a location address of a given element because the size of each element is uniform and we know the base address. So, using the formula above we can easily find a needed element.

Technically, it's possible to implement an array such that it allowed storing elements of different sizes, but that introduces either complexity of managing such an array or wasted space.

The complexity would be that we would need to define some metadata for each element which would provide us with information on how much memory this particular element occupies so that the location of an element could be calculated. This is additional overhead and therefore is more complex to manage.

The space could also be wasted in the case of defining an array of structures that can hold different kinds of data types. This way, you allocate memory for all of the elements in total and therefore you can store values of different types. But this is not the best practice but rather a waste of memory which is not good.

Conclusion

In conclusion, I'd like to emphasize that understanding how arrays function and store elements are essential, as it enables optimal use of computational power, ultimately making your program run faster. Since arrays are a fundamental data structure, mastering them means that all other data structures will be easier to comprehend and work with.

Please stay tuned, as I will explain other data structures to both myself and others :)

Additionally, don't forget to leave your comments below so we can make this article even more useful!

DEV Community

Understanding Arrays: A Look Inside

Static Size

Index-Based Elements Access

Contiguous Memory Storage

Homogeneous Elements

Conclusion

Top comments (0)

Read next

CRISP-DM: The Essential Methodology for Structuring Your Data Science Projects

Tree data structures in Rust with tree-ds (#1: Getting Started)

Day 3: Introduction to Arrays 📋

Day 1: Introduction to Data Structures and Algorithms (DSA)🚀