You're probably thinking
"Arrays, isn't that some super basic shit? I use em all the time!"
Your right, but apart from the "arrays are used to store multiple things?"
Do you actually know how it works under the hood?
If you got any doubts, no need to worry.
It's a lot simpler than you think.
So without further ado, this is what we are going to cover today.
- What is an array?
- How are arrays stored in memory?
- Advantages of using arrays
- Additional References
As programmers, our first instinct is to google stuff.
If we google "what is an array?"
We will get something like this:
Okay, this has a lot of jargon words, let's break this down:
- An array is a data structure
- Collection of elements
- Identified by an index
So I guess from this we can conclude that:
An array is simply a container that contains a bunch of elements, which are indexable.
This is a very valid question, it basically means that each element can be referenced with a number.
But if you didn't get that let me show you an example:
Here we initialized an array with 5 elements (56, 50, 34, 24, 12) called
scores, and as you can see each element has its own reference or index but in programming, we always start counting from 0. Hence, it is called zero-based indexing.
With this we can easily access the elements:
scores = 56
scores = 50
scores = 34
Another property that arrays have is that the elements must be of the same type. For example for an integer array, all elements must be integers. Likewise for string, chars, etc...
You may say "That's wrong, I've used different types of elements in my arrays before".
The issue is that when we are studying data structures, it is language-specific (C, C++, Java).
Other programming languages have different data structures.
In general, there are two types of arrays:
- Static Arrays - Arrays with a fixed size.
- Dynamic Arrays - Arrays with a dynamic size (size can change).
- Multi-Dimensional Arrays - Arrays inside arrays (nested arrays)
Let's imagine we initialized an array that looks like this:
This is a static integer array that has four elements, but how does it actually work?
- The compiler will see that we need to create an array of integers, it needs to allocate enough memory to support four integers.
- But how much memory do four integers take, well through the power of the internet we know that an
intis 32 bits or 4 bytes.
- We have four integers so it will need 16 bytes to store all four integers
(4 * 4 = 16)
Before we move on, I have to explain how memory (RAM) gets allocated, it's pretty complicated but to simplify things imagine a series of blocks and each block holds one byte (8 bits) within this block you can store your information. Another thing is that each block has a special identification number.
Okay, I hope you got the gist of it, let's go back to arrays.
- We allocate 16 blocks in memory to save 16 bytes.
As you can see above, the data is sequential. First comes the first integer then directly after the second, same thing with the third and fourth elements.
The reason why we cannot add a fifth element is that memory blocks 26→29 may be used up by some other variable, and as we know all data in arrays must be sequential. Hence it's a static array.
In an array the values are close together in memory, this makes it very easily accessible from the CPU to the cache. This means that iteration in an array is much faster than any other iteration.
In both hash tables and arrays, the access time is provided. However, in hash tables, it's a bit more complicated. While for arrays, the system is well aware of the precise address of the array and wherein memory is allocated and stored. Hence accessing arrays is not only fast but also predictable.
In comparison to linked lists and hash tables, arrays are just much easier to debug. It can be directly traversed with the index position.
An array requires memory space only for the values, the start address, and its length. On the contrary, a linked list needs a pointer for every value which is inserted. It acquires memory for every address and also when extra data is inserted it also needs memory for the same. The Hash table also needs memory depending on how it is implemented. This implementation decides how memory is allocated and usually, it requires extra allocation.
An array is considered to be a homogenous collection of data. It helps store multiple values of similar types. For any purpose, if the user wishes to store multiple values of a similar type, an array is the best option that can be used. As a result for any purpose if a user wishes to store multiple values of a similar type then arrays can be used and utilized efficiently.
An array is also a collection of data that stores data of the same type and in a sequential manner. As this data is stored in a sequential manner it is efficient to track it by using just its index values. This is not easy when taken into consideration the non-sequential data structures. In these cases every time you need to traverse to a particular desired position and then access its value.
One of the major advantages of an array is that it can be declared once and reused multiple times. It represents multiple values by making use of a single variable. This helps in improvement of reusability of code and also improves the readability of the code. If in this situation no array is used then we will need to store multiple values in multiple variables.
These can be defined as an array of arrays. Data that is present in a tabular format like 1D, 2D, etc. can be defined. The total number of elements can be stored in the multi-dimensional array and can be calculated by multiplying the size of all dimensions.
- Accessing: because we know the index of the element, accessing a value is always in constant time.
- Searching - the bigger the arrays is more we will have to iterate over making its linear time.
- Insertion - you cannot insert a new value into static arrays, while in dynamic arrays you can but your gonna have to change the indexes of each element. Hence making its linear time.
- Appending - does not apply to static arrays, but in dynamic arrays, it's in constant time because we already know the last element of the array, and we just have to add another element after it.
- Deletion - does not apply to static arrays, but in dynamic arrays, it's similar to insertion because of the change of indexes. Hence making its linear time.
This has been the second part, of the data structures series. Next time we will talk about linked-lists. If you have any questions feel free to leave them down below and I will answer them as best as I can.