In high-level languages like Python and JavaScript, we are pretty used to initializing strings and arrays like this:
let myStr = ""; // an empty string
let myArr = []; // an empty array
We're saying that these are empty, which implies they contain zero bytes of data. So we're storing zero bytes of data? Huh? Tf is this paradoxical nonsense?
Storing Nothing
An empty array or string doesn't contain any elements or characters, yet it exists. This can feel as paradoxical as carrying an invisible suitcase that weighs nothing. So how do we make sense of this? (drumroll) Actually, it's NOT really empty.
Strings: The C Language Perspective
To unravel this mystery, let's take a trip to the world of C programming. In C, strings are typically handled as arrays of characters ending with a null character ('\0'). But what happens when you want an empty string?
When you create an empty string in C, you are essentially allocating a character array whose only element is the null character. This is akin to reserving a seat at a table for a ghost. The chair is there, the place is set, but nobody's coming to dinner. Reminds me of my last date.
Here's a simple representation in C:
char emptyString[1] = "";
In this example, emptyString is an array with a single element (notice the [1] in the declaration), which is the null terminator. The length of the string is zero, but the array emptyString itself is very much real and occupies memory. It's like having an address to an empty house.
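To make the "empty but not really empty" point concrete, here is a minimal sketch that measures the array. The two helper names (empty_string_size and empty_string_first_byte) are made up for illustration and are not part of any library.

```c
#include <stddef.h>

/* Hypothetical helper: how many bytes does the empty string's array occupy? */
size_t empty_string_size(void) {
    char emptyString[1] = "";
    return sizeof emptyString;   /* 1: room for the null terminator only */
}

/* Hypothetical helper: what is stored in that single byte? */
char empty_string_first_byte(void) {
    char emptyString[1] = "";
    return emptyString[0];       /* '\0', the null terminator */
}
```

Calling empty_string_size() gives 1, and empty_string_first_byte() gives '\0'. So the "empty" string costs one byte of storage even though it holds zero characters.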
In C, the length of a string is determined by iterating through each character until the null terminator ('\0') is encountered. This is exactly what the strlen() function does. It counts characters until it finds the null byte, excluding the terminator itself.
From the strlen(3) man page: "The strlen() function calculates the length of the string pointed to by s, excluding the terminating null byte ('\0')."
So strlen(emptyString) will start the counter with 0 but immediately reach the null byte, resulting in a length of 0.
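The counting loop described above can be sketched in a few lines. This is an illustrative version under the name my_strlen (a made-up name), not how any particular C library actually implements strlen(), since real implementations are usually heavily optimized.

```c
#include <stddef.h>

/* Hypothetical stand-in for strlen(): count bytes until the null terminator. */
size_t my_strlen(const char *s) {
    size_t count = 0;            /* the counter starts at 0 */
    while (s[count] != '\0') {   /* stop as soon as we see the terminator */
        count++;
    }
    return count;                /* the terminator itself is not counted */
}
```

For an empty string the loop condition fails on the very first check, so my_strlen("") returns 0, while my_strlen("hi") returns 2.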
Zero-Length Arrays
In C, arrays are typically fixed in size once declared. So first let's C (ok sorry, see) how we can implement a dynamic or growable array of integers using a struct:
struct IntArray {
    int *data;    // pointer to the array's data
    int length;   // current number of elements in the array
    int capacity; // maximum capacity of the array
};
Here, IntArray is a C struct that manages an array of integers. It consists of:
- data: A pointer to the array's data (which can dynamically grow).
- length: The current number of elements in the array.
- capacity: The maximum capacity of the array, indicating how many elements can be stored before needing to resize.
Unlike traditional fixed-size arrays, instances of IntArray can dynamically adjust their size as elements are added or removed, akin to how arrays work in higher-level languages like Python or JavaScript. However, in C, this flexibility comes with added responsibility. Developers must manually manage memory allocation and resizing. They need to track the array's length when pushing or popping elements.
When an IntArray is initialized, enough memory is allocated to hold capacity elements (that is, capacity * sizeof(int) bytes), and the address of the allocated memory is stored in the data pointer. The length field is then set to zero. This process is analogous to what higher-level languages like JavaScript do internally when you create an empty array.
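Here is a rough sketch of what initializing and pushing could look like. The function names (int_array_init, int_array_push, int_array_free) and the "double the capacity when full" growth policy are my own assumptions for illustration. They are not from a standard library, and real language runtimes use their own strategies.

```c
#include <stdlib.h>

struct IntArray {
    int *data;    // pointer to the array's data
    int length;   // current number of elements in the array
    int capacity; // how many elements fit before a resize is needed
};

/* Hypothetical initializer: returns 0 on success, -1 on failure. */
int int_array_init(struct IntArray *arr, int capacity) {
    if (capacity <= 0) return -1;
    arr->data = malloc((size_t)capacity * sizeof(int));
    if (arr->data == NULL) return -1;
    arr->length = 0;              /* "empty", yet memory is already reserved */
    arr->capacity = capacity;
    return 0;
}

/* Hypothetical push: doubles the capacity when full (an assumed policy). */
int int_array_push(struct IntArray *arr, int value) {
    if (arr->length == arr->capacity) {
        int new_capacity = arr->capacity * 2;
        int *grown = realloc(arr->data, (size_t)new_capacity * sizeof(int));
        if (grown == NULL) return -1;   /* the old buffer is still valid */
        arr->data = grown;
        arr->capacity = new_capacity;
    }
    arr->data[arr->length++] = value;
    return 0;
}

/* Hypothetical cleanup: release the buffer and reset the fields. */
void int_array_free(struct IntArray *arr) {
    free(arr->data);
    arr->data = NULL;
    arr->length = 0;
    arr->capacity = 0;
}
```

Right after int_array_init(&arr, 2), the length is 0 but the buffer for two ints already exists. That is the same "empty but not nothing" situation as the empty string. Pushing a third element triggers the resize and bumps the capacity to 4.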
Why This Matters
Knowing that an empty string in C is just an array holding a single '\0' character feels good. Curiosity is reason enough! Here's what I tweeted today:
go underneath those abstractions anon, and you'll experience wonders.
Top comments (3)
It's an array — not a pointer. Indeed, in your previous sentence, you correctly wrote "array."
This section is about implementing a dynamically allocated, automatically growing array — so why is the section title "Zero-Length Arrays?"
C doesn't even allow you to have zero-length arrays.
fixed.
well the idea of the blog was to target js/python devs, and in javascript, arrays are growable. so yeah technically array isn't the correct term. it's vector, but mehh doesn't matter much ig.
i was initially gonna mention this in the post, but i came across this thing on stackoverflow which i myself didn't understand much so i left it out.
thanks 🫡
That's a flexible array member and is a special case. See here.