Sometimes we make silly mistakes. We have a brain fart or just totally forget something. A few weeks ago I had a mock technical interview to help prepare me for the adventure that is the job search. The Senior Engineer who was interviewing me presented me with a problem to solve. After making sure I understood the problem, I began to code out my solution. Now, I’m not really going to talk so much about my solution or the refactor that followed (stay tuned for a future blog post about Big O!), but I would like to focus on a simple question that I managed to get tripped up on.
While it may seem super obvious for many of you reading, the question I got wrong was “tell me a data structure.” What did I do wrong? I started naming data types. Now, maybe you've been in the computer science/coding world for a while and the answer is super obvious. Maybe you’re just starting out and you’ve never even heard of a “data structure” or “data type.” Either way, I want to quickly talk about what each one is and officially clear this one up.
Essentially, a data type is... exactly what it sounds like. It is the attribute of a piece of data that tells us what we are working with and how we're going to use it. A piece of data's type is what allows us, as programmers, developers, and engineers, to understand the difference between 8, "8", and 8.0. What's the difference there, you ask?
8 is an integer. An integer is a whole number that we can use in a whole slew of ways. Computers understand numbers well, and we can use integers to perform a calculation (one that results in a whole number), to be specific about how many times to run a loop, to find the index position of something in an array, or what-have-you.
"8" is a string. Strings are pieces of text enclosed in quotation marks. These are not to be confused with the text type you'll find in some databases and frameworks (Rails column types, for example), which allows more characters than a string column does.
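8.0 is a float (or floating point number). Unlike an integer, which is a whole number, a float has a decimal point and can represent fractional values, which lets it be more exact when a whole number won't do (keeping in mind that floats are stored with limited precision).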
Each of those is an example of a unique data type that is used differently, even if, on the surface, they look like they might all be the same data. Data can exist in a whole bunch of different ways and, depending on the coding language that you use, you may have access to a large variety of data types, including the ones above and others (shout out to booleans (true or false)!).
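To make that concrete, here is a minimal sketch in Ruby (my assumption for the language, on Ruby 2.4 or newer where whole numbers report their class as Integer). The variable names are made up for illustration; the point is just that the same-looking value behaves differently depending on its type.

```ruby
# Three values that look alike but have different types.
whole   = 8      # Integer
text    = "8"    # String
decimal = 8.0    # Float

puts whole.class    # Integer
puts text.class     # String
puts decimal.class  # Float

# The type decides what an operation means.
puts whole + 2      # 10, integer addition
puts text + "2"     # 82, string concatenation (the result is the string "82")
puts whole / 3      # 2, integer division drops the remainder
puts decimal / 3    # a fractional result, because one side is a float

# Converting between types is something we ask for explicitly.
puts text.to_i + whole    # 16

# Same numeric value, but not the same type.
puts whole == decimal     # true
puts whole.eql?(decimal)  # false
```

Notice that "8" + "2" gives "82" rather than 10. That is exactly the kind of surprise knowing your data types helps you avoid.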
Ah, the crux of this post. The white whale that eluded me during my mock interview...
Wikipedia defines a data structure as "a data organization, management, and storage format that enables efficient access and modification." To clarify, while a data's type establishes how we can use it, a data structure is a way of organizing and/or storing data. Some examples of data structures include:
Arrays -- containers for collections of data. You can think of an array as a list in code form. They are wrapped in square brackets [ ] and, in languages like Ruby, can contain any data types in any combination. We can name arrays and have the ability to access, add to, or change data stored in an array. It is also worth noting that data stored in an array is stored by number (starting at 0), as opposed to...
Hashes -- Hashes are similar to arrays, but rather than storing data by number, hashes store data by name. Hashes use key-value pairs and require unique keys. Because of this, data stored in a hash is associative, meaning that each individual key points to data that is specifically related to that key. Depending on the language, hashes vary slightly and are known by different names (Python calls them dictionaries, for example), though they are perhaps most commonly seen as hash tables. (Languages like Ruby and Python have built-in hash support, but not every language does.)
Stacks & Queues -- Stacks and queues are what I would call "cousin data structures." They both take in data and store it sequentially. The difference between them is in how we get data back out. A stack takes in data and stacks it up, and to reach a particular item we have to go through everything that was added after it, starting with the most recent. So, if something is added at the very beginning of a stack and then a bunch more is added, everything added since has to be taken off first (from most recent onward) before we can access the data from the beginning. This principle is known as FILO (First In Last Out), which you will also see called LIFO (Last In First Out). Queues, on the other hand, work via a principle known as FIFO (First In First Out). If we added something to the beginning of a queue and then needed it after adding a bunch of other data, we could access that first item immediately, rather than having to take everything else off first as we would with a stack. I think of queues like being on a flight and seeing the folks in first class getting to board and disembark first... although maybe that's just because it's been so long since I've been able to fly anywhere... (#covidtimes)
Sets -- Sets store unique values and can be very helpful for finding out if something belongs to a set of values or if you just want to learn how many unique values exist in a given set of data (for example, you can convert an array to a set and call .size on it).
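Here is a rough sketch of each of those structures in Ruby (again, my assumption for the language). All of the variable names and sample values are invented for illustration. Ruby does not have a dedicated stack or simple queue class that you would normally reach for, so this sketch uses a plain array for both, which is a common shortcut rather than the only way to do it.

```ruby
require "set"

# Array: ordered, accessed by number, starting at 0. Mixed types are allowed.
fruits = ["apple", "banana", 3, 4.5]
fruits << "cherry"           # add to the end
first_fruit = fruits[0]      # "apple"

# Hash: accessed by a unique key (a name) instead of a position.
ages = { "ada" => 36, "alan" => 41 }
ages["grace"] = 85           # add a new key-value pair
ada_age = ages["ada"]        # 36

# Stack: the last item in is the first one out.
stack = []
stack.push(:a)
stack.push(:b)
stack.push(:c)
top = stack.pop              # :c, the most recently added item

# Queue: the first item in is the first one out.
queue = []
queue.push(:a)
queue.push(:b)
queue.push(:c)
front = queue.shift          # :a, the earliest added item

# Set: unique values only, so duplicates disappear.
visits = ["home", "about", "home", "contact", "about"]
unique_pages = visits.to_set
page_count = unique_pages.size               # 3
seen_about = unique_pages.include?("about")  # true
```

The stack and the queue are built from the same three pushes. The only difference is which end we take from: pop removes from the back (most recent), while shift removes from the front (oldest).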
Anyway, there are a bunch of other data structures as well (e.g. graphs and trees), and perhaps I'll get a bit more into them in a future post, but I hope this was helpful for you and that you feel more prepared for the next time someone asks you about data types or structures.