DEV Community

Thuy Doan
Thuy Doan

Posted on

Big O Notation as a Mid-Level Developer Who Has Been Avoiding It Since Bootcamp: Arrays and Time Complexity

At the beginning of this year, I was promoted to intermediate developer 🎊

At your company, that might be an IC2 - or whichever level is after your entry level developer, but right before the senior developer. In any case, I was now at a place in my career where computer science fundamentals needed to be stronger compared to the beginning when I could just throw myself into building things with what I learned in full-stack Javascript bootcamp.

I decided I needed to better understand data structures and be more comfortable with algorithms. Not because I wanted to leetcode more. I really don't want to leetcode more. But I couldn't shake the feeling that I would be better off if I understood more why data structure A over data structure B.

So I reached out to a friend for help and this is what I've learned 🤓

What did I know about Big O notation?

My mental model of Big O has always been this:

1) A unit of measurement
2) Related to computer science that
3) Describes the complexity of things

From here, I needed to understand why? 💭

Why must we measure the complexity of things?

As developers, we deal with data.

Sometimes not very much of it, like on a static website. Sometimes a whole lot of it. The multi-millions of users kind. And most of the time, that data is not in a format that we need and we need to manipulate it. Sort it, filter it, or find something. Sometimes we even need to change it into an entirely different format! And how efficiently we do that matters at scale.

What's also true is that there are many ways to solve a problem. This is especially true in programming. You can then think of Big O notation as a way to describe how efficient a solution is relative to another one.

What types of Big O notation are there?

In this post, we'll focus just on the types that apply to arrays but know there are a number of them that you can see below:

Graph depicting different big o notation time complexities with O(n!), O(2^n), O(n^2) as horrible; O(n log n) as bad; O(n) as fair; and O(log n) or O(1) between good and excellent

Source: Big O Cheatsheet

For arrays, you can have 2 types of time complexities (or Big O):

1) Constant time or O(1)
2) Linear time or O(n)

Table with two columns with headings: Operation and Time Complexity, respectively. Under Operation, it lists 4 items: Searching, Access, Insertion, and Removal. Under Time Complexity, we have O(N) or linear time for searching; O(1) or constant time for access; O(N) for insertion unless it's inserting something at the end of the array, in which case it is O(1) or constant time; and lastly, O(N) for removal unless removing something at the end of the array, in which case it is also O(1) or constant time.

Source: Big O Notation for Arrays by KodinKevin on YouTube

With Big O, the n refers to the amount of data you are working with.

Practical Examples

Example A. Kanto Starter Pokemon

Kanto Starter Pokemon. There's a cute orange lizard, a grassy type of creature that has a flower bulb and walks on all fours, and a blue turtle that looks cheerful.

Let's say you're building a Pokemon app and you have an array of Pokemon.

const kantoStarters = ['Charmander', 'Bulbasaur', 'Squirtle']
Enter fullscreen mode Exit fullscreen mode

If you know the index of Squirtle in the array, you can access it by simply doing kantoStarters[index]. If this was instead an array of all 151 Kanto Pokemon, the number of steps it takes to access a Pokemon at a known index will be the same as when there were only 3 starter Pokemon because you can go directly to the index of the Pokemon. Hence, access in an array is considered constant time - also known as O(1).

Because constant time takes the least number of steps to complete an operation, it is considered the most efficient. Check that first graph out again!

Example B. All Kanto Pokemon

Let's say instead of knowing where exactly to look for a Pokemon in an array, we have to flip through it like a clothing rack at the mall or files in a filing cabinet. In this case, it would take at worst as many steps as there are Pokemon. Remember that n in Big O notation stands for the amount of data we are working with. So should we have to look through an unordered array of all 151 Pokemon to find a Psyduck it would take us O(n) steps. This is called linear time because given more data we take proportionately more steps.

Animation of flipping back and forth through a book

At this point, since constant time - or O(1) - takes a constant amount of steps, no matter the amount of data versus linear time - or O(n) - which takes proportionately more steps when given more data, we can say that constant time is faster or more efficient than linear time 💨

Example C. It Depends

Once we move into insertion or removal of data into an array, it gets a little nuanced. Let's say we create a new type of Pikachu that wears a coloured party hat (think Nintendo 64 Super Smash Bros) and we wanted to officially recognize it as a Kanto Pokemon: Party Pikachu. If we add Party Pikachu to the end of the list of Pokemon, that would only take one step. Hence, insertion at the end of arrays is constant time - or O(1). The same goes for removal.

4 versions of Pikachu wearing different coloured party hats

It's different, however, if we're trying to insert or remove an item from any other place in the array. Why? If we added Party Pikachu to the beginning, all the indices of the Pokemon after it would have to change because the order of Pokemon is now different. This also applies if Party Pikachu were to be added in the middle of the list. We would have to take as many steps as the number of Pokemon that come after it to change the indices to the new ones. Hence, insertion or removal anywhere but the end is linear time - or O(n).

const originalKantoPokemon = ['Bulbasaur', 'Ivysaur', 'Venusaur'] // and so on
// Where Bulbasaur is index 0

const newKantoPokemon = ['Party Pikachu', 'Bulbasaur', 'Ivysaur'] // and so on
// Where Bulbasaur is now index 1
Enter fullscreen mode Exit fullscreen mode

Career Value

You might be thinking, "That's great and all but why do I need to know this?" That's fair. I've been able to have a successful last 4-5 years as a developer without it. Heck, I even got promoted. But there's two possible reasons:

1) You want to get hired at a company that does leetcode.

FAANG companies - also known as Facebook, Amazon, Apple, Netflix, and Google - or similar, are infamous for testing leetcode, algorithms, and data structures in their interview process. If you want to get hired by them, you need to be able to reference Big O when you write an algorithmic solution.

2) You need to come up with efficient solutions.

Even if you avoid interviewing for companies that do leetcode, you will still have to work with data. And unless you can always work with a small amount of data, how performant the solutions you write to handle data will be important. Especially as you become a more senior engineer.

(This will become more apparent as I continue this series by moving into showing actual algorithms. Follow me and stay tuned!)

I'm personally in the second boat but I've since been opening myself up to the idea of the first one. Let's get better first then we'll see 🤡

Onward

I was the kind of kid who was, for all intents and purposes, intelligent but didn't identify with being good at STEM subjects despite being an honour roll student throughout my education. Heck, my favourite subject was music. But at some point, you hit a wall that makes you realize your work could go much more smoothly if you deepened your knowledge in a particular area 🚀

My goal is to be able to confidently answer why we should store data a certain way (i.e. dictionary vs. list) or traverse large amounts of data in a certain way, no matter if I'm being asked in an interview or if I simply have to complete a task for a job I'm currently employed for 💃🏻

You can think of what we discussed so far as the building blocks for choosing between multiple ways of handling data. If we know that searching through an array is linear time and we later find out that there's an alternate solution for searching through data that is constant time, which is faster, we might want to use the latter solution. However, there's other things to weigh, like readability and maintainability. More on that another time.

I'll keep learning and be sure to share more 😬

Off to study linked lists!

Keep it candid,

Thuy 🙋🏻‍♀️

Note: This post focuses more on practical examples than it does on mathematical visuals. This is because not everyone will understand Big O with mathematical graphs. But if you are someone that will, I recommend this.

Discussion (8)

Collapse
aminmansuri profile image
hidden_dude

We don't always need things to be super duper fast.

Often times one may prefer an O(N log N) algorithm over an O(N) one because the code will be simpler. For example, in some cases you can avoid sorting things, but doing so will make your code 5x longer and more complex.

However, in general we want to avoid O(N^2) type stuff such as this Java code:

String s = "";
for (int i = 0; i< N; i++) {
    s += "something";
}
Enter fullscreen mode Exit fullscreen mode

This sort of thing can appear in our code by mistake and can really create a lot of slow downs. And sometimes we say: "ah.. no big deal.. I'll just make it search through the list every time.. how long can that list really be?"

And bad things can happen. For example:

while (... ) :

       for x in list: 
            if x in otherList :
                 // something
Enter fullscreen mode Exit fullscreen mode

Adding a line to convert otherList into a set would make this algorithm linear instead of quadratic.

Rarely, it really really does make sense to optimize it to as fast as possible. But that is generally when we're optimizing some process that takes minutes or hours, and we really want to bring that down as much as possible. Or if we're doing something realtime.

Fortunately, often you can solve things by just using a dictionary (or hash map) or a set (or hash set) to speed up your code a lot just using the datastructures your library provides.

So as a rule of thumb:

  1. don't do obviously slow things (like I showed above)
  2. keep your code simple (you don't have to use a heap each time or some fancy tree that will make your code too complex)
  3. Use built in tools like hash maps to speed up your code
  4. When it really really really matters, get super fancy but make sure you really need to go there
Collapse
aminmansuri profile image
hidden_dude

Leetcode can be expensive to build and expensive to maintain. So make sure it's worth it.

Also, in the real world if you're dealing with tons of data, often you delegate that hard work to the database. So writing a super fast SQL query can be more important that importing it all in RAM and running some fancy binomial queue based super treap solution. ;)

Collapse
clearlythuydoan profile image
Thuy Doan Author

We don't always need things to be super duper fast.

I had mentioned that there are other factors to consider like readability and maintainability so definitely agree with this! "It depends" hehe.

Collapse
anthonydmays profile image
Anthony D. Mays

Well done! This is a great intro to Big O!

One thing I noticed is that you said adding to an array is constant time. Not quite correct since arrays in JavaScript really work like vectors. A true array is immutable, so if you want a bigger array than what you've got, you have to create a new, larger array and copy the old elements over to it, making the resize cost O(n). Vectors hide this from you to make it easier to work with them.

Collapse
clearlythuydoan profile image
Thuy Doan Author

Interesting! I don't know too much about vectors. If you have any intro resources, let me know. Thanks :)

Collapse
anthonydmays profile image
Anthony D. Mays

Sure! This article seems helpful: stackoverflow.com/questions/150790...

Note that adding to a vector is amortized constant time, meaning that it essentials averages out to constant time in normal situations. You'll also notice that retrieval from a Map or Dictionary is also linear time, but is amortized constant time.

Looking forward to your next article!

Collapse
perezcontrerasluis profile image
PerezContrerasLuis

Wow, excellent introduction, I'm just picking up the data structure and I have identified with your article because I think we share the same reasons ^_^

Collapse
clearlythuydoan profile image
Thuy Doan Author

Awesome! Just the tip of the ice berg for me. Many more data structures and Big O notation types to go haha. We can do it :)