Hunter Johnson for Educative

Posted on Dec 22, 2022 • Updated on May 2, 2023 • Originally published at educative.io

3 challenging System Design problems for your interview prep

#career #systems #architecture #productivity

This article was written by Educative's Co-founder and CEO, Fahim ul Haq.

Before I co-founded Educative, I was a distributed systems engineer at Facebook and Microsoft for a little over 10 years. While working with web-scale distributed systems, I was party to hundreds of Software Engineering interview loops covering both Coding and System Design Interviews (SDIs).

The System Design Interview can be daunting. Understandably, engineers with no real-world experience designing large-scale systems find SDIs intimidating. Beyond that, SDIs differ greatly from most technical interviews. With no fixed right and wrong answers, the open-ended nature of System Design questions leaves many interviewees lost. The three questions we'll cover are some of the most open-ended and complex I've encountered.

From my experience conducting System Design Interviews, there are always questions that tend to trip up even the most prepared candidates. Today, I want to help you make sure you're not caught off-guard during your interview. In that spirit, I'll talk about three of the most challenging System Design questions I've asked and how I would approach them.

We'll cover:

Designing Uber
Twitter's heavy-hitter problem
Capacity estimation problems
Continue practicing with more SDI questions today!

1. Designing Uber

Designing Uber or other rideshare services is a complicated process. Between our two main subjects, the static requester (the rider) and the dynamic receiver (the driver), a lot of tasks must be completed for every single interaction on the Uber app. The customer and driver connection must include the following:

Drivers can frequently notify the service of their current location and availability
Passengers can see all nearby drivers in real time
Customers can request rides by specifying a destination and pickup time
Nearby drivers are notified when a customer requests a pickup
Once a ride is accepted, the driver and customer can see each other's current location for the entire duration of the ride
Once the ride is complete, the driver becomes available for another customer

Obviously (as seen above), there are multiple parts of this system that need to be put in place for the entire operation to run smoothly. To simplify your approach, I suggest taking a step back and focusing on the core problem.

I view the core problem as the part of the system that is most difficult to execute. The most advanced portion of Uber's system is not how to store and fetch the data being collected, as that kind of task is done by many large-scale systems.

The core problem that is unique to the design of Uber is matching a (relatively) static rider with a (potentially) moving driver. The platform must also keep both the driver and the rider informed of the car's current location while the ride is in progress.

I'd suggest tackling this problem before any other. If the core problem isn't solved, the rest of the system's components serve no purpose.

A design problem similar to Uber's is that of designing Yelp. I consider this an excellent prerequisite because it presents many of the same obstacles as Uber. Both systems store massive amounts of location data and aim to return quick match results based on the location of two entities.

Both designs benefit from the use of a QuadTree data structure to connect customers to their destinations. However, for Uber, the QuadTree must be adapted for frequent location updates. For the most part, Yelp's system connects two static points (the customer and the restaurant), while Uber must connect a moving point to a standing one in the same amount of search time. Once you've gotten comfortable with Yelp's design and the QuadTree, you can tackle Uber's design, as it is more complicated.

Side note: Identifying and solving a "core problem" first and then branching off to its related system components is a technique that works for any SDI question. That said, I particularly recommend this approach for the type of design question with a unique and challenging core problem. Think of it as eating your least favorite thing on your plate so you can enjoy the rest of your meal pain-free!

Assuming that we're working with the data structure from our Yelp design problem, the QuadTree divides the map into segments. If the number of drivers in a segment exceeds a certain limit, we split that segment into four child nodes and distribute the drivers among them. Each leaf node in our QuadTree represents a segment that is not divided any further. We can use the same QuadTree segments for connecting drivers.

The most significant difference between the Yelp and Uber designs is that Yelp's QuadTree wasn't designed with frequent updates in mind, and modifying the QuadTree every time a driver's position changes is expensive. To record a driver's new location, we must first find the right grid based on the driver's previous position. If the new location doesn't fall within the current grid, we remove the driver from the current grid and move it to the correct one. To fulfill our system requirements and solve the problem at hand, we must adapt our data structures to handle driver location updates every four seconds.
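To make the split-and-move behavior concrete, here is a minimal Python sketch of such a QuadTree. It is an illustration under assumptions, not Uber's or Yelp's actual implementation: the class name QuadTreeNode, the split threshold of 4 drivers, the minimum segment size, and the flat x/y coordinates are all hypothetical choices made for the example.

```python
from dataclasses import dataclass, field

MAX_DRIVERS_PER_NODE = 4   # assumed split threshold, for illustration only
MIN_SEGMENT_SIZE = 1.0     # assumed guard: segments this small are never split


@dataclass
class QuadTreeNode:
    # Bounding box of this map segment: [x0, x1) x [y0, y1)
    x0: float
    y0: float
    x1: float
    y1: float
    drivers: dict = field(default_factory=dict)    # driver_id -> (x, y), leaves only
    children: list = field(default_factory=list)   # empty for a leaf, else 4 nodes

    def contains(self, x, y):
        return self.x0 <= x < self.x1 and self.y0 <= y < self.y1

    def insert(self, driver_id, x, y):
        if self.children:                           # internal node: route downward
            self._child_for(x, y).insert(driver_id, x, y)
            return
        self.drivers[driver_id] = (x, y)
        too_full = len(self.drivers) > MAX_DRIVERS_PER_NODE
        if too_full and (self.x1 - self.x0) > MIN_SEGMENT_SIZE:
            self._split()

    def _split(self):
        # Divide this segment into four children and redistribute its drivers.
        mx, my = (self.x0 + self.x1) / 2, (self.y0 + self.y1) / 2
        self.children = [
            QuadTreeNode(self.x0, self.y0, mx, my),
            QuadTreeNode(mx, self.y0, self.x1, my),
            QuadTreeNode(self.x0, my, mx, self.y1),
            QuadTreeNode(mx, my, self.x1, self.y1),
        ]
        pending, self.drivers = self.drivers, {}
        for driver_id, (x, y) in pending.items():
            self._child_for(x, y).insert(driver_id, x, y)

    def _child_for(self, x, y):
        for child in self.children:
            if child.contains(x, y):
                return child
        raise ValueError("point lies outside this segment")

    def leaf_for(self, x, y):
        node = self
        while node.children:
            node = node._child_for(x, y)
        return node

    def move(self, driver_id, old_xy, new_xy):
        # Find the driver's grid from the previous position, as described above.
        old_leaf = self.leaf_for(*old_xy)
        if old_leaf.contains(*new_xy):
            old_leaf.drivers[driver_id] = new_xy    # same grid: update in place
        else:
            del old_leaf.drivers[driver_id]         # different grid: remove...
            self.insert(driver_id, *new_xy)         # ...and re-insert from the root
```

Every call to move walks the tree at least once and sometimes twice, which is why running it for every driver on every location report becomes costly.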

To pull off driver updates every four seconds, I'd suggest storing the latest position of each driver in a hash table that is updated every four seconds, and updating the QuadTree less often, say every 10–15 seconds. The hash table always reflects the drivers' latest locations, while the more expensive QuadTree modification happens only about every 15 seconds. By doing this, you achieve your main aim while using minimal resources and time.
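The two-tier idea can be sketched in a few lines of Python. This is a minimal sketch under assumptions: DriverLocationStore and GridIndex are hypothetical names, GridIndex is a simple fixed-cell stand-in for the QuadTree so the example stays self-contained, and the injectable clock exists only to make the timing easy to exercise.

```python
import time

REPORT_INTERVAL_S = 4    # how often drivers report their position (from the text)
SYNC_INTERVAL_S = 15     # how often the spatial index is refreshed (from the text)


class GridIndex:
    """Stand-in for the QuadTree: maps fixed-size cells to driver ids."""

    def __init__(self, cell_size=10.0):
        self.cell_size = cell_size
        self.cells = {}      # (col, row) -> set of driver ids
        self.cell_of = {}    # driver_id -> (col, row)

    def _cell(self, x, y):
        return (int(x // self.cell_size), int(y // self.cell_size))

    def upsert(self, driver_id, x, y):
        new_cell = self._cell(x, y)
        old_cell = self.cell_of.get(driver_id)
        if old_cell == new_cell:
            return                                   # still in the same grid
        if old_cell is not None:
            self.cells[old_cell].discard(driver_id)  # leave the old grid
        self.cells.setdefault(new_cell, set()).add(driver_id)
        self.cell_of[driver_id] = new_cell


class DriverLocationStore:
    def __init__(self, index, clock=time.monotonic):
        self.index = index
        self.clock = clock
        self.latest = {}      # hash table: driver_id -> latest (x, y)
        self.dirty = set()    # drivers that moved since the last index sync
        self.last_sync = clock()

    def report(self, driver_id, x, y):
        # Cheap path, expected once per driver every REPORT_INTERVAL_S seconds.
        self.latest[driver_id] = (x, y)
        self.dirty.add(driver_id)
        self.sync_if_due()

    def sync_if_due(self):
        # Expensive path, run at most once every SYNC_INTERVAL_S seconds.
        now = self.clock()
        if now - self.last_sync < SYNC_INTERVAL_S:
            return False
        for driver_id in self.dirty:
            self.index.upsert(driver_id, *self.latest[driver_id])
        self.dirty.clear()
        self.last_sync = now
        return True
```

The trade-off is that searches against the index can be up to roughly 15 seconds stale, while the hash table can always answer "where is this driver right now?" for a ride that is already in progress.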

This step is merely the beginning. Uber's design is a complicated and vast topic. To continue learning about how to solve this design problem, check out this page from our Grokking Modern System Design Interview for Engineers and Managers that breaks down all parts of Uber's design.

2. Twitter's heavy-hitter problem

This problem is one that affects many notable companies that you probably interact with on a daily basis. More often than not, it is also a problem that I have seen spark confusion among qualified candidates.

So, what is the heavy-hitter problem? For example, imagine we are designing Twitter. A problem Twitter's system faces every day is when a single account with millions of followers, such as Justin Bieber, sends a tweet. This single tweet from a single account may generate millions of comments, views, and likes per second. As the seconds tick on, the traffic the tweet generates will continue to increase. The system as a whole has to be built to deal with this sudden surge of user requests and spikes in traffic from a single post. This is known as the heavy-hitter problem, and it affects companies with a significant celebrity presence, such as Twitter, Facebook, and YouTube.

So, how do you approach problems such as the heavy hitter? Simply counting the number of views by keeping a counter on a single machine doesn't work well at this scale. So, we use sharded counters. A sharded counter, also known as a distributed counter, splits one logical counter into multiple shards, each of which holds part of the count. These shards run in parallel on different computational units. We can improve performance and limit contention by balancing the millions of write requests across the shards, and we obtain the total by summing the shards. The sharded counter is a critical design element when scalable counting is necessary. Being built for large-scale counting makes this solution a good match for our heavy-hitter problem.
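Here is a minimal, single-process Python sketch of the idea. It is only an illustration under assumptions: the class name ShardedCounter and the default of 8 shards are hypothetical, and the shards here are list slots guarded by locks, whereas in a real system each shard would live on a different machine or storage partition.

```python
import random
import threading


class ShardedCounter:
    """One logical counter split across several independently locked shards."""

    def __init__(self, num_shards=8):
        self._counts = [0] * num_shards
        self._locks = [threading.Lock() for _ in range(num_shards)]

    def increment(self, amount=1):
        # Pick a shard at random so concurrent writers rarely contend
        # for the same lock.
        i = random.randrange(len(self._counts))
        with self._locks[i]:
            self._counts[i] += amount

    def value(self):
        # A read sums every shard. It is slower than a write and may be
        # slightly out of date while writes are still in flight.
        total = 0
        for i, lock in enumerate(self._locks):
            with lock:
                total += self._counts[i]
        return total


def record_likes(counter, n):
    for _ in range(n):
        counter.increment()


likes = ShardedCounter(num_shards=8)
workers = [threading.Thread(target=record_likes, args=(likes, 1000)) for _ in range(4)]
for w in workers:
    w.start()
for w in workers:
    w.join()
print(likes.value())  # 4000 once all four writers have finished
```

The design choice is to make writes cheap and reads comparatively expensive, which suits like and view counts: they are written far more often than an exact total is needed.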

To get more practice with sharded counters, check out our Complete guide to System Design in 2022!

3. Capacity estimation problems

Capacity planning is not the most frequently asked type of SDI question, but it is a vital process for large-scale FAANG systems. First off, what is a pure capacity problem? The core problem you will face is the estimation of resources required to cater to user requests for a specific design problem. For example, consider the amount of storage space YouTube requires to store videos uploaded each day and the servers and bandwidth required to help stream videos to users across the world.

So, how would I like to see candidates solve this problem? Since the capacity of each company and system is different, there really is no one go-to solution for this kind of question. A tip that helped me immensely is to research the bandwidth and capacity required for the company you're interviewing with.

For example, in YouTube's storage estimate, we have to gauge the total number of videos and the length of each video uploaded to YouTube per minute. Let's assume that 500 hours' worth of content is uploaded to YouTube in one minute. If we also assume that a 5-minute video takes up 30 MB, we require 30/5 = 6 MB to store 1 minute of video.

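Under these assumptions, the storage required for each minute of uploads works out to:

Total storage per minute = 500 hours × 60 minutes/hour × 6 MB/minute = 180,000 MB = 180 GB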

Of course, not every upload is the same. YouTube requires more storage than we calculated and demands more flexibility. More often than not, capacity planning estimates from the design stage don't hold up in reality. That's why being prepared to be flexible and plan continuously is essential for capacity planning.

Calculations like the example above help us ignore the nitty-gritty details of the system (at least at the design level) and focus on more critical aspects. These calculations are referred to as "back-of-the-envelope calculations" and do not represent the capacity reality on their own. In addition, we have to make some underlying assumptions, which will be the basis of our capacity calculations.

For instance, we can estimate the storage capacity of each server and then the number of storage servers required to store all the videos. We may also assume the replication number of the chunks/segments of a video, the latency time for 1 KB of data to travel from one data center to another, and so on.
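As a rough sketch, the estimate can be carried through to a server count in a few lines of Python. The upload rate and the 6 MB per minute of video come from the example above; the replication factor of 3 and the 10 TB of usable storage per server are assumptions chosen only for illustration, and the script uses decimal units (1 GB = 1,000 MB).

```python
import math

UPLOAD_HOURS_PER_MINUTE = 500   # from the text: hours of video uploaded per minute
MB_PER_VIDEO_MINUTE = 30 / 5    # from the text: a 30 MB video is 5 minutes long
REPLICATION_FACTOR = 3          # assumption, for illustration only
SERVER_CAPACITY_TB = 10         # assumption, for illustration only


def storage_per_minute_gb():
    # Minutes of video uploaded per wall-clock minute, times MB per video minute.
    megabytes = UPLOAD_HOURS_PER_MINUTE * 60 * MB_PER_VIDEO_MINUTE
    return megabytes / 1000


def storage_per_day_tb():
    # 60 * 24 wall-clock minutes in a day.
    return storage_per_minute_gb() * 60 * 24 / 1000


def servers_needed_per_day():
    # Every byte is stored REPLICATION_FACTOR times; round up to whole servers.
    replicated_tb = storage_per_day_tb() * REPLICATION_FACTOR
    return math.ceil(replicated_tb / SERVER_CAPACITY_TB)


print(storage_per_minute_gb())    # 180.0 GB per minute
print(storage_per_day_tb())       # about 259.2 TB per day
print(servers_needed_per_day())   # 78 new servers' worth of storage per day
```

Changing any of the constants at the top immediately shows how sensitive the result is to each assumption, which is the main value of a back-of-the-envelope calculation.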

To keep learning about this flexible process for solving capacity problems, including resource estimation and back-of-the-envelope calculations, check out the rest of the YouTube lesson from our Grokking Modern System Design Interview for Engineers and Managers course.

Capacity estimation problems should be discussed more in System Design Interview preparation. In my opinion, because the design doesn't always transfer to reality, capacity estimation is one of the more critical and challenging aspects of modern large-scale System Design.

Continue practicing with more SDI questions today!

System Design is a vital part of today's world, and I always love seeing more people enter the field each year. The System Design Interview can be tricky no matter who you are. That's why I want to share as much expertise as I can for the questions that I've seen result in confusion among candidates.

Good luck on your journey! Now that you are aware of these problematic questions, make sure you practice them before the real deal! Our System Design Interview courses are excellent places to find solutions to these problems and get hands-on practice. You now know where you might be vulnerable in your interview. Make sure you're preparing where others might be underprepared.

Continue your System Design Interview preparation with Grokking Modern System Design Interview for Engineers & Managers for more practice with the problems we discussed in this article and many more.

As always, good luck and happy learning!

Continue learning about how to conquer the System Design interview on Educative

Start a discussion

What are some more challenging SDI questions to practice before your interview? Was this article helpful? Let us know in the comments below!

Top comments (5)

Jon Randy 🎖️ • Dec 23 '22

Don't learn or practice things just for interviews - that really is terrible advice. You should be confident in your ability to do things that the job requires, and go into interviews relaxed in that knowledge. Last minute cramming for an interview will only serve to make you more nervous and prone to make mistakes.

Hunter Johnson • Dec 23 '22

This article has nothing to do with "last-minute cramming" (as that phrase is never remotely mentioned) and everything to do with providing three examples of challenging and popular SDI problems that may arise in some way or another in an interview setting. Practicing design problems related to major FAANG systems is common for those training their System Design knowledge. Many people feel the need to learn these design problems not because their interview is the very next day but because nobody gets good at anything without some degree of practice and training. It is very reasonable for someone nervous about the talking-dominated setup of the SD interview to want to practice talking through a design solution so that they feel more confident when the day of the interview arrives. Thanks for the comments and for providing your insight, but claiming that we're promoting "last-minute cramming" is not appreciated, as that is nowhere close to the content of this piece.

Jon Randy 🎖️ • Dec 23 '22 • Edited

Sorry if I maybe didn't express myself very well. My point is that if you're going for a job where you feel your abilities are not up to the position at the point when you get the interview - then revising/practicing for the interview itself really isn't going to serve you that well.

The best advice I ever received regarding revision for any kind of subject was:

Revision is for wimps

This was given to me by my A-Level Physics teacher... essentially saying that there was no substitute for actually knowing the stuff and being comfortable with it well in advance. To be honest, I wish I'd been given this advice earlier. Every test/exam I've ever had since then I've gone into feeling relaxed and ready - not nervous about what I did/didn't know or what might appear in the questions.

You stated that:

...This article has nothing to do with "last-minute cramming"

but the title says "3 challenging System Design problems to learn for your interview". If you are learning something 'for your interview' - that would seem to suggest something fairly last minute?

Hunter Johnson • Dec 23 '22 • Edited

Thank you for clearing that up and sharing your valuable experiences! I can see how the title could be misleading, but the point was not to suggest that a candidate cram these problems in "before" their interview; merely that they are beneficial practice problems to include in a preparation regimen "for" their future interviews. The "for" was only meant to indicate that these practice problems could be helpful to those who are prepping for an upcoming SD interview. I'm sorry again for the confusion (I'll probably be adjusting the title per your very helpful suggestion). For those readers who are interested in System Design interview prep problems (experts or beginners), these 3 problems are great examples of challenging SDI questions. Obviously, people prepare for interviews in different ways; but sometimes, even the most advanced candidates can benefit from a little practice to maintain their level of knowledge. If you're a future SDI candidate that does benefit from practice and repetition, we have 3 great options for you to start with right here!

Hunter Johnson • Dec 23 '22

Also, if you're an SD beginner and/or not currently prepping for an SD interview, but hope to be in the future, then these 3 example interview problems are a great place to start honing your skills and knowledge of the concepts you'll one day have to explain in an interview.
