Over my 10 years as a senior software engineer and interviewer at Microsoft and Facebook, I've worked with hundreds of applicants as they solve different system design problems.
Developers tend to struggle with SDI questions because they are so open-ended and often require a kind of critical thinking not practiced in other coding interview challenges.
While SDI questions change over time, some have remained popular in interviews across various top companies.
Today, we'll explore the top 10 most commonly asked system design interview questions, common problems you'll have to address in each, and some tools to help you do that.
Here’s what we’ll cover today:
- Tips for any question
- 1. Design a chat service
- 2. Design a ride-sharing service
- 3. Design a URL shortening service
- 4. Design a social media service
- 5. Design a social message board
- 6. Design a file storage service
- 7. Design a video streaming service
- 8. Design an API Rate Limiter
- 9. Design a proximity server
- 10. Design a Type-Ahead service
- What to learn next
Start each problem by stating what you know: List all required features of the system, common problems you expect to encounter with this sort of system, and the traffic you expect the system to handle. The listing process lets the interviewer see your planning skills and correct any possible misunderstandings before you begin the solution.
Narrate any trade-offs: Every system design choice matters. At each decision point, list at least one positive and negative effect of that choice.
Ask your interviewer to clarify: Most system design questions are purposefully vague. Ask clarifying questions to show the interviewer how you're viewing the question and your knowledge of the system's needs.
Discuss emerging technologies: Conclude each question with an overview of how and where the system could benefit from machine learning. This will demonstrate that you're not just prepared for current solutions but future solutions as well.
For more information on how ML can improve your SDI performance, check out How Machine Learning gives you an edge in System Design.
For this question, you'll design a service that allows users to chat with each other over the internet. Conversations can be one-on-one or can be group chats with many members. Messages should only be accessible by those included in the conversation.
- Messages must be sent and received via the internet.
- Service must support one-on-one and group chats.
- Messages should be stored for later viewing.
- Users should be able to send pictures and videos as well as text messages.
- Messages should be encrypted during transit.
- Messages should be visible with minimal latency.
- What happens if a message is sent without an internet connection? Is it sent when the connection is restored?
- How will you encrypt and decrypt the message without increasing latency?
- How do users receive notifications?
- Are messages pulled from the device (the server periodically asks each device whether it has a message waiting to be sent) or pushed to the server (the device tells the server that it has a message to send)?
- Split the database schema into multiple tables: a user table (with the user ID and contacts), a chat table (with chat IDs and a list of participating user IDs), and a message table (with past messages and a reference to the chat ID).
- Use WebSocket for bi-directional connections between device and server.
- Use push notifications to notify members even if they're offline.
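The three-table split suggested above can be sketched in a few lines. This is only an illustration under stated assumptions: the class names, field names, and the `messages_visible_to` helper are hypothetical, and in-memory lists stand in for real database tables.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class User:
    user_id: int
    contacts: List[int] = field(default_factory=list)  # IDs of other users


@dataclass
class Chat:
    chat_id: int
    participant_ids: List[int]  # two entries for one-on-one, more for groups


@dataclass
class Message:
    message_id: int
    chat_id: int  # reference back to the chat table
    sender_id: int
    body: str


def messages_visible_to(user_id: int, chats: List[Chat], messages: List[Message]) -> List[Message]:
    """Return only the messages from chats that include this user."""
    allowed = {c.chat_id for c in chats if user_id in c.participant_ids}
    return [m for m in messages if m.chat_id in allowed]
```

Keeping the participant list on the chat row makes the access rule ("only those included in the conversation") a single membership check, at the cost of a larger row for very big groups.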
This question asks you to create a ride-sharing service that matches users with nearby drivers. Users enter a destination and transmit their current location, and nearby drivers are notified within seconds.
The app then tracks a route between the driver's and the user's current locations, then from the user's location to the destination.
- The system must track the current location of all users and drivers.
- Users and drivers must receive updated trip information while in transit.
- Must support thousands of users at various points in the process and scale accordingly.
- Both driver and user must be constantly connected to the server.
- How can you keep latency low during busy periods?
- How is the driver paired with the user? Iterating all drivers to find Euclidean distance would be inefficient.
- What happens if the driver or user loses connection?
- How do you store all cached location data?
Check out our guide to Uber Data Science interviews for more information on the Uber interview process.
- Use the S2Geometry library to split locations into cells. Only calculate driver distance with drivers in the same cell as the user.
- Use distributed storage to store the locations of all users. Location data will only be roughly 1 KB per user.
- If location data halts, the server keeps using the device's last reported location while waiting for reconnection.
- Allow a buffer after prompting the closest driver to take a trip. If they refuse, move to the next driver.
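The cell-based matching idea can be sketched without the S2Geometry library itself. The example below swaps in a plain fixed-size latitude/longitude grid as a stand-in for S2 cells; `CELL_DEGREES`, `DriverIndex`, and its methods are hypothetical names chosen for illustration.

```python
import math
from collections import defaultdict

CELL_DEGREES = 0.01  # assumed cell size; a stand-in for an S2 cell level


def cell_of(lat, lng):
    """Map a coordinate to the grid cell that contains it."""
    return (math.floor(lat / CELL_DEGREES), math.floor(lng / CELL_DEGREES))


def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance between two points in kilometers."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


class DriverIndex:
    def __init__(self):
        self._cells = defaultdict(dict)  # cell -> {driver_id: (lat, lng)}
        self._driver_cell = {}           # driver_id -> current cell

    def update(self, driver_id, lat, lng):
        """Record a driver's latest location, moving them between cells if needed."""
        old, new = self._driver_cell.get(driver_id), cell_of(lat, lng)
        if old is not None and old != new:
            del self._cells[old][driver_id]
        self._cells[new][driver_id] = (lat, lng)
        self._driver_cell[driver_id] = new

    def nearest_in_cell(self, lat, lng):
        """Rank only the drivers in the rider's own cell, closest first."""
        candidates = self._cells.get(cell_of(lat, lng), {})
        ranked = sorted(candidates.items(), key=lambda kv: haversine_km(lat, lng, *kv[1]))
        return [driver_id for driver_id, _ in ranked]
```

The ranked list also fits the last hint: offer the trip to the first driver, wait for the buffer period, then move to the next entry if they refuse. A rider standing near a cell edge may have a closer driver in a neighboring cell, so a fuller version would probably search adjacent cells as well.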
This question asks you to create a program that shortens long URLs like TinyURL or bit.ly. These programs take a long URL and generate a new and unique short URL. They can also intake a shortened URL and return the original full-length URL.
- Returns a URL that is shorter than the original
- Must store the original URL
- Newly generated URL must be able to link to the stored original
- Shortened URL should allow redirects
- Must support custom short URLs
- Must support many requests at once
- What if two users input the same custom URL?
- What if there are more users than expected?
- How does the database regulate storage space?
- Use hashing to link original and new URLs
- Expose a REST API for front-end client communication and place it behind a load balancer to handle high traffic
- Use multithreading to handle multiple requests at once
- Use NoSQL database to store original URLs (no relation between stored URLs)
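As a rough sketch of the hashing hint, the example below derives a short code from a SHA-256 hash of the URL and handles the "same custom URL" question by rejecting an alias that already points elsewhere. The `Shortener` class, the 7-character code length, and the salt-on-collision strategy are assumptions for illustration, and a plain dictionary stands in for the NoSQL key-value store.

```python
import hashlib
import string

ALPHABET = string.digits + string.ascii_letters  # 62 URL-safe characters
CODE_LENGTH = 7  # assumed length of a generated short code


def base62(n):
    """Encode a non-negative integer using the 62-character alphabet."""
    if n == 0:
        return ALPHABET[0]
    out = []
    while n:
        n, rem = divmod(n, 62)
        out.append(ALPHABET[rem])
    return "".join(reversed(out))


class Shortener:
    def __init__(self):
        self._store = {}  # short code -> original URL

    def shorten(self, url, custom=None):
        if custom is not None:
            # Two users asking for the same custom alias: first one wins.
            if custom in self._store and self._store[custom] != url:
                raise ValueError("custom alias already taken")
            self._store[custom] = url
            return custom
        salt = 0
        while True:
            digest = hashlib.sha256(f"{url}#{salt}".encode("utf-8")).digest()
            code = base62(int.from_bytes(digest[:8], "big"))[:CODE_LENGTH].rjust(CODE_LENGTH, "0")
            existing = self._store.get(code)
            if existing is None or existing == url:
                self._store[code] = url
                return code
            salt += 1  # hash collision with a different URL: retry with a new salt

    def resolve(self, code):
        """Return the original URL for a short code (raises KeyError if unknown)."""
        return self._store[code]
```

Because the code is derived from the URL, shortening the same URL twice returns the same code instead of consuming extra storage.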
Learn how to solve this problem in our step-by-step guide Design TinyURL and Instagram
For this question, you'll design a social media service like Instagram that is used by hundreds of thousands of users. Users should be able to view a newsfeed with posts from the users they follow, and the service should suggest new content each user may like.
Interviewers often want to hear you discuss the newsfeed in depth.
- Robust newsfeed and recommendation system
- Users can make public posts
- Other users can comment or like posts
- Must comfortably accommodate many users at once
- System must be highly available
- Famous users will have millions of followers. How are they handled compared with standard users?
- How does the system weight posts by age? Old posts are less likely to be viewed than new posts.
- What's the ratio of read-focused to write-focused nodes? Are there likely to be more read requests (users viewing posts) or write requests (users creating posts)?
- How can you increase availability? How does the system update? What happens if a node fails?
- How do you efficiently store posts and images?
- Use rolling updates and replica nodes to maximize availability.
- Use a trained machine learning algorithm to recommend posts.
- Create a database schema that stores celebrities and users separately.
- Use a social graph to further track following habits
Learn how to solve this problem with our step-by-step guide Design TinyURL and Instagram
For this question, you'll design a forum-like system where users can post questions and links.
Other users can view and comment on the questions. Questions have tags that represent their topic and users can follow tags to see questions on specific topics. Users have a newsfeed that highlights popular questions from their followed tags and related topics.
- Users must be able to create public posts and apply tags
- Posts must be sortable by tag
- Other users must be able to post comments in real-time.
- The database must store data on each post (views, upvotes, etc.)
- The newsfeed must display posts from followed tags AND posts from other tags that the user will like.
- Must support high traffic of viewers and new posts.
- Does our product only need to work on the web?
- Where are user uploaded images/links stored?
- How will the system determine related tags? How many posts from unfollowed tags are shown in the feed?
- How are posts distributed across a network of servers?
- Use an SQL database to map the relational data (users have posts, posts have comments/likes, categories have related posts, etc.)
- Use multithreading and a load balancer layer to help support higher traffic.
- Use sharding to break up the system. Consider sharding by category to store posts of the same tags in one machine.
- Use Machine Learning and Natural Language Processing to find correlations between tags
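The sharding-by-tag hint can be illustrated with a small routing function. This is a minimal sketch that assumes a fixed number of shards; `shard_for_tag` is a hypothetical helper, not part of any particular database.

```python
import hashlib


def shard_for_tag(tag, shard_count):
    """Pick a stable shard index for a tag so its posts land on one machine."""
    # hashlib is used instead of the built-in hash(), whose string hashes
    # are randomized per process and so would not be stable across servers.
    digest = hashlib.sha256(tag.lower().encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count
```

One trade-off worth narrating: a very popular tag will make its shard hot, and changing `shard_count` remaps most tags, so a real design might layer consistent hashing or a lookup table on top.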
For this question, you'll create a synchronized, cross-platform storage system like Dropbox. Users can store files and photos and access them from other devices.
- Users should be able to save/delete/update/share files over the web
- Old versions of documents should be saved to allow rollback
- File updates should sync across multiple devices
- Where are the files stored?
- How do you handle updates? Do you re-upload the entire file again?
- Do small updates require a full file update?
- How does the system handle two users updating a document at the same time?
- Use chunking to split files into multiple sections. Updates only re-upload the changed sections rather than the whole file.
- Use cloud storage like Amazon S3 to store the file chunks.
- Make the client constantly check with the server to ensure concurrent updates are applied.
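To make the chunking hint concrete, here is a small sketch that hashes fixed-size chunks and reports which ones differ between two versions of a file. The tiny `CHUNK_SIZE` and the function names are assumptions for the demo; a real service would likely use chunks measured in megabytes.

```python
import hashlib

CHUNK_SIZE = 4  # bytes; deliberately tiny so the example is easy to follow


def chunk_hashes(data, chunk_size=CHUNK_SIZE):
    """Hash each fixed-size chunk of the file contents."""
    return [
        hashlib.sha256(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    ]


def changed_chunks(old, new, chunk_size=CHUNK_SIZE):
    """Return the indexes of chunks in `new` that must be re-uploaded."""
    old_h = chunk_hashes(old, chunk_size)
    new_h = chunk_hashes(new, chunk_size)
    return [i for i, h in enumerate(new_h) if i >= len(old_h) or old_h[i] != h]
```

For example, `changed_chunks(b"aaaabbbbcccc", b"aaaaBBBBcccc")` reports only chunk 1. A known weakness of fixed-size chunks is that inserting bytes near the start shifts every later chunk, which is a trade-off worth mentioning in the interview.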
This question asks you to create an online video streaming service like YouTube. The service will store and transmit hundreds of petabytes of video data. It must also store statistics (views, likes, etc.) and allow users to post comments.
Your solution must be scalable to support thousands of concurrent users.
- Videos should be uploadable over the web
- Users should receive an uninterrupted stream over the internet
- Video statistics should be stored and accessible for every video.
- Comments must be saved and displayed with the video to other users
- Should support high traffic of several thousand users
- How will your service ensure smooth video streaming on various internet qualities?
- How does your service respond to a sudden drop in streaming speed (buffering, reduced quality, etc.)?
- How are the videos stored?
- Use cloud technology to store and transmit video data.
- Use Machine Learning to suggest new video content.
- Prevent stuttering for inconsistent connections with a delay. The user views data from a few moments ago rather than as it comes in.
For this question, you'll create an API Rate Limiter that limits the number of API calls a service can receive in a given time period to avoid an overload.
The interviewer can ask for this at various scales, from a single machine to an entire distributed network.
- Devices are limited to 10 requests per hour
- The Limiter must notify the user if their request is blocked.
- Must handle traffic suitable to its scale.
- How does your system measure requests per hour? If a user makes 10 requests at 1:20 then another 10 at 2:10, they've made 20 in the same 1-hour window despite the hour change.
- How would designing for a distributed system differ from a local system?
- Use sliding time windows to avoid hourly resets.
- Save a counter integer instead of the request itself to save space.
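Both hints can be combined in a sliding-window counter: keep one integer per small time bucket and sum the buckets that fall inside the last hour. The sketch below is one possible single-machine version; the class name, the one-minute bucket size, and passing `now` in as seconds are assumptions made to keep it testable.

```python
from collections import defaultdict


class SlidingWindowLimiter:
    def __init__(self, limit=10, window_seconds=3600, bucket_seconds=60):
        self.limit = limit
        self.window = window_seconds
        self.bucket = bucket_seconds
        self._counts = defaultdict(dict)  # device_id -> {bucket_start: count}

    def allow(self, device_id, now):
        """Return True and count the request, or False if the device is over its limit."""
        buckets = self._counts[device_id]
        cutoff = now - self.window
        # Forget buckets that ended before the window started.
        for start in [s for s in buckets if s + self.bucket <= cutoff]:
            del buckets[start]
        if sum(buckets.values()) >= self.limit:
            return False  # caller can notify the user, e.g. with HTTP 429
        start = int(now // self.bucket) * self.bucket
        buckets[start] = buckets.get(start, 0) + 1
        return True
```

With the example from the question, ten requests at 1:20 are allowed, and ten more at 2:10 are all rejected because the window slides rather than resetting on the hour. Storing at most 60 small counters per device is slightly less precise than logging every timestamp, but it uses far less memory.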
For this question, you'll design a proximity server that stores and reports the distance to places like restaurants. Users can search nearby places by distance or popularity. The database must store data for 500 million places across the globe but have low latency.
- Store up to 500 million locations.
- Locations must be uniquely identified and have corresponding data like a quality review and hours of service.
- Searches must return results with minimal latency.
- Users must be able to search results by distance or quality.
- How do you store so much location data?
- How do you achieve quick search results?
- How does your system handle different population densities? Rigid latitude/longitude grids will cause varied responsiveness based on density.
- How can we optimize commonly searched locations?
- Use a relational database to store the list of locations and related data.
- Use caching to store data for the most popular locations.
- Use sharding to split data by region.
- Search locations within a dynamic grid. If there are more than 500 locations in a single cell, split that cell into 4 smaller cells. Repeat until you only have to search 500 locations or fewer.
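The dynamic grid described above behaves like a quadtree. The following sketch shows the split rule under stated assumptions: `GridCell`, `MAX_DEPTH`, and the method names are hypothetical, and places are stored as simple `(place_id, lat, lng)` tuples in memory.

```python
MAX_DEPTH = 16  # assumed guard so many identical coordinates cannot split forever


class GridCell:
    def __init__(self, min_lat, min_lng, max_lat, max_lng, capacity=500, depth=0):
        self.bounds = (min_lat, min_lng, max_lat, max_lng)
        self.capacity = capacity
        self.depth = depth
        self.places = []      # (place_id, lat, lng); only used while this is a leaf
        self.children = None  # four sub-cells once this cell has split

    def _child_for(self, lat, lng):
        min_lat, min_lng, max_lat, max_lng = self.bounds
        mid_lat, mid_lng = (min_lat + max_lat) / 2, (min_lng + max_lng) / 2
        return self.children[(2 if lat >= mid_lat else 0) + (1 if lng >= mid_lng else 0)]

    def _split(self):
        min_lat, min_lng, max_lat, max_lng = self.bounds
        mid_lat, mid_lng = (min_lat + max_lat) / 2, (min_lng + max_lng) / 2
        make = lambda a, b, c, d: GridCell(a, b, c, d, self.capacity, self.depth + 1)
        self.children = [
            make(min_lat, min_lng, mid_lat, mid_lng),
            make(min_lat, mid_lng, mid_lat, max_lng),
            make(mid_lat, min_lng, max_lat, mid_lng),
            make(mid_lat, mid_lng, max_lat, max_lng),
        ]
        places, self.places = self.places, []
        for place in places:
            self._child_for(place[1], place[2]).insert(*place)

    def insert(self, place_id, lat, lng):
        if self.children is not None:
            self._child_for(lat, lng).insert(place_id, lat, lng)
            return
        self.places.append((place_id, lat, lng))
        if len(self.places) > self.capacity and self.depth < MAX_DEPTH:
            self._split()

    def leaf_for(self, lat, lng):
        """Return the smallest cell containing the point; search only its places."""
        node = self
        while node.children is not None:
            node = node._child_for(lat, lng)
        return node
```

Dense cities end up with many small cells and rural areas with a few large ones, which addresses the population-density question: every search scans a bounded number of places regardless of region.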
This service will partially complete search queries and display 5 suggestions to complete the query. It should adapt to highly searched content in real-time and suggest that to other users.
For example, "Seahawks win the Super Bowl" would be suggested within minutes of the event occurring.
- The service should match partial queries with popular queries.
- Minor spelling mistakes should be corrected, e.g. "dgo" → "dog"
- Should guess the 5 most likely options based on the query
- Results should update as the query is being written
- How strong do you make the spelling mistake corrections?
- How do you update selections without causing latency?
- How do you determine the most likely completed query? Does it adapt to the user's searches?
- What happens if the user types very quickly? Do suggestions only appear after they're done?
- Use a natural language processing machine learning algorithm to anticipate the next characters.
- Use Markov Chains to rank the probability of top queries.
- Update ML algorithm hourly or daily rather than in real-time to reduce burden.
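Before reaching for a machine learning model, it can help to show the interviewer a simple baseline. The sketch below swaps in plain frequency ranking over a prefix index rather than the NLP or Markov-chain approaches named above; `TypeAhead`, `record`, and `suggest` are hypothetical names, and spelling correction is left out.

```python
from collections import Counter, defaultdict


class TypeAhead:
    def __init__(self, max_suggestions=5):
        self.max_suggestions = max_suggestions
        self._by_prefix = defaultdict(Counter)  # prefix -> {full query: search count}

    def record(self, query, count=1):
        """Register that a full query was searched `count` times."""
        q = query.lower().strip()
        for end in range(1, len(q) + 1):
            self._by_prefix[q[:end]][q] += count

    def suggest(self, partial):
        """Return the most searched queries that start with the partial input."""
        counts = self._by_prefix.get(partial.lower().strip())
        if not counts:
            return []
        # Highest count first; ties broken alphabetically for stable output.
        ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
        return [q for q, _ in ranked[: self.max_suggestions]]
```

Indexing every prefix trades memory for very fast lookups, since each keystroke becomes a single dictionary read. The `record` calls could be batched hourly or daily, in line with the last hint, so that ranking updates do not compete with live queries.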
These questions should help you understand what types of problems you'll be expected to solve at a system design interview. Practicing solving and explaining questions like these is the most efficient way to prepare for your next interview.
To give you hands-on practice with these solutions, Educative has created Grokking the System Design Interview. This course will give you detailed walkthroughs on these and other questions, written by current industry interviewers.
By the end, you'll know exactly what modern interviewers are looking for, what clarifying questions to ask for each question, and will have extensive practice in explaining your tradeoff decisions.