How do you design a photo-sharing service like Instagram?
Performance, availability, consistency, scalability, and reliability are important quality attributes in system design. We need to analyze which of these matter most for our system.
As system designers, we might want a design that is highly available, highly performant, strongly consistent, and highly secure all at once. But it is not possible to achieve all of these targets in one system. We need requirements that act as constraints on the design. So, let's define our non-functional requirements (NFRs):
Our system should be highly available; for any consumer-facing web service, this is a mandatory requirement. Home page generation latency should be at most 200 ms. If the home page takes too long to load, users will be dissatisfied, which is not acceptable.
Since we choose high availability, we should keep in mind that it may come at the cost of consistency across the system. The system should also be highly reliable, meaning that no photo or video uploaded by a user should ever be lost.
In this system, photo searches and views will far outnumber uploads. Since the workload is read-heavy, we will focus on building a system that can retrieve photos quickly, keeping viewing latency as low as possible.
If you are not sure where to start in a system design, start with the data storage layer. It will help keep your focus aligned with the requirements of the system.
At a high level, we need to support two scenarios: uploading photos, and viewing/searching photos. Our system will need object storage servers to store the photos themselves and database servers to store the metadata.
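As an illustration of this split, here is a toy, in-memory sketch of the two flows. The class and field names (`PhotoService`, `object_store`, `metadata_db`) are hypothetical stand-ins for a real object store and metadata database, not part of any actual design.

```python
import uuid

class PhotoService:
    """Toy model: an object store holds photo bytes, a metadata DB holds records."""

    def __init__(self):
        self.object_store = {}   # photo_id -> raw bytes (stand-in for S3-like storage)
        self.metadata_db = {}    # photo_id -> metadata row (stand-in for a DB table)

    def upload_photo(self, user_id, data, caption=""):
        photo_id = str(uuid.uuid4())
        self.object_store[photo_id] = data          # store the blob
        self.metadata_db[photo_id] = {              # store the searchable metadata
            "photo_id": photo_id,
            "user_id": user_id,
            "caption": caption,
        }
        return photo_id

    def view_photo(self, photo_id):
        meta = self.metadata_db.get(photo_id)
        if meta is None:
            return None
        return meta, self.object_store[photo_id]
```

In a real deployment the two stores scale independently: blobs go to cheap, replicated object storage, while the metadata lives in a database optimized for queries.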
Defining the database schema is the first step in understanding the data flow between the components of the system. We need to store user profile data, such as the follower list, and the photos uploaded by each user. There should be a table that stores all metadata about photos.
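A minimal sketch of such a schema, using SQLite for brevity. The table and column names are illustrative assumptions; a production system would likely shard these tables and add many more columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    user_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL
);
CREATE TABLE followers (            -- who follows whom
    follower_id INTEGER REFERENCES users(user_id),
    followee_id INTEGER REFERENCES users(user_id),
    PRIMARY KEY (follower_id, followee_id)
);
CREATE TABLE photos (               -- metadata only; blobs live in object storage
    photo_id     INTEGER PRIMARY KEY,
    user_id      INTEGER REFERENCES users(user_id),
    storage_path TEXT NOT NULL,     -- key of the object in the object store
    created_at   TEXT DEFAULT CURRENT_TIMESTAMP
);
""")
```

Note that the `photos` table holds only a pointer (`storage_path`) into object storage, matching the separation of blobs and metadata described above.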
Since we run multiple copies of each server, we need to distribute user traffic across them efficiently. A load balancer will distribute user requests uniformly across the servers. For the Newsfeed service, we can use IP-based (sticky) routing so that requests from the same user always go to the same server, allowing server-side caching to return faster responses.
If there are no such requirements, round-robin is a simple and effective server-selection strategy for the load balancer.
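The two strategies above can be sketched as follows. This is a toy model, not a real load balancer: `round_robin` cycles through servers uniformly, while `ip_hash` maps the same client IP to the same server every time.

```python
import hashlib
from itertools import cycle

class LoadBalancer:
    """Toy illustration of two server-selection strategies."""

    def __init__(self, servers):
        self.servers = servers
        self._rr = cycle(servers)   # endless round-robin iterator

    def round_robin(self):
        # Each call returns the next server in order, wrapping around.
        return next(self._rr)

    def ip_hash(self, client_ip):
        # Hash the client IP so the same client always lands on the same
        # server (sticky routing), which keeps its cached state useful.
        h = int(hashlib.md5(client_ip.encode()).hexdigest(), 16)
        return self.servers[h % len(self.servers)]
```

Sticky (IP-hash) routing trades perfectly uniform load for cache locality; round-robin does the opposite.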
Our system has many services: one generates the newsfeed, others handle storing and viewing photos, and so on. We need a single entry point for all clients. This entry point is the API Gateway.
It handles each request either by fanning it out to multiple services or, for simpler requests, by routing it straight to the appropriate server. The API gateway may also enforce security, such as verifying that the client has permission to perform the request.
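A minimal sketch of both responsibilities, assuming a hypothetical `ApiGateway` class that checks authentication and then routes by path prefix. Real gateways do far more (rate limiting, TLS termination, request aggregation), which is omitted here.

```python
class ApiGateway:
    """Toy gateway: authenticates the caller, then routes by path prefix."""

    def __init__(self):
        self.routes = {}   # path prefix -> handler function

    def register(self, prefix, handler):
        self.routes[prefix] = handler

    def handle(self, user, path):
        # Security check before any routing happens.
        if not user.get("authenticated"):
            return 401, "unauthorized"
        # Route to the first service whose prefix matches the path.
        for prefix, handler in self.routes.items():
            if path.startswith(prefix):
                return 200, handler(path)
        return 404, "no such route"
```

Usage: register each backend service under a prefix (e.g. `/feed`, `/photos`), and all client traffic enters through `handle`.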
A user's newsfeed can be a very large response, so we may design the API to return the feed one page at a time. Let's say we send at most 50 posts per feed request.
The user can then send another request for the next page of the feed. In the meantime, if there are not enough posts ready, the Newsfeed service can generate the next page. This technique is called pagination.
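Offset-based pagination as described above can be sketched in a few lines; `get_feed_page` is a hypothetical helper, and the 50-post page size comes from the text.

```python
def get_feed_page(posts, page, page_size=50):
    """Return one page of a (newest-first) feed list; page numbers start at 1."""
    start = (page - 1) * page_size
    # Slicing past the end of the list simply returns a shorter (or empty) page.
    return posts[start:start + page_size]
```

Production feeds often prefer cursor-based pagination (passing the ID of the last seen post) so that new posts arriving between requests do not shift page boundaries, but the offset version shown here matches the page-number scheme described above.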