This is the third installment of my introduction to System Design, as I prepare for an interview that requires knowledge of System Design! I hope this serves you well as a starting point for your dive into System Design!
In the previous two blog installments, I covered:
- System Design Principles
- Monolith Architecture
- Microservice Architecture
- High Level System Components (Servers, Caches, Databases, Load Balancers)
Looking at a web service (real or hypothetical), you should hopefully be able to:
- choose a high level backend architecture to use
- keep the 5 system design principles in mind
- break down the web service into general components
However, one key part I haven't really covered is optimizing a design: how can you tailor your design for efficiency? For reliability? For scalability? To help you understand some of the options available for optimizing your design, I'll briefly go over some of the high-level strategies I've learned:
- Horizontal Scaling vs. Vertical Scaling
- CAP Theorem
- Single Region vs. Multi Region
- Horizontal Partitioning vs. Vertical Partitioning
- Redundancy & Replication
Horizontal and vertical scaling are probably the most widely used terms in System Design. They are relatively simple to understand.
Horizontal Scaling simply means that in order to increase your system capacity, you install more units (e.g. if you had N servers, you now have N + 1 servers). You can horizontally scale any part of your system, but this works particularly well with distributed systems, and it is one of the primary reasons they are used: they can meet large-scale demand at lower cost and higher reliability than monoliths.
Vertical Scaling means that in order to increase your system capacity, you upgrade your existing units (i.e. replacing your computer with a faster, better one). This type of scaling is usually used in monoliths, but you can vertically scale parts of your system too.
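The difference can be sketched in a few lines of Python. This is just a toy capacity model; the requests-per-second numbers are invented purely for illustration:

```python
# Toy capacity model: each server handles a fixed number of requests/sec.

def horizontal_capacity(servers: int, per_server_rps: int) -> int:
    """Add more units: total capacity grows with the server count."""
    return servers * per_server_rps

def vertical_capacity(per_server_rps: int, upgrade_factor: float) -> int:
    """Upgrade the existing unit: still one server, just a faster one."""
    return int(per_server_rps * upgrade_factor)

# Scaling out from 4 to 5 servers at 100 rps each:
print(horizontal_capacity(5, 100))   # 500
# Scaling up a single 100 rps server to hardware twice as fast:
print(vertical_capacity(100, 2.0))   # 200
```

Horizontal scaling keeps adding terms; vertical scaling multiplies a single term, and eventually you run out of bigger machines to buy.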
More often than not you will be designing a distributed system (multiple services running on separate machines and communicating over a network). When designing a distributed system, you can't have it all; you need to decide on some tradeoffs.
The CAP Theorem states that any distributed system can only guarantee 2 of the 3 following characteristics: Consistency, Availability, and Partition Tolerance.
Consistency means that ANY particular request served from ANY server will return the same data.
Availability means that your servers can take requests 100% of the time.
Partition Tolerance means that your system keeps working even when servers have trouble communicating with each other (a network partition).
I was trying to explain this concept to my partner, and she gave me an example from her work that illustrates CAP Theorem.
Imagine you're working on the same excel spreadsheet (database) with multiple coworkers (servers):
If you all made sure that everything you updated was in sync with your coworkers, even when inter-communication problems occur, you wouldn't be able to do your work 100% of the time, because you'd occasionally have to wait for those problems to resolve. (Consistent, Partition Tolerant, but NOT Available)
If you made sure that all your coworkers could work on the spreadsheet 100% of the time, even when inter-communication problems occur, you could potentially be dealing with inconsistent data. (Available, Partition Tolerant, but NOT Consistent)
If you made sure that your coworkers could work on the spreadsheet 100% of the time and also made sure they were checking in with each other before working on the spreadsheet, you'd be susceptible to inter-communication problems. (Available, Consistent, but NOT Partition Tolerant)
If designing a distributed system, keep the CAP Theorem in mind!
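As a toy sketch of one of these tradeoffs, here is a node that chooses Consistency over Availability: it refuses writes while it can't reach its peers. The class and flag names are invented purely for illustration:

```python
# A minimal CP sketch: a node that rejects writes during a partition,
# sacrificing availability rather than risking divergent replicas.

class CPNode:
    def __init__(self):
        self.data = {}
        self.partitioned = False  # can we reach the other replicas?

    def write(self, key, value):
        if self.partitioned:
            # Choosing C over A: refuse the request instead of letting
            # this replica drift out of sync with its peers.
            raise RuntimeError("unavailable: cannot sync with peers")
        self.data[key] = value

node = CPNode()
node.write("likes", 10)        # fine: peers are reachable
node.partitioned = True
try:
    node.write("likes", 11)    # refused for the duration of the partition
except RuntimeError as e:
    print(e)
```

An AP system would do the opposite: accept the write locally and reconcile the replicas after the partition heals.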
When do you decide whether or not you need a cache? If you have a high volume of requests for common data (like requests to view a celebrity's post), it's a good idea to use one.
Additionally, when it comes to optimization of a cache, here are some considerations.
- Where should I put my cache?
- What should my cache eviction policy be?
- How should I update my cache?
A cache is quick-access storage, and you can either place it closer to the server handling the request, or closer to the database.
Closer to the Server: If the cache is stored in the server, this could significantly improve response times (since we wouldn't need to go all the way back to the database for data). However, in a distributed system or a horizontally scaled service, you'd need to make sure that each server's cache has consistent data.
Closer to the Database: If the cache is located near the database, responses would be slower than with an in-server cache, but you'd get the data consistency you need because the cache is separate from your server(s). This cache could also be scaled independently.
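The basic read path is the same in either placement: check the cache first, and only fall through to the database on a miss. A minimal sketch, using plain dicts as stand-ins for a real cache and database:

```python
# Stand-ins for the real stores; keys and values are invented for illustration.
database = {"celebrity_post": "hello world"}
cache = {}

def get(key):
    if key in cache:            # cache hit: fast path
        return cache[key]
    value = database[key]       # cache miss: slow path to the database
    cache[key] = value          # populate the cache for next time
    return value

get("celebrity_post")   # first call misses and goes to the database
get("celebrity_post")   # second call is served straight from the cache
```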
Since caches are typically small, they might get full quickly, and you might need to push out data according to specific criteria (pulled directly from Grokking the System Design Interview):
- First In First Out (FIFO): The cache evicts the first block accessed first without any regard to how often or how many times it was accessed before.
- Last In First Out (LIFO): The cache evicts the block accessed most recently first without any regard to how often or how many times it was accessed before.
- Least Recently Used (LRU): Discards the least recently used items first.
- Most Recently Used (MRU): Discards, in contrast to LRU, the most recently used items first.
- Least Frequently Used (LFU): Counts how often an item is needed. Those that are used least often are discarded first.
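LRU is the policy you'll most often see implemented. Here's a minimal sketch using `collections.OrderedDict`, which keeps insertion order and lets us treat one end as "least recently used":

```python
from collections import OrderedDict

class LRUCache:
    """Evicts the least recently used entry once capacity is exceeded."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.entries = OrderedDict()

    def get(self, key):
        if key not in self.entries:
            return None
        self.entries.move_to_end(key)         # mark as most recently used
        return self.entries[key]

    def put(self, key, value):
        if key in self.entries:
            self.entries.move_to_end(key)
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used

cache = LRUCache(2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")          # "a" is now the most recently used entry
cache.put("c", 3)       # evicts "b", the least recently used
print(cache.get("b"))   # None
```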
To reiterate, caches allow you to store frequently accessed data from your database in a small, localized storage for performance. That being said, caches need some sort of mechanism to remain in sync with the database (or to mark data that isn't consistent). This is called cache invalidation.
Write-Through Cache: new data is written into the cache and the database at the same time. This allows for consistent data and quick reads, but slow writes.
Write-Around Cache: new data bypasses the cache and goes straight to the database. The cache will be populated by some other means. This allows for quicker writes, but also risks a "cache miss", where the data being looked up is missing from the cache and the request has to go all the way to the database.
Write-Back Cache: new data goes into the cache and will eventually be put into the database. This increases speed but risks losing data if, say... the power goes out.
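A minimal sketch contrasting write-through and write-back, again with plain dicts standing in for the real stores (all names here are invented for illustration):

```python
database = {}
cache = {}
pending = []   # writes buffered by the write-back path

def write_through(key, value):
    # Both stores are updated together: consistent, but the write
    # has to wait on the (slow) database.
    cache[key] = value
    database[key] = value

def write_back(key, value):
    # Only the cache is updated now; the database write happens later.
    cache[key] = value
    pending.append((key, value))

def flush():
    # If the machine dies before this runs, everything in `pending` is lost;
    # that is the risk write-back trades for its fast writes.
    while pending:
        key, value = pending.pop(0)
        database[key] = value
```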
When designing a distributed system, you may want to deploy your system in one region or in many. This is mainly a question of reliability vs. cost.
Why would I want to spread my system across the globe? Reliability. If one region experiences something like a hurricane, your system can "fail over" to another region.
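Failover logic can be as simple as "try the next healthy region". A toy sketch; the region names and `healthy` flags are invented for illustration:

```python
# Each entry stands in for a full deployment of the system in one region.
regions = [
    {"name": "us-east", "healthy": False},  # the region hit by the hurricane
    {"name": "us-west", "healthy": True},
]

def serve_request():
    for region in regions:          # walk regions in order of preference
        if region["healthy"]:
            return f"served from {region['name']}"
    raise RuntimeError("all regions down")

print(serve_request())   # served from us-west
```

Real systems do this with health checks and DNS or load-balancer routing rather than a loop, but the decision is the same shape.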
This was something that came up in a mock-interview of mine, so I thought I'd bring this up here!
When you design your database, you may want to optimize it by partitioning, or breaking it up. This might be because you want to scale, balance database load, or improve availability.
Horizontal Partitioning: If we decide to split our database up by the rows in a table (or by a particular attribute of the table), this is horizontal partitioning. The drawback is that if our criteria for partitioning the rows aren't good, one "shard" or "partition" of the original database could be loaded very heavily while others sit underutilized.
Vertical Partitioning: If we split our database up by its particular models (let's say we have a users table, a tweets table, etc.), this is vertical partitioning. The drawback is that you may need to further partition your shards if demand for the data goes up (because a single database may not be able to handle the load).
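A common horizontal-partitioning scheme is to route each row to a shard by hashing its key. A minimal sketch; the shard count and user data are invented for illustration:

```python
# Each dict stands in for one database shard holding a subset of the rows.
NUM_SHARDS = 4
shards = [dict() for _ in range(NUM_SHARDS)]

def shard_for(user_id: int) -> dict:
    # Hash the row key to pick a shard; a good hash spreads load evenly,
    # which is exactly the "bad criteria -> hot shard" risk mentioned above.
    return shards[hash(user_id) % NUM_SHARDS]

def save_user(user_id: int, profile: dict):
    shard_for(user_id)[user_id] = profile

def load_user(user_id: int) -> dict:
    return shard_for(user_id)[user_id]

save_user(42, {"name": "Ada"})
print(load_user(42))   # {'name': 'Ada'}
```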
Imagine you have a giant database that you need to look through. Sometimes having a sort of table of contents would help us index directly to the right place in memory to grab data. An index serves as a separate table that stores references to the database to increase our read speeds!
Sounds great, why wouldn't we use indexing?
Well, in order to write a new record, you'd also need to update the index. This would increase the latency of our write operations.
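Conceptually, an index is just a lookup table from a field's value to the positions of the matching rows. A toy sketch with in-memory rows (the table and field names are invented for illustration):

```python
# A stand-in "table": a list of rows.
rows = [
    {"id": 1, "author": "ada"},
    {"id": 2, "author": "grace"},
    {"id": 3, "author": "ada"},
]

# Build an index on "author". Keeping this structure up to date is the
# extra work every write has to pay for.
index = {}
for position, row in enumerate(rows):
    index.setdefault(row["author"], []).append(position)

def find_by_author(author):
    # Jump straight to the matching rows instead of scanning the whole table.
    return [rows[p] for p in index.get(author, [])]

print(find_by_author("ada"))   # the rows with ids 1 and 3
```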
Often when you're designing a system of any kind, there might be "weak links" that might take your whole system down if they were compromised. These are called single points of failure. Redundancy fixes single points of failure by having a backup of whatever that thing is, be it a database, a server, or even a load balancer.
The common scheme for implementing this is called Master-Slave replication. Basically, the master gets every update and eventually propagates it to the slave(s).
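A minimal sketch of that scheme, with every write going through the primary node and being forwarded to its replicas (the class names are invented for illustration):

```python
class Node:
    """A bare-bones storage node."""
    def __init__(self):
        self.data = {}

class Primary(Node):
    """Takes every write and forwards it to each replica."""
    def __init__(self, replicas):
        super().__init__()
        self.replicas = replicas

    def write(self, key, value):
        self.data[key] = value
        for replica in self.replicas:   # propagate the update
            replica.data[key] = value

replica = Node()
primary = Primary([replica])
primary.write("likes", 10)
print(replica.data["likes"])   # 10: the replica can serve reads, and it
                               # acts as the backup if the primary fails
```

Real replication is asynchronous and log-based rather than a synchronous loop, but this is the shape of the idea: no single node is a single point of failure for the data.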
System Design goes way beyond the basics that I went over during these blog posts. Honestly, I don't really know much more than you!
But if you are looking to really accelerate your System Design education from this point, here's what I'd recommend:
- Watch some sample System Design interview questions like this one or this one or this one!
- Go through the entire video and note down the terms you don't understand
- Google the terms, and try to understand them with respect to what you already know
- Start thinking about how you'd build some of these services! Draw it out!
- Continue with the deep dive on System Design concepts and if possible, incorporate them in your own projects!
Good Luck with your learning! Stay healthy!