Notes on Chapter 1 of Web Scalability for Startup Engineers (Image by Herbert Aust from Pixabay)
Last week, I wanted to learn more about system design and scalability. So, I started by watching YouTube videos and reading LinkedIn posts to get a general idea. I also looked into book reviews, and it seems like “Designing Data-Intensive Applications” is a popular choice among many people.
I wanted to begin with something simpler since “Designing Data-Intensive Applications” covers a lot of detailed topics. After talking to a friend and hearing recommendations from others in our field, I found out about “Web Scalability for Startup Engineers.” This week, I started with the first chapter, which gave an overview of the topics the book will cover.
Notes:
What is Scalability:
The ability of our system to handle more data, requests, users, and transactions; we must be able to scale up and down in a cheap and quick way.
Scalability Dimensions:
- Handling more data: storing more content; with the popularity of data analytics and big data, this plays an important role.
- Handling higher concurrency levels: how many users our system can serve at the same time (open connections, active threads, messages being processed simultaneously); we need to work around servers having few processing units and ensure parallel execution of code while keeping data consistent.
- Handling higher interaction rates: how often clients exchange messages with the server; we must be able to respond quicker, which requires faster reads and writes and higher concurrency.
Scalability vs Performance:
Scalability is related to performance: scalability determines the capacity to handle more users, while performance refers to how swiftly the system handles requests under load, such as the speed at which it can respond to 100 user requests every 5 seconds.
Scalability also applies to team members: the more people on a team, the harder communication becomes.
DNS:
Is usually hosted on a different server; the customer connects to their DNS server, gets the IP address of the domain, and then starts requesting content from our server.
VPS:
Is a virtual machine for rent, hosted alongside other virtual machines on one physical machine; this approach is not a good fit for scaling.
Single Server configuration:
this option is better, but we might need to switch if:
- Our user base grows, and serving our users takes more CPU and I/O.
- The database grows because we added a lot of data; queries take more time to execute, so we need more CPU and I/O power.
- We added new features that make users interact more with the system, which will need more resources.
How Can We Scale Vertically:
1- Adding more I/O capacity by adding more hard drives in RAID arrays:
RAID arrays: a set of hard drives or solid-state drives linked together to form one logical storage unit, to protect data in case of failures. Here are some popular RAID configurations:
- RAID 0 (no drive can fail): the data is split (striped) evenly between two drives.
- RAID 1 (1 drive can fail): mirroring; at least two drives hold an exact copy of the data, so if one disk fails the others keep working.
- RAID 5 (1 drive can fail): striping with parity; requires at least 3 drives, splitting the data across multiple drives, but also distributing parity across the drives.
- RAID 6 (2 drives can fail): striping with double parity; similar to RAID 5, but a second parity block is written, so two drives can fail without data loss.
- RAID 10 (one drive in each mirrored pair can fail): combines RAID 1 and RAID 0; mirrors all data to secondary drives and uses striping across each set of drives to speed up data transfers.
Parity: a value calculated from the other drives in the array and stored on a drive, used to reconstruct the data in case one of the drives fails.
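The parity idea above can be sketched with XOR, which is how RAID 5 parity works at its core: the parity block is the XOR of the data blocks, so a lost block can be rebuilt from the survivors. The byte values below are made-up example data.

```python
# Minimal sketch of RAID-style parity using XOR on two data "drives".
# The parity block is the XOR of the data blocks; if one drive is lost,
# XOR-ing the surviving drive with the parity block reconstructs it.

def xor_blocks(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

drive1 = b"\x01\x02\x03\x04"
drive2 = b"\x10\x20\x30\x40"
parity = xor_blocks(drive1, drive2)   # stored on a third drive

# Simulate losing drive2: rebuild it from drive1 and the parity block.
recovered = xor_blocks(drive1, parity)
assert recovered == drive2
```

Real RAID 5 distributes parity blocks across all drives rather than dedicating one drive to parity, but the reconstruction math is the same.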
2- Improve I/O access time by switching to solid-state drives (SSDs):
SSDs are faster than HDDs, but the gain for databases is smaller because a lot of databases, like MySQL, are optimized for sequential reads, and databases like Cassandra go further and use mostly sequential I/O.
Sequential disk operations are like reading a book page by page, while random disk I/O is like picking a random page each time; SSDs are much faster than HDDs for random access, and also for sequential access, because they have no physical head.
3- Reducing I/O by increasing RAM:
More caching space and more memory for the app to work in; this is especially good for databases, which cache frequently accessed data in RAM.
4- Improve network throughput:
By upgrading network interfaces and network adapters, or switching to a better provider.
5- Switching to a more powerful server:
A server with more virtual cores, e.g. 12 or 24 threads (virtual cores); processes do not have to share CPUs, and the CPU performs fewer context switches.
Vertical scalability is a simple approach because we don't have to rearchitect anything; we just upgrade our hardware, but this comes at a cost.
The OS may also prevent us from scaling vertically, and in some databases adding more CPUs won't bring any improvement because of increasing lock contention.
Locks: used to synchronize access between threads to specific resources like memory or files. Lock contention happens when a single lock guards a large resource with many operations on it. To solve this, fine-grained locks must be introduced, which create more specific locks for each part of the resource and allow threads to access it more efficiently. Therefore, adding more CPUs when lock contention is happening does not have any significant impact.
We should design the app with high concurrency in mind, so that when we add more cores it is not going to be a waste.
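To make the fine-grained-locks point concrete, here is a small sketch (the `Bank`/account example is mine, not from the book): instead of one global lock for all balances, each account gets its own lock, so threads touching different accounts never wait on each other.

```python
import threading

# Fine-grained locking: one lock per account instead of one lock for the
# whole bank. Threads depositing into different accounts do not contend.
class Bank:
    def __init__(self, accounts):
        self.balances = dict(accounts)
        self.locks = {name: threading.Lock() for name in accounts}

    def deposit(self, name, amount):
        with self.locks[name]:          # only this account is locked
            self.balances[name] += amount

bank = Bank({"alice": 100, "bob": 100})
threads = [threading.Thread(target=bank.deposit, args=("alice", 1)) for _ in range(100)]
threads += [threading.Thread(target=bank.deposit, args=("bob", 1)) for _ in range(100)]
for t in threads:
    t.start()
for t in threads:
    t.join()
assert bank.balances == {"alice": 200, "bob": 200}
```

With a single coarse lock, all 200 deposits would serialize; with per-account locks, deposits to `alice` and `bob` can proceed in parallel, which is why extra cores only help once contention is reduced.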
Isolation Of Services:
Is the separation of services (DB, FTP, DNS, cache, web server) onto multiple physical servers; once every service has its own server, we have no more room to grow this way.
Dividing a system based on functionality in order to scale is called functional partitioning, for example splitting the admin service and the client service onto separate physical servers.
CDN: content delivery network, a hosted service that takes care of the global distribution of static content like JS, CSS, images, and videos. It works as an HTTP proxy: if a client needs to download static content, the CDN checks whether it already has it; if not, it requests it from our server and caches it, and subsequent clients are served from the CDN without even contacting our server.
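The CDN behavior described above is essentially a cache-aside HTTP proxy. A toy sketch (where `origin_fetch` stands in for a real HTTP request to the origin server):

```python
# Toy model of a CDN edge node: serve from cache when possible, otherwise
# fetch from the origin once and remember the result for later clients.
class EdgeCache:
    def __init__(self, origin_fetch):
        self.origin_fetch = origin_fetch
        self.store = {}
        self.origin_hits = 0

    def get(self, url):
        if url not in self.store:                  # cache miss: go to origin
            self.store[url] = self.origin_fetch(url)
            self.origin_hits += 1
        return self.store[url]                     # later clients served here

cdn = EdgeCache(lambda url: f"<contents of {url}>")
cdn.get("/static/app.js")   # first client: origin is contacted
cdn.get("/static/app.js")   # second client: served from cache
assert cdn.origin_hits == 1
```

A real CDN additionally honors HTTP cache headers (`Cache-Control`, `Expires`) to decide what it may cache and for how long.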
Horizontal Scalability
It has to be considered before the application is built. Systems that are truly horizontally scalable do not need a powerful machine; they usually run on multiple cheap machines, and you can add as many servers as you want. You avoid the high price of top-tier hardware, and the vertical scalability ceiling problem (at some point there is no more powerful hardware to buy).
Scaling with multiple data centers is important when we have a global audience, as it provides protection against rare outage events, and clients in other countries can get a response faster.
Scaling horizontally with web servers and caches is easier than scaling persistence stores and databases.
Round-robin DNS is the usual choice when we use multiple web servers, as it distributes traffic between them. What round-robin DNS does is map one domain name to multiple IP addresses; when a client sends a request, the DNS hands it one of the servers' addresses, so two clients might connect to different servers without even realizing it.
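The round-robin idea can be sketched as rotating through the IP addresses mapped to a name (the domain and IPs below are placeholders, and real DNS servers typically rotate the order of the whole record set rather than returning a single address):

```python
from itertools import cycle

# Conceptual round-robin DNS: one name maps to several server IPs, and
# successive lookups hand out addresses in rotation.
records = {"example.com": ["10.0.0.1", "10.0.0.2", "10.0.0.3"]}
rotation = {name: cycle(ips) for name, ips in records.items()}

def resolve(name: str) -> str:
    return next(rotation[name])

first = resolve("example.com")
second = resolve("example.com")
assert first != second   # two clients can land on different servers
```

This is why round-robin DNS gives only coarse load distribution: it cannot see server health or current load, which is what dedicated load balancers add.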
A data center infrastructure:
- Frontline: the first part that user devices interact with; it does not contain any business logic and can exist inside or outside of our data center.
- First, the client sends a request; geoDNS resolves the domain name and returns the IP address of the closest load balancer; the request is then distributed to a frontend cache server or directly to a frontend app server.
- CDNs, load balancers, and reverse proxies can be used and hosted by third parties.
Load balancer: hardware or software that allows servers to be added and removed dynamically; it also helps distribute traffic across multiple servers.
Reverse proxy: an intermediary between client requests and the actual server. It can be used as a load balancer, or for web acceleration, where it compresses inbound and outbound data, caches content, and handles SSL encryption, which boosts the main server's performance (the proxy takes care of these side tasks for it). It can also preserve anonymity and hide what the real internal structure looks like, because clients access our data center through a single record locator or URL.
Edge cache server: an HTTP cache server located near customers. It can cache an entire HTTP response: it can cache a whole page, partially cache it and delegate the other parts to the server, or decide that the page is not cacheable and delegate the entire request to the server.
A single data center can scale using edge caches and a CDN. It is not necessary to use a lot of components and technologies to scale; instead we should use only what is necessary.
The application architecture should not revolve around technologies (programming languages, databases); it should focus on the domain model to create a mental picture of the problem that we are trying to solve.
The frontend must be kept as dumb as possible, while being allowed to use message queues and the cache backend; caching the HTML page along with the database query is more efficient than just caching the database query. Web services are a critical part of the application, as they contain the most important parts of our business logic.
Servers might host job-processing workers or jobs running on a schedule, with the goal of handling notifications, order fulfillment, or other high-latency tasks.
SOA: service-oriented architecture, focused on solving business needs, where each service has a very clear contract and uses the same communication protocols.
SOA has some alternatives; layered architecture, hexagonal architecture, event-driven architecture.
A multi-layer architecture is a way to represent functionality in the form of different layers; components in a lower layer expose functionality to the upper layer, and lower layers can never depend on the functionality of a layer above them.
In a layered architecture, the lower a layer sits, the more stable its API needs to be, since changes to lower layers' APIs are costly: many other components depend on them.
Hexagonal architecture assumes that the business logic of the app is the center of the app; there is a contract between the business logic and the non-business-logic components, but no layers. The main benefit of this is that we can replace any non-business-logic component at any time without affecting our core app.
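The "contract" between the core and replaceable components is usually expressed as ports (interfaces) and adapters. A minimal sketch, where all the names (`OrderStore`, `PlaceOrder`, `InMemoryStore`) are illustrative, not from the book:

```python
from abc import ABC, abstractmethod

# Port: the contract, owned by the business logic at the center.
class OrderStore(ABC):
    @abstractmethod
    def save(self, order: dict) -> None: ...

# Business logic: depends only on the port, never on a concrete technology.
class PlaceOrder:
    def __init__(self, store: OrderStore):
        self.store = store

    def execute(self, order: dict) -> None:
        self.store.save(order)

# Adapter: a replaceable plugin implementing the port (could be swapped
# for a SQL-backed or queue-backed adapter without touching PlaceOrder).
class InMemoryStore(OrderStore):
    def __init__(self):
        self.orders = []

    def save(self, order: dict) -> None:
        self.orders.append(order)

store = InMemoryStore()
PlaceOrder(store).execute({"id": 1})
assert store.orders == [{"id": 1}]
```

Because `PlaceOrder` only knows the `OrderStore` contract, swapping the storage technology means writing a new adapter, with no change to the core.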
Event-Driven Architecture (EDA) shifts the focus from responding to requests to handling actions. It works by creating event handlers that wait for an action to occur and then react to it.
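The register-and-react pattern can be sketched as a tiny in-process event bus (the `order_placed` event and handler names are made up for illustration):

```python
from collections import defaultdict

# Minimal event bus: handlers register for an event name and are invoked
# whenever that event is emitted, instead of being called request/response.
handlers = defaultdict(list)

def on(event):
    def register(fn):
        handlers[event].append(fn)
        return fn
    return register

def emit(event, payload):
    for fn in handlers[event]:
        fn(payload)

log = []

@on("order_placed")
def send_confirmation(order):
    log.append(f"confirm order {order['id']}")

emit("order_placed", {"id": 42})
assert log == ["confirm order 42"]
```

In a distributed system the same shape appears with a message broker in the middle: producers emit events onto a queue or topic, and consumers react asynchronously.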
In all of these architectures, dividing the app into smaller units that function independently brings performance benefits. We can think of these web services as autonomous applications, where each one becomes a separate app that hides its implementation details and presents a high-level API.
Message queue, app cache, main datastore, etc.: we should think of them as plugins that we can replace at any time with another technology.
Isolating third-party services is good for us, as we don't know whether they are scalable or whether we have full control over them; isolating them and making it possible to replace them later is beneficial.