Hi, I'm Valerio software engineer, CTO and founder at Inspector. In this article I’ll talk about what I learned trying to increase the operational limits of the Inspector SQL database.
Before talking about read-replicas or sharded-data, it might be helpful to introduce the problem to solve and the most common strategies to improve the ability of the database to process an increasing number of queries per second, opnenig up new growth margins for the application.
Why so many performance issues are caused by the database?
We often forget that each request or process performed by the application is not “atomic”. When we write a piece of code we should keep in mind that that piece of code will be executed by 10 other requests at the same time. If it's slow, it’s likely to affect other requests running in parallel, and at the end the whole system.
Database is a shared resource used by all processes and all servers behind your application. So, be careful with thinking "it's okay if this piece of code isn't optimized". Even just one poorly designed access against the database can hurt the performance of the whole system.
What problem does scaling the database solve
Before proceed we should clarify a difference between vertical vs horizontal scaling because both are useful but suited to solve different problems.
If you deal with large datasets, import/export processes, data aggregation algorithms, and similar problems you probably can get more benefits scaling your database vertically.
Resource-intensive tasks usually don’t have great benefits from multiple instances. When a long running query is executed against an instance, that instance must have enough resources to execute the query without hurt the performance of other tasks.
For this type of database operations you should correctly size the RAM and CPU of the machines and take care of the performance of the SQL tasks you write.
You can learn more on how to tune your SQL queries to be able to run very heavy queries with better performance on this article: How to accelerate application performance with smart SQL queries.
Horizontal scaling instead aims to solves the opposite problem: allow the database to performs a big big number of small queries per second.
It is simply a matter of finding a way to run the database on multiple instances. Just like a load balancer that distribute the traffic between application servers. But, when you take a closer look to new database architectures, and see what’s actually working and what’s not, the fundamental problems with relational databases start to become more clear.
Achieving scalability is a huge challenge for relational databases
Relational databases are originally designed to run on a single server in order to maintain the integrity of the table mappings and avoid the typical problems of distributed computing.
To handle these concerns, relational database vendors have come out with a whole assortment of improvements, using more complex architectures like “master-slave” or “sharded-data”.
Below we discuss about "read-replicas" that is the Inspector implementation and it is the most used strategy to horizontally scale a relational database.
A read replica is a copy of the primary instance that reflects changes to the primary in almost real time. You can basically use a read replica to do the following:
- Offload read requests from the primary instance;
- Perform a regional migration or fail over to another instance for disaster recovery purposes.
- We use the read replica to distribute the load against two instances instead of one.
While it looks cool, this setup still has a weakness. The second instance is readonly.
If we can write on the second instance, she too would have to synchronize the data back to the first one (the master), but two-way synchronization is not allowed in relational datanases to avoid data corruption.
We can offload only the “read” queries from the primary instance, so it is the right solution if your application is “read” intensive. If your application is write intensive this architecture couldn’t be enough.
- It is very fast as doesn't impose any restrictions on read performance.
- You can split read and write requests to different servers. For example, all analytics queries can be made on Slave nodes.
- It is not very reliable because of asynchronous replication. It means that some committed on master transactions may be not available on slave if the master fails.
- Write requests can hardly be scaled. The only option to scale write requests is to increase compute capacity (RAM and CPU) of the Master node.
Master-Master replication can enable two way synchronization allowing us to write on both instance, but consider that even the biggest cloud computing providers doesn’t provide this option in their managed SQL database offer. It is probably a tunnel in which it is better not to enter.
We are a small team, so in order to keep our infrastructure management flexible and cost friendly we migrated our MySQL instance to Google Cloud SQL that offer the possibility to create/delete read replicas with a few clicks. We can focus our effort on the application development instead of SQL server issues.
Inspector is backed by Google Startup Program that allow us to experiment a lot with its cloud platform to find the right solutions for our needs. Anyway all the most known cloud providers offer their managed SQL database with the ability to easily add read-replicas to your master instance like Amazon RDS or DigitalOcean managed databases.
It's all about to better understand the architecture and how your application can get more advantages as possible of it.
One of our code design constraints is to avoid database write operations as much as possible to stay away from the most important limit to scalablity of a SQL database.
Hope this article was helpful to bring new ideas to imporve your application.
If you want learn more about how to scale up your application, join in Scalable Applications the international community of professional developers to share strategies and experiences building scalable systems.