Anyone who has worked with large clusters knows that data never stops growing. Sooner or later, developers of distributed systems face the challenge of scaling. Nowadays, finding space to store the data is hardly a problem. But how do you deal with application changes and configuration? You can avoid most adjustments if the system is designed for scalability from the start: you differentiate application nodes by feature and deploy only the ones you need.
Hello, my name is Igor, and I am a part of the Tarantool DB team. We have extensive experience in developing high-performance products, for example, data storage systems for major retailers and mobile network operators. Today, I will talk about our approach to cluster scaling and provide a typical example. This article is targeted at anyone who deals with big data and considers scaling.
It is an extended version of the talk I delivered at Saint HighLoad++ 2021.
Challenge
When developing data-intensive applications, we pursue three objectives:
Scalability. Data and business logic keep growing, and the cluster must be able to grow with them.
Reliability. We need redundant nodes to access the data. We also need a mechanism to keep the cluster always writable, even if some nodes fail.
Supportability. We need tools for management and monitoring. We also want to be able to expand the cluster without changing the application code.
Moreover, we want to use tools that help standardize cluster application development. Then we wouldn't have to write code from scratch every time, and we could build new applications into existing pipelines and share expertise between projects.
Cluster architecture
Any cluster starts with its clients — applications or systems that will interact with the cluster, write data, and read it.
When client requests first get to the cluster, each one of them goes through the router. This node redirects incoming requests to other nodes. It is the router that connects clients with their data.
Next, the router sends client requests to storages, which fall into another category of cluster components. As the name implies, storages store data, write it and send it back upon the router's request.
Each cluster node performs its role, i.e., it follows a certain logic. When scaling a cluster, we can add a new application node and assign one of the available roles to it. For example, we can add a node with the "storage" role if we need more storage space. Other storages will take care of redistributing data to the new node, and the router will start sending requests there.
The same goes for routers. If we need to increase the number of nodes accepting client requests, we have to add a new node to the cluster with the "router" role and make sure that requests pass through it. Here all routers are equivalent, and they send requests to all storages.
The next important cluster component is the failover coordinator. This cluster node keeps other nodes writable. It also writes the cluster state to an external storage, which is called the state provider. It must be independent of the rest of the cluster and meet reliability requirements.
We can add nodes with new roles to expand the cluster. For example, we need to replicate data from an external storage for cluster operation. In this case, we need to add a node that executes this logic. It will connect to the external storage and send data to the routers to distribute it across the cluster.
We can use Cartridge to create a cluster.
Cartridge
Cartridge is a framework for developing cluster applications. Cartridge manages the cluster, controls sharding, allows implementing business logic and data schemas using roles, and provides failover to keep application instances writable. Cartridge has many additional tools that you can build into CI/CD pipelines and use to manage the running cluster. For Cartridge, there is also an Ansible role, an integration testing module, a utility for creating an application from a template and running a cluster locally, a module for collecting metrics, and a Grafana panel.
Let's now look inside Cartridge.
Tarantool
Cartridge is a framework powered by several instances of Tarantool, an in-memory computing platform. Tarantool is both an in-memory database and an application server written in Lua. It is very fast because it stores data in RAM, yet reliable: it saves data snapshots to the hard disk and lets you set up replication and sharding.
Replication
Tarantool instances can be combined into a replica set — a set of nodes connected by asynchronous replication. When configuring a replica set, you can specify which instance will be read-only. We will call the instances that accept writes "masters", and the rest — "replicas".
Here's an example of master replication configuration:
box.cfg{
    listen = 3301,
    read_only = false,
    replication = {
        '127.0.0.1:3301',
        '127.0.0.1:3302',
    },
}
And here's an example of a replica configuration:
box.cfg{
    listen = 3302,
    read_only = true,
    replication = {
        '127.0.0.1:3301',
        '127.0.0.1:3302',
    },
}
Sharding
Tarantool has vshard, a special module for data sharding. It allows dividing data into chunks stored on different application nodes.
There are two entities in vshard: vshard.storage and vshard.router. Storages contain buckets with data. They also rebalance the data if new instances are added. Routers send requests to storage nodes.
In vshard, you need to configure routers and storages on each node. When expanding a vshard-based cluster, you have to either modify the application code or write a vshard wrapper, which will configure instances when new nodes are added.
Here's an example of vshard configuration on Tarantool instances:
local sharding = {
    ['aaaaaaaa-0000-4000-a000-000000000000'] = {
        replicas = {
            ['aaaaaaaa-0000-4000-a000-000000000011'] = {
                name = 'storage',
                master = true,
                uri = 'sharding:pass@127.0.0.1:30011',
            },
        },
    },
}

vshard.storage.cfg(
    {
        bucket_count = 3000,
        sharding = sharding,
        ...
    },
    'aaaaaaaa-0000-4000-a000-000000000011'
)

vshard.router.cfg({sharding = sharding})
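Once configured, the router maps a key to a bucket and calls a function on the storage node that owns this bucket. Here is a minimal sketch; 'get_customer' is a hypothetical function that would have to be defined on the storage instances.
-- A sketch of routing a read request to the storage owning the bucket.
-- 'get_customer' is a hypothetical storage-side function.
local vshard = require('vshard')

local customer_id = 42
local bucket_id = vshard.router.bucket_id_strcrc32(customer_id)
local customer, err = vshard.router.callro(bucket_id, 'get_customer', {customer_id})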
Summary
Cartridge is powered by Tarantool functionality. To build a cluster, we take several Tarantool instances, configure replication, add sharding, and automate adding new nodes. In addition, Cartridge provides a few more features that are not available in basic Tarantool:
Failover.
Automatic sharding configuration when new nodes are added.
WebUI for cluster monitoring.
Cartridge instance roles
I have already shown that each instance in a cluster performs a certain role. Cartridge roles are Lua modules with a specific API that describe the application business logic.
What can roles be used for?
A storage for data tables.
Client HTTP API.
External data replication.
Metrics or call chain collection, instance performance monitoring.
Execution of any custom logic written in Lua.
A role is assigned to a replica set and enabled on each replica set instance. A role always knows if it is enabled on the master or the replica. You can control its behavior by changing the configuration without touching the role's code.
API
return {
    role_name = 'custom-role',
    init = init,
    validate_config = validate_config,
    apply_config = apply_config,
    stop = stop,
    dependencies = {
        'another-role',
    },
}
Each role has a standard API:
The init function performs the initial role configuration.
validate_config checks if the configuration is valid.
apply_config applies the configuration.
The stop function disables the role.
A role also has a list of dependencies and will only start after all the dependencies in this list are initialized.
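As an illustration, a minimal role implementing this API might look roughly like the sketch below. The 'custom-role' configuration section and its rate_limit field are made up for the example; the opts.is_master flag is what lets the role behave differently on masters and replicas.
-- A minimal sketch of a role implementing the standard API.
-- The 'custom-role' config section and the rate_limit field are illustrative.
local log = require('log')

local function init(opts)
    -- opts.is_master tells whether this instance is the replica set master
    log.info('custom-role initialized (is_master = %s)', tostring(opts.is_master))
    return true
end

local function validate_config(conf_new, conf_old)
    local section = conf_new['custom-role'] or {}
    assert(type(section) == 'table', "'custom-role' section must be a table")
    return true
end

local function apply_config(conf, opts)
    local section = conf['custom-role'] or {}
    -- React to configuration changes here, e.g. tune the hypothetical rate_limit
    log.info('custom-role config applied, rate_limit = %s', tostring(section.rate_limit))
    return true
end

local function stop()
    return true
end

return {
    role_name = 'custom-role',
    init = init,
    validate_config = validate_config,
    apply_config = apply_config,
    stop = stop,
    dependencies = {'another-role'},
}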
Predefined roles
Cartridge has several predefined roles: some are available out of the box, others are installed with different modules.
Built-in roles:
vshard-router sends requests to data storage nodes.
vshard-storage saves and manages data buckets.
failover-coordinator controls cluster failover.
Other useful roles:
metrics collects application metrics.
crud-storage and crud-router simplify interaction with sharded spaces.
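With crud-router enabled on the routers, working with a sharded space becomes a couple of calls. A hedged sketch, assuming a hypothetical 'customers' space that has a bucket_id field:
-- A sketch on a router with the crud-router role enabled.
-- 'customers' is a hypothetical sharded space; crud fills in bucket_id itself.
local crud = require('crud')

local result, err = crud.insert_object('customers', {id = 1, name = 'Alice'})
local row, err2 = crud.get('customers', 1)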
Role dependencies
Role dependencies form the essential mechanism underlying all Cartridge applications. A role will only start when the roles it depends on are successfully initialized. This means that each role can use functions of other roles, as long as they are specified in its dependencies list.
Let's look at an example. Suppose we have written our own custom-storage role, which describes the creation of data tables. It depends on the crud-storage role so that it can use simplified interfaces for receiving data and writing it to our storage. The crud-storage role in turn depends on the vshard-storage role, which takes care of data sharding. It looks like this in the code:
app/roles/custom-storage.lua
return {
    role_name = 'custom-storage',
    dependencies = {
        'crud-storage',
    },
}
cartridge/roles/crud-storage.lua
return {
    role_name = 'crud-storage',
    dependencies = {
        'vshard-storage',
    },
}
Cartridge guarantees the role startup order, so the custom-storage role can use all the features of vshard-storage and crud-storage because they are initialized before it.
Entry point
The entry point of each Cartridge cluster instance is the init.lua file. To configure a cluster instance, the init file calls the cartridge.cfg function, which expects the list of available roles and default Cartridge and Tarantool settings. The init file can be run from the console with Tarantool, as well as with cartridge-cli or systemd/supervisord.
tarantool ./init.lua
local ok, err = cartridge.cfg({
    roles = {
        'cartridge.roles.vshard-storage',
        'cartridge.roles.vshard-router',
        'cartridge.roles.metrics',
        'app.roles.custom',
    },
    ... -- cartridge opts
}, {
    ... -- tarantool opts
})
Cluster scaling with roles
Let us consider a typical scaling scenario. Suppose we have a two-instance cluster: one has the "router" role, and the other has the "storage" role.
We add new replicas when we need extra redundancy: for example, when a new backup data center appears. Or when we want to increase the read capacity of replica sets: for example, if most read requests are sent to replicas.
We add new shards when memory usage on the instances reaches about 60-80%. We can do it earlier if memory usage graphs show steady growth and we predict that space will run out soon.
We add routers according to business metrics or when we see a high CPU load of about 80-100%. For example, a growing number of cluster users increases the load on the routers.
We add new roles when the application business logic changes or extends, or we need additional monitoring. For example, we can add the metrics or the tracing module to the cluster.
Note that you can add new instances to the cluster and enable available roles on existing replica sets — without changing the code. But to add new roles that didn't previously exist in the cluster, you need to add them to init.lua and restart the application.
Also, keep in mind that you can assign multiple roles to any cluster instance. Any instance can be a router, a storage, and a replica, as well as perform any custom logic at the same time.
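For scripted expansion, Cartridge also exposes a Lua admin API. The sketch below assumes the cartridge.admin_edit_topology call with a replicasets argument, as used under the hood by the WebUI; the alias, roles, and URI are placeholders, and exact parameter names may differ between versions.
-- A sketch: joining a new storage replica set programmatically.
-- The alias, roles, and URI below are placeholders.
local cartridge = require('cartridge')

local ok, err = cartridge.admin_edit_topology({
    replicasets = {
        {
            alias = 'storage-2',
            roles = {'vshard-storage'},
            join_servers = {{uri = 'localhost:3304'}},
        },
    },
})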
Cluster configuration
The cluster configuration is the main way to manage a cluster. It stores the topology, vshard settings transferred to vshard.storage.cfg and vshard.router.cfg, and custom role settings. Each application instance stores the complete configuration. Cartridge ensures that all nodes have identical configurations at any given time.
Let's take a look at how a new configuration is propagated. Suppose the configuration is changed on one of the nodes; after that, it has to propagate to all other instances.
Each instance receives the configuration changes and calls its validate_config function to check whether the received configuration is valid. If at least one instance fails to validate the changes, all other instances discard them, and the user who applied the new configuration gets an error.
If the configuration is valid, Cartridge waits for every cluster instance to confirm that the validate_config check was successful and the configuration can be applied.
Then Cartridge calls apply_config on each application instance to apply the new configuration, which replaces the old configuration on disk.
If anything goes wrong while applying the configuration and apply_config returns an error, Cartridge displays the corresponding information in the WebUI and suggests reapplying the configuration.
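This same validate-then-apply cycle is triggered when you change the configuration programmatically. One way, sketched here with the illustrative 'custom-role' section from earlier, is to patch the clusterwide configuration from any instance with cartridge.config_patch_clusterwide:
-- A sketch: patching a custom section of the clusterwide configuration.
-- Every instance runs validate_config on the patched config before apply_config.
-- The 'custom-role' section and the rate_limit field are illustrative.
local cartridge = require('cartridge')

local ok, err = cartridge.config_patch_clusterwide({
    ['custom-role'] = {rate_limit = 100},
})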
Failover
Replication guarantees that our data doesn't get lost, but we also need to keep the applications writable. For this, Cartridge provides failover. Failover works in a cluster that has instances with the failover-coordinator role, which manages master switching. The cluster also needs a state provider instance, which stores the list of current masters. Two state provider modes are currently supported: etcd and a standalone Tarantool instance called stateboard.
The failover-coordinator ensures that at least one master is available in the replica set.
If the master crashes or stops responding to requests, the failover-coordinator finds out about it and appoints one of the replicas in the same replica set as the master.
In this case, information about the current master is saved in the external storage, so even if the old master comes back up, it will no longer accept write requests. The new master stays in place until it crashes or becomes unavailable itself.
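Stateful failover can also be enabled programmatically. Here is a hedged sketch using the Lua API with a stateboard as the state provider; the URI and password are placeholders, and parameter names follow the cartridge.failover_set_params API.
-- A sketch: enabling stateful failover with a stateboard state provider.
-- The URI and password are placeholders.
local cartridge = require('cartridge')

local ok, err = cartridge.failover_set_params({
    mode = 'stateful',
    state_provider = 'tarantool',
    tarantool_params = {
        uri = 'localhost:4401',
        password = 'secret',
    },
})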
Tools
How do you support Cartridge? We have a large ecosystem of tools:
cartridge-cli serves several purposes: creating an application from a template, running a cluster locally, connecting to instances, and calling administration functions.
For testing, there is cartridge.test-helpers, which allows you to create any type of cluster in integration tests.
The WebUI is used for monitoring: it reports issues in the cluster and suggests how to fix them. There is also a metrics package and a Grafana dashboard.
ansible-cartridge helps with deployment: it allows you to add new instances to the cluster, update existing ones, and much more.
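As an example of the testing side, an integration test with cartridge.test-helpers and luatest might look roughly like the sketch below. Treat it as an assumption-heavy outline: the datadir, entry point, aliases, and the exact shape of the replicasets option may differ between versions.
-- A rough sketch of an integration test with luatest and cartridge.test-helpers.
-- Paths, aliases, and UUID helpers are placeholders; check your version's docs.
local t = require('luatest')
local helpers = require('cartridge.test-helpers')

local g = t.group('integration')

g.before_all(function()
    g.cluster = helpers.Cluster:new({
        datadir = '/tmp/cartridge-test',
        server_command = './init.lua',
        use_vshard = true,
        replicasets = {
            {
                alias = 'router',
                uuid = helpers.uuid('a'),
                roles = {'vshard-router'},
                servers = {{instance_uuid = helpers.uuid('a', 1)}},
            },
            {
                alias = 'storage',
                uuid = helpers.uuid('b'),
                roles = {'vshard-storage'},
                servers = {{instance_uuid = helpers.uuid('b', 1)}},
            },
        },
    })
    g.cluster:start()
end)

g.after_all(function()
    g.cluster:stop()
end)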
Cartridge in production
Cartridge has proven itself in production. Our customers have been using it for at least three years. Here are some statistics on Cartridge usage:
More than 100 data marts in production: caches and master storages.
From a few MB to several TB of data.
From 4 to 500 instances per cluster.
Masters and replicas in different data centers.
A single CI/CD pipeline for all applications in a closed network.
Common modules and roles in projects.
Conclusion
We discussed our approach to cluster scaling and the following topics:
Cartridge roles.
Distributed application configuration.
Fault tolerance provided by failover.
Cartridge usage statistics in production.
I hope you found this article useful, and you will try scaling with Cartridge. Please share your experience of scaling other databases in the comments.
What's next?
Try Cartridge in the sandbox.
Check out the documentation on the official website.
Ask your questions to the community in the Telegram chat.