What is the difference between sharding over monolithic databases, like the CitusDB extension for PostgreSQL, and Distributed SQL? Both are distributing the table rows and index entries but distributing SQL means more than that: distributing SQL transactions and keeping their ACID properties.
- Sharding monolithic databases will use local transactions, with a coordinator above it that uses two-phase commit to commit on all shards and detect if there are in-doubt transactions
- Distributed SQL databases run global transactions, so that applications can see it as one logical database, to provide horizontal scalability to OLTP applications.
The most obvious difference is with ACID properties. Transactions must be atomic (A), moving the database from one state to the other, and isolated (I), so that other users see either the state before or the state after.
A typical example is a bank account transfer. If you transfer 100$ from Account A to Account B, the database transaction doing the transfer updates the two accounts one after the other to add 100$ to B an withdraw 100$ to B. The total amount of all accounts may temporarily be off by 100$ when queried by the transaction doing the transfer, but all other transactions should never see this intermediate state.
pgbench account transfer
To give an example, I'll use pgbench
on YugabyteDB. I simply initialize the default table, which includes the pgbench_accounts
table:
pgbench -i
This creates 100000 accounts with a balance of zero:
select count(*),min(abalance),max(abalance),sum(abalance)
from pgbench_accounts;
I create a custom script to do a simple transfer between two random accounts:
cat > /tmp/pgbench-acid.sql <<'SQL'
\set aid1 random(1, 100000 * :scale)
\set aid2 random(1, 100000 * :scale)
\set delta random(-5000, 5000)
BEGIN;
UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid1;
UPDATE pgbench_accounts SET abalance = abalance - :delta WHERE aid = :aid2;
END;
SQL
YugabyteDB Distributed SQL
I run this in the background for 5 minutes from 10 connections:
pgbench -f /tmp/pgbench-acid.sql -T 300 -c 10 &
Now from psql
I query the total account balance every 100 milliseconds:
select count(*),min(abalance),max(abalance),sum(abalance)
from pgbench_accounts;
\watch 0.1
Thanks to all transactions being ACID, the total amount never changes because all transfers are from one account to another:
Those are distributed transactions, with reads and writes on all nodes:
PostgreSQL with CitusDB
Let's run the same on CitusDB. I start a 3 nodes cluster:
git clone https://github.com/citusdata/docker.git citusdata-docker
cd citusdata-docker
docker-compose up -d --scale worker=3
I initialize pgbench
tables:
pgbench -i
I distribute the pgbench_accounts
table:
docker exec -it citusdata-docker_master psql -c '
select create_distributed_table('pgbench_accounts', 'aid');
'
I run the same script in the background:
pgbench -f /tmp/pgbench-acid.sql -T 300 -c 10 &
and query the total amount every 100 milliseconds:
select count(*),min(abalance),max(abalance),sum(abalance)
from pgbench_accounts;
\watch 0.1
The total amount is sometimes negative and sometimes positive because ongoing transactions are transferring random amounts. The distributed query see some intermediate states.
This is not ACID, it is Eventual Consistency. At some point, when there are no ongoing transactions anymore, the shards will be synchronized.
Note that this example was only about the Atomicity and Isolation of ACID. The Consistency is about integrity constraints and CitusDB do not allow cross-shard foreign keys or even global indexes. That is another difference with Distributed SQL (see this previous post about it). The Durability is about persistence. This CitusDB is not protected. You should add multiple standby databases, with synchronous commit, for each shard to protect it. This increases the complexity for more than a handful of shards (Patroni templates can help to get this complexity correctly designed). The YugabyteDB cluster above has replication built-in alongside the distribution and was a Replication Factor 3 cluster that can scale to more nodes transparently because the tables are actually distributed to smaller shards that can move online.
CitusDB use cases
The behavior of eventually consistent reads for multi-shard queries is expected and is acceptable for the CitusDB use cases that it is designed for (see When to Use Citus):
- Real-Time Analytics: The tables are usually sharded on time and data ingest goes to one shard. Then, the queries will be consistent.
- Multi-Tenant Application: For a few pre-defined tenants, the tables are sharded per tenant and transactions will be within one tenant, and will be consistent.
YugabyteDB use cases
Distributed SQL databases (Google Spanner, CockroachDB, TiDB, YugabyteDB) guarantee ACID properties for cross-shard transactions and are then consistent for all OLTP applications which involve multi-shard transactions, and allow automatic re-sharding when scaling-out.
Among them, the only one which is PostgreSQL-compatible when it comes to transactions is YugabyteDB as the others do not have the same features and behavior. YugabyteDB supports all PostgreSQL isolation levels, with the same behavior, for all distributed transactions.
Top comments (0)