Lucas Rivelles

Posted on Feb 7

Database Isolation Levels: A Practical Guide

#database #architecture #learning #tutorial

Picture this: you're building a bridge to some cargo across a turbulent river. Each plank meticulously placed, each bolt securely fastened. But what happens when two teams of workers try to build different sections of the bridge simultaneously, unaware of each other's progress?

Welcome to the world of database isolation levels! It's almost a common sense that we should use relational databases if we want to achieve high consistency. However, is this assumption entirely true for all use cases? Can we assume we won't have consistency problems if we just decide to use a relational database such as MySQL or PostgreSQL in our applications?

We will see that, unfortunately, life is not that simple. We'll start by explaining some concepts and see, in practice, how high consistency can be achieved.

1. Environment Setup
2. ACID Properties
3. Isolation Levels
       3.1. Read Uncommitted - The Weakest One
       3.2. Read Committed - Preventing Dirty Reads
       3.3. Repeatable Read (Snapshot Isolation)
       3.4. Serializable - The Strongest One
4. Conclusion
5. References

Environment Setup

In the examples, we'll use a Docker environment with a MySQL container and a script to fill the database with some data.

If you want to reproduce it in your local environment, the docker-compose and the SQL scripts can be found here. Follow the next steps to make it ready:
1- Clone the repository

git clone git@github.com:rivelles/isolation-level-demo.git

2- Access the project and run the container:

docker-compose up

3- Open two new terminal tabs
4- On each of them, access the container and log into MySQL:

docker exec -it mysql /bin/bash
mysql -u root -p --database=bank

5- For the password, just hit Enter

All good!

Transactions and ACID Properties

We can define a transaction as a single unit of work that can access and modify data in a database. It can be composed by multiple queries. Usually, databases that implement transactions follow a set of conventions called ACID.

A database is considered to be ACID if it accomplishes a set of properties:

Atomicity: Each transaction is treated as a single unit. Either it succeeds entirely (commit) or it fails entirely (abort).
Consistency: Constraints are guaranteed to not be violated.
Isolation: One transaction will not affect other transactions' results.
Durability: After a transaction has been committed, its changes will be stored permanently.

When speaking specifically about isolation, databases implement different levels of guarantees to control what happens when two transactions handle objects concurrently. These levels come with trade-offs between their strength in terms of consistency and performance. In the next sections, we'll dive-in each of them with some simple real-world examples to illustrate what problems they try to solve.

Isolation Levels

Concurrent transactions happen when two or more of them run without knowing about each other, making them prone to access the same objects and potentially updating them.

Existing databases have mechanisms to prevent concurrency problems to happen. The main purpose of it is to remove the responsibility of the developers to worry about concurrency when dealing with the database. They allow us to set-up different levels of consistency based on the needs we have and which problems we want to avoid.

Let's see in practice how they work.

Read Uncommitted - The Weakest One

In our application, let's imagine we have two transactions updating the balance for an account. They will read the value, and update the row with the new balance calculated by the application (let's always increment it by 1 to keep it simple). Let's see how it goes in the following scenario:

Transaction 1	Transaction 2
`SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;`
`START TRANSACTION;`
`SELECT balance FROM accounts WHERE id = 1;`
`UPDATE accounts SET balance = 1001 WHERE id = 1;`
	`SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;`
	`START TRANSACTION;`
	`SELECT balance FROM accounts WHERE id = 1;`
`ROLLBACK;`
	`UPDATE accounts SET balance = 1002 WHERE id = 1;`
	`COMMIT;`

Transaction 1 execution

mysql> SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
Query OK, 0 rows affected (0.01 sec)

mysql> START TRANSACTION;
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT balance FROM accounts WHERE id = 1;
+---------+
| balance |
+---------+
|    1000 |
+---------+
1 row in set (0.01 sec)

mysql> UPDATE accounts SET balance = 1001 WHERE id = 1;
Query OK, 1 row affected (0.01 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql> ROLLBACK;
Query OK, 0 rows affected (0.00 sec)

Transaction 2 execution:

mysql> SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
Query OK, 0 rows affected (0.00 sec)

mysql> START TRANSACTION;
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT balance FROM accounts WHERE id = 1;
+---------+
| balance |
+---------+
|    1001 |
+---------+
1 row in set (0.00 sec)

mysql> UPDATE accounts SET balance = 1002 WHERE id = 1;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql> COMMIT;
Query OK, 0 rows affected (0.01 sec)

In the end, we have our balance with a total amount of 1002, although transaction 1 rolled back:

mysql> SELECT balance FROM accounts WHERE id = 1;
+---------+
| balance |
+---------+
|    1002 |
+---------+
1 row in set (0.00 sec)

This is because transaction 2 was able to see the uncommitted update made by transaction 1, this is considered a dirty read. Because of that, we end up with an inconsistent value. There is an easy way of fixing this without changing the isolation level, which is using atomic operations, so instead of changing the balance amount using a direct assignment, we can do the following:

UPDATE accounts SET balance = balance + 1 WHERE id = 1;

Notice that now we increment the balance by 1 instead of assigning a direct value on it. When we do it, the database will lock the row until the transaction finishes (either committing or rolling back). So transaction 2 wouldn't be able to update it.

Since our focus is on isolation levels, let's see the behavior of the same use case in a different setup to illustrate and understand how it works.

Read Committed - Preventing Dirty Reads

With READ COMMITTED isolation level, the database guarantees that we will only see data that was previously committed. So, if we execute the same scenario, when transaction 2 reads the balance, it would still see the initial value. Let's see how it would go:

Transaction 1 execution

mysql> SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
Query OK, 0 rows affected (0.00 sec)

mysql> START TRANSACTION;
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT balance FROM accounts WHERE id = 2;
+---------+
| balance |
+---------+
|    1000 |
+---------+
1 row in set (0.00 sec)

mysql> UPDATE accounts SET balance = 1001 WHERE id = 2;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql> ROLLBACK;
Query OK, 0 rows affected (0.00 sec)

Transaction 2 execution

mysql> SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
Query OK, 0 rows affected (0.00 sec)

mysql> START TRANSACTION;
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT balance FROM accounts WHERE id = 2;
+---------+
| balance |
+---------+
|    1000 |
+---------+
1 row in set (0.00 sec)

mysql> UPDATE accounts SET balance = 1001 WHERE id = 2;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql> COMMIT;
Query OK, 0 rows affected (0.01 sec)

Notice that, when transaction 2 reads the balance for account 1, it gets 1000, because transaction 1 hadn't committed it yet. Since it rolled back, we ended up with the correct value.

mysql> SELECT balance FROM accounts WHERE id = 2;
+---------+
| balance |
+---------+
|    1001 |
+---------+
1 row in set (0.00 sec)

Although this isolation level prevents dirty reads, there are other problems that could happen:

If transaction 1 had committed instead of rolled back, we'd have both transactions successfully committed with the final value at 1001, this phenomenon is called lost update, since the update made by transaction 1 would be overridden by transaction 2.
Imagine user 3 has two accounts and we want to display in the screen the sum of their values and also the individual value for each of them, while other transaction is handling a withdraw from one of the accounts. If the second transaction commits while the first one is getting the results, we might show an inconsistent value, let's see this in practice:

Transaction 1	Transaction 2
`SET TRANSACTION ISOLATION LEVEL READ COMMITTED;`
`START TRANSACTION;`
`SELECT id, balance FROM accounts WHERE customer_id = 30;`
	`SET TRANSACTION ISOLATION LEVEL READ COMMITTED;`
	`START TRANSACTION;`
	`UPDATE accounts SET balance = balance - 100 WHERE id = 3;`
	`COMMIT;`
`SELECT SUM(balance) FROM accounts WHERE customer_id = 30;`
`COMMIT;`

Transaction 1 execution

mysql> SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
Query OK, 0 rows affected (0.00 sec)

mysql> START TRANSACTION;
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT id, balance FROM accounts WHERE customer_id = 30;
+----+---------+
| id | balance |
+----+---------+
|  3 |    1000 |
|  4 |    1000 |
+----+---------+
2 rows in set (0.00 sec)

mysql> SELECT SUM(balance) FROM accounts WHERE customer_id = 30;
+--------------+
| SUM(balance) |
+--------------+
|         1900 |
+--------------+
1 row in set (0.01 sec)

mysql> COMMIT;
Query OK, 0 rows affected (0.00 sec)

Transaction 2 execution

mysql> SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
Query OK, 0 rows affected (0.00 sec)

mysql> START TRANSACTION;
Query OK, 0 rows affected (0.00 sec)

mysql> UPDATE accounts SET balance = balance - 100 WHERE id = 3;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql> COMMIT;
Query OK, 0 rows affected (0.01 sec)

In the end, we have the individual values showing each account with a balance of $1000, but the sum showing $1900. This phenomenon is called non-repeatable read, which happens when one transaction queries the same objects more than once and gets different results. Let's see how the next isolation level avoids this scenario.

Repeatable Read (Snapshot Isolation)

With REPEATABLE READ, we get another level of consistency. Basically, this level guarantees us two things:

Reads won't affect writes
Writes won't affect reads

The database achieves that by making every transaction to run pointing to a consistent snapshot of the database. In the past example, we would have avoided the non-repeatable read phenomenon, since when transaction 1 started, no matter what transaction 2 changes, it wouldn't see any update, even after transaction 2 commits its changes. Now let's see it in practice:

Transaction 1	Transaction 2
`SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;`
`START TRANSACTION;`
`SELECT id, balance FROM accounts WHERE customer_id = 40;`
	`SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;`
	`START TRANSACTION;`
	`UPDATE accounts SET balance = balance - 100 WHERE id = 5;`
	`COMMIT;`
`SELECT SUM(balance) FROM accounts WHERE customer_id = 40;`
`COMMIT;`

Transaction 1 execution

mysql> SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
Query OK, 0 rows affected (0.00 sec)

mysql> START TRANSACTION;
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT id, balance FROM accounts WHERE customer_id = 40;
+----+---------+
| id | balance |
+----+---------+
|  5 |    1000 |
|  6 |    1000 |
+----+---------+
2 rows in set (0.00 sec)

mysql> SELECT SUM(balance) FROM accounts WHERE customer_id = 40;
+--------------+
| SUM(balance) |
+--------------+
|         2000 |
+--------------+
1 row in set (0.00 sec)

mysql> COMMIT;
Query OK, 0 rows affected (0.00 sec)

Transaction 2 execution

mysql> SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
Query OK, 0 rows affected (0.00 sec)

mysql> START TRANSACTION;
Query OK, 0 rows affected (0.00 sec)

mysql> UPDATE accounts SET balance = balance - 100 WHERE id = 5;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql> COMMIT;
Query OK, 0 rows affected (0.00 sec)

In the end, both reads made by transaction 1 returned consistent values. When transaction 1 started, all objects it will read were committed by transactions that finished before it started so, even though transaction 2 committed the withdraw on account 5, transaction 1 would never get this change.

Although this isolation level is stronger, we could still face some problems. We saw in READ COMMITTED that we can face a lost update if one transaction overrides a previously committed change. The SQL standard does not require that REPEATABLE READ isolation level prevent this phenomenon, however, some implementation does that; it's not the case on MySQL. If we try to update the balance for an account in two different transactions, the final value will be set by the one that finished last.

There is also another concurrency problem we can face. Imagine that we have a rule stating that a customer can't have a loan bigger than the current balance. So when creating a loan we need to first check the account's balance and when withdrawing money, we also need to check the total amount of loaned money:

Transaction 1	Transaction 2
`SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;`
`START TRANSACTION;`
`SELECT balance FROM accounts WHERE id = 6;`
	`SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;`
	`START TRANSACTION;`
	`SELECT COALESCE(0, SUM(amount)) as total_amount_loans FROM loans WHERE account_id = 6;`
	`UPDATE accounts SET balance = balance - 100 WHERE id = 6;`
	`COMMIT;`
`INSERT INTO loans VALUES (1, 6, 1000);`
`COMMIT;`

Transaction 1 execution

mysql> SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
Query OK, 0 rows affected (0.00 sec)

mysql> START TRANSACTION;
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT balance FROM accounts WHERE id = 6;
+---------+
| balance |
+---------+
|    1000 |
+---------+
1 row in set (0.00 sec)

mysql> INSERT INTO loans VALUES (1, 6, 1000);
Query OK, 1 row affected (0.01 sec)

mysql> COMMIT;
Query OK, 0 rows affected (0.01 sec)

Transaction 2 execution

mysql> SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
Query OK, 0 rows affected (0.00 sec)

mysql> START TRANSACTION;
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT COALESCE(0, SUM(amount)) as total_amount_loans FROM loans WHERE account_id = 6;
+--------------------+
| total_amount_loans |
+--------------------+
|                  0 |
+--------------------+
1 row in set (0.00 sec)

mysql> UPDATE accounts SET balance = balance - 100 WHERE id = 6;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql> COMMIT;
Query OK, 0 rows affected (0.01 sec)

In the end, we will have violated our rule, since user was able to create a loan with a value bigger than what they had in their balance. Even if we wanted to enforce the rule by adding a second check in the balance after inserting the loan, this query wouldn't be able to see the withdraw made by transaction 2.

Serializable - The Strongest One

What is the safest way of avoiding concurrency problems? Exactly, by having no concurrency. This is what SERIALIZABLE consistency level tries to achieve. It executes the transactions in a way that it seems to be executed in a serial order.

There are many ways on how this can be implemented. One of the most popular is the Two-Phase Lock (2PL), used by MySQL. It basically uses locks when transactions are accessing DB objects in a way that:

If a transaction wants to read an object, it needs to acquire a shared lock, which can be acquired by many transactions at a time.
If a transaction wants to write to an object, it needs to acquire an exclusive lock. To proceed, there can't be any other locks in that object.

Let's see what would happen in the same example:

Transaction 1	Transaction 2
`SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;`
`START TRANSACTION;`
`SELECT balance FROM accounts WHERE id = 7;`
	`SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;`
	`START TRANSACTION;`
	`SELECT COALESCE(0, SUM(amount)) as total_amount_loans FROM loans WHERE account_id = 7;`
	`UPDATE accounts SET balance = balance - 100 WHERE id = 7;`
`INSERT INTO loans VALUES (1, 7, 1000);`

Transaction 1 execution

mysql> SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
Query OK, 0 rows affected (0.00 sec)

mysql> START TRANSACTION;
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT balance FROM accounts WHERE id = 7;
+---------+
| balance |
+---------+
|    1000 |
+---------+
1 row in set (0.00 sec)

mysql> INSERT INTO loans VALUES (1, 7, 1000);
ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction

Transaction 2 execution

mysql> SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
Query OK, 0 rows affected (0.01 sec)

mysql> START TRANSACTION;
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT COALESCE(0, SUM(amount)) as total_amount_loans FROM loans WHERE account_id = 7;
+--------------------------+
|    total_amount_loans    |
+--------------------------+
|                        0 |
+--------------------------+
1 row in set (0.00 sec)

mysql> UPDATE accounts SET balance = balance - 100 WHERE id = 7;
Query OK, 1 row affected (7.19 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql> COMMIT;
Query OK, 0 rows affected (0.01 sec)

In this example, when transaction 1 read the balance for account 7, it created a shared lock on that object, meaning that another transaction could read it but needed to wait if it wants to change it; on the other side, transaction 2 read the sum of the loans and also created a shared lock on the loans table. When transaction 1 tried to insert the new loan, it needs to wait because transaction 2 is holding a shared lock on loans table, and when transaction 2 tried to withdraw from account 7's account, it also needs to wait for an exclusive lock on that object, which is being hold by transaction 1. This is called a deadlock, and happens when two transactions cannot proceed because each of them is waiting for the other one to release a lock. The database detected it and killed transaction 1 automatically, allowing transaction 2 to proceed.

With this, we are still in a consistent state, because only transaction 2 completed. Because of the atomicity property, no changes made by transaction 1 were applied.

Conclusion

In summary, database isolation levels are key to maintaining data consistency and transaction reliability. From the least restrictive 'Read Uncommitted' to the strong 'Serializable', each level offers unique benefits and considerations.

There is no "correct" transaction isolation level. Every application should consider carefully the trade-offs between them, balancing performance and data integrity. The purpose of this article was to enlighten us about this subject so we are more prepared to take decisions for our applications and use-cases.

References

This article is inspired in Martin Kleppman's amazing book, Designing Data-Intensive Applications, which you can find here

Top comments (0)

Some comments may only be visible to logged-in visitors. Sign in to view all comments. Some comments have been hidden by the post's author - find out more

DEV Community

Database Isolation Levels: A Practical Guide

Table of Contents

Environment Setup

Transactions and ACID Properties

Isolation Levels

Read Uncommitted - The Weakest One

Transaction 1 execution

Transaction 2 execution:

Read Committed - Preventing Dirty Reads

Transaction 1 execution

Transaction 2 execution

Transaction 1 execution

Transaction 2 execution

Repeatable Read (Snapshot Isolation)

Transaction 1 execution

Transaction 2 execution

Transaction 1 execution

Transaction 2 execution

Serializable - The Strongest One

Transaction 1 execution

Transaction 2 execution

Conclusion

References

Top comments (0)

Read next

Creating Passkey Authentication in a Rails 7 Application

Variant in Apache Doris 2.1.0: a new data type 8 times faster than JSON for semi-structured data analysis

🚀 A Comprehensive Guide to Personalizing Your GitHub Profile README

Create static directory listings on GitHub Actions/Pages