DEV Community

Cover image for Database Transactions
Abel Ayalew
Abel Ayalew

Posted on • Originally published at nabroleon.hashnode.dev

Database Transactions

This blog kicks off a series dedicated to exploring databse fundamentals, one step at a time. Stay tuned for more in this evolving series!

We often hear the abbreviation ACID usually when we talk about database transactions. By now we all know it stands for Atomicity, Consistency, Isolation, and Durability, but what are they? Even before that what is a database transaction? Why do we need transactions? What happens if we don’t do these transactions properly?

What is a Transaction?

According to Fundamentals of Database Systems, Elmasri

💡 Transaction is an executing program that forms a logical unit of database processing. A transaction includes one or more database access operations—these can include insertion, deletion, modification, or retrieval operations.

A transaction is an atomic unit of work that should either be completed in its entirety or not done at all.

Aha💡, so basically a transaction is a sequence of database queries that are treated as one unit of work.

🤔 Why do we have to treat them as one atomic unit though❓

That’s because a lot of times it's very hard to do what we want to do in a single query. We have to do a series of queries to get things done. Lots of systems use this database transaction concept such as banking, ticket reservation, online purchasing, etc.

Let’s look at a famous banking example to understand transactions better. Let’s say you want to send $1000 to your mom for the holiday(yeah you are a good kid), you pull up your phone write the amount, and hit SEND. What happens now? let’s look👇.

Database Transaction Example If the image isn't clear enough check it here

There are a couple of things we can see from the image above.

  1. Start txn and Commit txn statements that mark the transaction boundary(lifespan).

  2. Transactions are atomic, if both UPDATE operations don’t succeed, we roll back the transaction.

N.B. Transactions don’t have to be update/insert operations, we can have read-only transactions.

Let’s go through each property to understand everything better.

Atomicity

We have already established that transaction is atomic, it should be either completed to its entirety or not done at all. The atomicity property requires that we execute a transaction to completion.

🎗If a transaction fails to complete for some reason for example: disk failure, one of the queries fails, or a system crash, all the changes made with prior queries in the transaction should be undone(rollback).

In the image provided, we aimed to illustrate the importance of atomic transactions. However, consider the repercussions if an atomic transaction isn't ensured.

Imagine this scenario: your account was debited $1000, but just then, the database crashed. Inconsistency in data happens. Without transaction support, you'd be at a loss of $1000, and your mom wouldn't receive anything. Fortunately, databases do support transactions.

How would the database tackle these kinds of failures? Through a System Log.

The DBMS ensures recovery from transaction-related failures by maintaining a log. This log records all transaction operations impacting database values and additional transaction details necessary for potential recovery from failures. While this responsibility lies with the system, when developing programs, it's crucial to consider which operations should function as a single unit and construct accurate transaction logic accordingly.

Consistency

Consistency

Consistency is all about taking a database from one consistent state to another. Huh? We came back full circle. Let me explain. All that means is that if a transaction is completely executed from beginning to end without interference from other transactions, it should indeed take our database from one consistent state to another. Doesn’t this pretty much look like Atomicity? Well, that’s because to stay consistent we need Atomicity. However, Consistency guarantees that changes made within a transaction are consistent with database constraints. This includes all rules, constraints, cascades, triggers, etc. If the data gets into an illegal state, the whole transaction fails.

Let’s go back to our previous example, let’s say you were feeling very generous and you wanted to send your mom $10,000 instead of $1000, but that leaves you with -$3,100. We can’t let that happen now, can we? Balances are supposed to be positive, because of this constraint, consistency will be violated and the transaction will be rolled back.

Consistency extends beyond data integrity. it also applies to our read operations. Consider a scenario with database replicas: when fetching data from a replica, if changes occur in the main database before those updates reach the replicas, querying data during this interval may yield outdated information. This results in inconsistency within the system.

Isolation

Isolation

💡A transaction should appear as though it is being executed in isolation from other transactions, even though many transactions are executing concurrently. That is, the execution of a transaction should not be interfered with by any other transactions executing concurrently. ~ Elmasri

The other ACID properties(Atomicity, Consistency, and Durability) describe the behavior of single transactions but Isolation deals with how a transaction behaves in relation to others.

In SQL, transactions have a characteristic called isolation level.

  • Level 0 Isolation: This corresponds to READ UNCOMMITTED. It allows dirty reads (allows reading uncommitted data), making it the least restrictive isolation level.

  • Level 1 Isolation: This corresponds to READ COMMITTED. It addresses the issue of dirty reads by allowing reading only committed data, but data might change between reads(non-repeatable reads).

  • Level 2 Isolation: This corresponds to REPEATABLE READ. It means as long as you are in the same transaction your reads will be the same hence preventing non-repeatable reads but falling victim to new data appearing or existing data disappearing between reads(phantom reads), typically occurring with range queries.

  • Level 3 Isolation (True Isolation): This corresponds to SERIALIZABLE. It is the most restrictive, preventing issues like dirty reads, non-repeatable reads, phantom reads, and lost updates. However, it comes at a cost, it's the slowest. Different DBMS offer varied implementations of isolation levels, allowing us to balance performance and consistency based on our application's needs. It's a trade-off between strict isolation and transaction speed.

Isolation Level Dirty Read Nonrepeatable Read Phantom
READ UNCOMMITTED Yes Yes Yes
READ COMMITTED No Yes Yes
REPEATABLE READ No No Yes
SERIALIZABLE No No No

There is more to serializability and isolation so I suggest you check Chapter 21 on Fundamentals of Database Systems by Elmasri for a more detailed explanation.

Durability

Durability

The changes applied to the database by a committed transaction must persist in the database. These changes must not be lost because of any failure. It has to be persisted in disk which makes the process slow but some databases write to in-memory and create snapshots. An example of such a database system is REDIS.

❓ How can we ensure durability(permanence)

Well, different databases do this differently. There are different durability techniques WAL, AOF, Async snapshot, etc. However, the most common one is the Write-Ahead Log(WAL).

When using WAL, DBMS persists in a compressed version of the changes as WAL. Any changes go to disk first and then write the associated changes to the database file. When a crash happens, we can read all the log data and rebuild the state.

When a database recovers from failures, it uses different techniques, it might look for half-done transactions and roll them back using the info in the logs or it can roll forward and replay committed transactions that haven’t persisted to disk if there is enough info in the logs.

One thing that can’t be overlooked here is keeping backups regularly.

Conclusion

Alright, we've covered the basics of database transactions and the ACID principles. Think of them like the fellowship of the ring sorry, your data. We talked about how these four characteristics: Atomicity, Consistency, Isolation, and Durability, keep everything in check. From making sure transactions happen as a whole to keeping data consistent. While we often don't see it, the database handles a lot of heavy lifting for us. But hey, when it comes to writing our app logic, it's on us to be smart about it.

💬 Remember, what we covered here is just the tip of the iceberg. There's a whole world of database intricacies out there. If you wanna learn more about database transactions or databases in general, check the resources below.

Resources

Top comments (7)

Collapse
 
amitrastogi2202 profile image
Amit Rastogi

Nicely explained....Thanks for the writeup!
From database user perspective its key to understand the isolation levels provided by your database and tune it as per your application's consistency vs. performance requirements.

Collapse
 
morphzg profile image
MorphZG

Thanks for sharing. I always appreciate when i see the effort behind the the blog post. Well done!

Collapse
 
nabroleonx profile image
Abel Ayalew

Thanks a lot, check out another one I wrote on how databases store data and indexes Link

Collapse
 
fridaycandours profile image
Friday candour

I super like this post. Very informative. I wish I saw when I first learning databases.

Collapse
 
nabroleonx profile image
Abel Ayalew

It means a lot, I will be posting more content like this

Collapse
 
sc0v0ne profile image
sc0v0ne

Very good post !!!, you explorationed good points.

Collapse
 
nabroleonx profile image
Abel Ayalew

Thanks :)