Relational databases rely on certain properties to ensure data reliability and trustworthiness. These properties, summarized by the acronym ACID, stand for Atomicity, Consistency, Isolation, and Durability. In this blog, we will delve into each of these properties using real-life scenarios and examples.
Atomicity: All or None
Atomicity ensures that a transaction is treated as a single unit, which is either fully completed or fully aborted.
Example: Consider a scenario where you transfer $100 from User A to User B:
- Initial State:
- User A: $1000
- User B: $500
If, during the transfer, the database crashes after deducting $100 from User A but before adding it to User B, atomicity ensures that the transaction is rolled back to the original state:
- Rolled Back State:
- User A: $1000
- User B: $500
Isolation: Guarding Your Transactions from Each Other
At the heart of the term "Isolation" in databases is the idea of protection. Imagine every transaction as a little bubble. Isolation ensures that these bubbles don't burst or mix into each other. By doing this, it keeps our data safe and sound.
Unraveling the Read Phenomena:
When we talk about "read phenomena," we're discussing unexpected things that can happen when transactions overlap. Let's look at each:
Dirty Reads: Imagine peeking into a room while someone is cleaning. Everything is scattered, and it's a mess. Similarly, a dirty read is like viewing data that another transaction is still changing. Since that change might be reversed, what you're seeing might not stick around.
Non-repeatable Reads: Imagine you note down the number of apples in a basket. A minute later, you count again, but the number has changed because someone took or added an apple. This is what happens here; data you read at the start of a transaction might change by the time the transaction finishes.
Phantom Reads: This is like checking a basket for apples and then finding oranges in your next check. It's unexpected! Phantom reads occur when new data appears (like those mysterious oranges) during a transaction.
Lost Updates: Think of two artists painting on the same canvas. If they paint over each other's work, one of their contributions might disappear. Similarly, when two transactions try to change the same piece of data, one of those changes might get overlooked.
Levels of Isolation - Setting the Boundaries:
Different isolation levels dictate how strictly we want to guard our transaction bubbles:
Read Uncommitted: This is like a room with an open door. You can see everything happening inside, whether it's finished or not. You see both ongoing (uncommitted) and completed (committed) changes from other transactions.
Read Committed: Now, the door is slightly closed. You only see things once they're done and settled. This level lets you view only the finished changes from other transactions.
Repeatable Read: Imagine counting the apples in that basket and then putting a protective shield around it so no one can change the count. With this level, any data you read remains consistent throughout your transaction.
Snapshot: This is like taking a photo of the room. Even if things change inside, you'll always look at the photo, keeping things consistent. You see a 'frozen' view of the data from the start of your transaction.
Serializable: Think of a narrow bridge where only one person can pass at a time. Transactions line up and wait their turn, ensuring they don't interfere with each other.
Comparison of Isolation Levels:
Consistency: Imagine It As A Fact-Checker
In the world of databases, "Consistency" is like having a meticulous fact-checker who ensures that everything matches up and there are no contradictions. It ensures that every piece of information in the database is reliable and trustworthy.
Understanding Data Consistency with a Simple Analogy:
Imagine you're flipping through a photo album. Each photo has a number written below it, indicating how many times that photo was liked on a social media platform.
Now, alongside this album, you have a diary that records who liked which photo. If a picture in the album says it has 5 likes, your diary should have 5 entries of people who liked that exact photo. It wouldn't make sense if the album said 5 likes, but the diary only listed 3 names, right? This is what we mean by data consistency - ensuring that related data matches up across different places in the database.
Data Consistency in Simple Terms:
- Photo Album (Table 1): Picture of a cat - 5 likes
- Diary (Table 2): Alice liked the cat photo. Bob liked the cat photo. Carla liked the cat photo. David liked the cat photo. Emma liked the cat photo.
If there's any mismatch between the photo album and the diary, our trusty fact-checker (Consistency) points it out and says, "Hey, something's not right here!"
Understanding Read Consistency with a Book Example:
Now, let's say you've written a book and you have the master copy (primary). To let more people read it, you create copies (replicas) of the book.
If you make a change to the master copy, like editing a paragraph, the copies should reflect that change too. If someone picks up a copy to read right after you've made an edit, they should ideally see the latest version. If not immediately, then pretty soon after. This ensures that whether someone is reading the master book or one of its copies, they're getting consistent information.
In database terms, even if there's a tiny delay (because making changes everywhere takes a little time), we should reach a point where every copy is updated. This is known as "eventual consistency".
Durability: Think of It As Saving Your Game Progress
In the world of databases, Durability is like saving your progress in a video game. Imagine you're playing a challenging level, and after hours of effort, you finally conquer it. You wouldn't want to risk losing that progress, right? You'd save your game to ensure that even if there's a power outage or your game console restarts, you won't have to play that level again. That's precisely what Durability does for databases; it ensures that once a change is made, it's saved and stays saved, no matter what.
Different Ways to Make Sure Your Game (Data) is Saved:
"Auto-Save" Feature (Write-Ahead Log): Just like some video games automatically save your progress at specific checkpoints, databases have a system where they note down changes even before fully updating everything. It's a quick note-to-self, "Remember to save this change."
Game Diary (Journaling): Some players keep a diary or journal about what they've done in a game. Similarly, databases might store changes in a separate "diary" before making them permanent.
Forced Save Option (Data File Synchronization): Ever had that moment when you manually choose to save your game because you don't trust the auto-save? In the database world, there's a way to force the system to immediately save any new data after a change.
Multiple Save Files (Replication): It's always a good idea to have more than one save file for your game. Likewise, databases often create copies of data across different locations. If one "save file" has issues, there are others to fall back on.
Cloud Backups (Backup and Restore): Just like you might backup your game progress on the cloud, databases also have a system of regularly saving data elsewhere. This way, if something goes wrong, they can "load" from a previous point and restore the game (data).
So, in short, Durability in the database world is all about making sure that once you've made progress (or a change), you won't lose it. It's about saving, re-saving, and having backups for your backups!
The ACID properties provide the backbone for database reliability. From banking to social media updates, these principles ensure that our data remains consistent, reliable, and trustworthy, even in the face of system failures. By understanding these concepts, we gain a deeper appreciation for the robustness and sophistication of modern database systems.
(Header Photo by Patrick Lindenberg on Unsplash)
Top comments (0)