DEV Community

Nick Benoit
Nick Benoit

Posted on

3 Things I wish I knew about databases two years ago

Transactions

A database transaction is a way to logically group a series of database interactions into one functional unit. This unit will either get applied to the database, or it will get removed.

Lets look at an example of how transactions make our application more resilient. In this example, we run a grocery store rewards app. All members with 100 points automatically get sent a gift card, then get their points reset to 0.

So we do a query to find all grocery store rewards members with over 100 rewards points so we can automatically send them a thank you gift card. After we send the gift card, we should zero out the rewards points.

SELECT accounts.id, accounts.email FROM accounts JOIN rewards ON accounts.id=rewards.id WHERE rewards.points > 100;

Enter fullscreen mode Exit fullscreen mode

Next our app creates some async jobs to send the gift card emails.

Finally, we update the database to reset the points of all the users.

UPDATE rewards SET rewards.points=0 WHERE rewards.points > 100;
Enter fullscreen mode Exit fullscreen mode

So what could have gone wrong here? In my demo app it probably worked, right?

The problem case we could have seen here is called a 'dirty read'. Basically, we made a query on the rewards table to find all rows with points greater than 100. Then later, we did the same query again, and we expected it to be the same data.

If someone had checked out at the store, and gone from 99 -> 101 rewards points after we created the gift card jobs, but before we reset the points to zero, their points would have been zeroed out, but without receiving a giftcard.

Wrapping all of these interactions in transaction would have prevented this kind of behavior. The database would have guaranteed that the WHERE clause would have returned the same results on both reads.

Foreign Key Constraints

Relational databases are all about "relations", which in simple terms is relating rows in different database tables. We do this by doing a JOIN on some kind of key. Looking back at the last example, we did this JOIN rewards ON accounts.id=rewards.id.

A foreign key constraint is a database rule that makes sure that when you create a row in the rewards table, the id column must be the id of a valid row in the accounts table. This is great because it just gives us a bit more protection against accidentally filling up our database with bad data.

Row locking

When you think about concurrent writes to your database, I hope the hair stands up on the back of your neck just a bit. There are lots of things that can go wrong here, ranging from the subtle to the catastrophic.

Modern databases do a very good job of handling these various cases, but getting it right from an application perspective requires quite a bit of attention is being paid. Some databases like SQLite will not even allow concurrent writes from separate connections by default.

The particular case I want to call out here is where an update to your database depends on a previous read. If two threads are trying to do something like this, they have potential to wipe out each others updates.

Lets look at an example from our rewards program app.

# User 1 has 15 rewards points
Connection 1: Reads current rewards of 15
Connection 2: Reads current rewards of 15
Connection 1: Calculates new values should be 17, and updates the database
Connection 2:  Calculates new values should be 16, and updates the database

# Final value: User 1 has 16 rewards points
Enter fullscreen mode Exit fullscreen mode

We can solve this problem by first, ensuring that this dependent read and write are within a single transaction, second, ensuring we have our database configured to use the required transaction isolation level, and three, using a FOR UPDATE clause with our SELECT statement.

Which might look like this:

SELECT id, points, rewards_id FROM rewards WHERE rewards_id=:rewards_id FOR UPDATE;
Enter fullscreen mode Exit fullscreen mode

What the above does is maintain an exclusive lock on the row we have read from just like we have already made our update call. This prevents other connections from reading temporarily until we have either committed or rolled back our transaction. Now we can ensure that even if we do have two concurrent writers, they will not wipe out each others writes.

Conclusion

The moral of the story here is that modern databases can do a lot of neat stuff. It is worth putting in some time to learn some of those features, especially those that help prevent you from shooting yourself in the foot.

Top comments (0)