ChunTing Wu

Posted on May 9, 2022

Solve Phantom Read in MySQL

#database #architecture #tutorial #mysql

The combination of MySQL and its storage engine InnoDB is almost the most widely used relational database nowadays, and Repeatable Read is the most common in the isolation level.

However, compared to PostgreSQL, InnoDB has several problems that cannot be solved elegantly at the Repeatable Read level.

Lost updates
Phantom read

Lost updates in PostgreSQL can be completely solved without additional hacks. As for phantom reads, there are some small tricks that can be used, such as range types and other mechanisms.

Nevertheless, MySQL still has to be careful to identify the pitfalls and deal with them properly by developers to solve such problems. In my previous article, we introduced three ways to address lost updates. Those approaches provide a more flexible solution to lost updates and are suitable for a variety of scenarios.

In this article, we will further explore how to properly solve the write skew caused by phantom reads.

There are many types of scenes that result in phantom reads, but in general, they all have the following pattern.

Search a specific range.
Do something according to the results of the range (Create, Update, Delete).
The operation will directly affect the original range results.

Suppose it is only an update or a delete, the most straightforward way to avoid write skew is to use an exclusive lock. If you use FOR UPDATE at the beginning of SELECT, then two concurrent transactions will be forced to go one after the other, thus effectively avoiding the write skew in the race condition.

However, in the case of a create, the solution is not so intuitive. Because there is no corresponding row to lock in SELECT, the row is created later. So how to solve it?

Meeting Room Booking System

Before introducing the solution, let's use a practical example to describe the problem caused by phantom read.

There is a meeting room system that provides users to reserve a meeting room, and when the user has successfully reserved the meeting room, a new corresponding data will be added in the table as follows.

user	room_id	start_time	end_time
A	123	2022-05-01 10:00	2022-05-01 11:00

The above table records that user A reserved the meeting room 123 for one hour on 5/1 at 10 am.

The behavior of this system will be similar to the following pseudo code.

count = `SELECT COUNT(*) FROM booking 
         WHERE room_id = 123 AND 
         start_time < '2022-05-01 11:00' AND 
         end_time > '2022-05-01 10:00'`

if count == 0:
    `INSERT INTO booking (user, room_id, start_time, end_time)
     VALUES ('A', 123, '2022-05-01 10:00', '2022-05-01 11:00')`

When the user is sure that the meeting room is unoccupied for the corresponding time slot, then the user can insert an entry as a reservation and the next user will not have a time conflict. Doesn't that seem nice?

The problem occurs when two users want to occupy the same time slot in the same meeting room simultaneously, and they can both pass the first SELECT validation, so they can both insert a reservation, and a conflict occurs. And such a situation can not be solved by adding a lock, because there is no row to lock at the beginning.

Solve by Uniqueness (Incomplete Solution)

Since there is no way to turn a simultaneous operation into a sequential operation through a lock, we let one of them simply fail. To do so we need to add some constraints, e.g. unique constraints, to the table.

One approach is to create a unique constraint index on the room_id, start_time columns, so that the second person trying to reserve the same time slot will fail.

The problem is solved if we restrict the use of each room to a maximum of one hour.

But if the meeting room can be booked for more than an hour, another problem arises.

User A is reserved for 5/1 from 10am to 12pm
User B is reserved for 5/1 from 11:00 to 12:00

When both User A and B are operating at the same time, this unique constraint obviously cannot be effective, and then the conflict around the meeting room remains.

Materialize Conflicts (Correct Solution)

To solve such phantom reads, the developer must use some tricks to reveal conflicts hidden under the same table.

One way is to create a new table and pre-fill it with data to act as a coordinator for simultaneous operations. In the case of this meeting room system, we can create a table time_slots that lists all time slots in advance as follows.

room_id	time_slot
123	2022-05-01 08:00 ~ 09:00
123	2022-05-01 09:00 ~ 10:00
123	2022-05-01 10:00 ~ 11:00
123	2022-05-01 11:00 ~ 12:00

When the meeting room is to be reserved, we not only execute SELECT on the original booking, but also SELECT on time_slots, and we can add FOR UPDATE because the data already exists. It is worth noting that the new SELECT FOR UPDATE is executed before the original SELECT.

In that case, when the expected time slots of two simultaneous users overlap, they will be blocked by the exclusive lock and become one after the other, and the latter will fail directly because it sees the result of the previous completion.

Conclusion

I have to say such a solution is difficult and not intuitive. However, in order not to sacrifice performance when using MySQL, the isolation level is not configured to be Serializable, which means complexity must be traded off for performance during execution.

It is a trade-off between complexity and performance. In fact, using FOR UPDATE to process synchronization in such a scenario does affect performance, and if booking is a table that may have phantom reads in all contexts, then making booking individually Serializable is a feasible solution.

When using a database, we must know the capabilities of the database and understand all the unsolvable situations of the database, so that we can know what kind of behaviors are potential risks when designing and developing.

In addition, how to properly address the risks is also an important topic. Although the use cases are not exactly the same for everyone, the patterns are similar, and learning how to solve each pattern will help you to deal with similar situations quickly in the future.

This article provides a solution for write skew when "creating" data, while the previous article is about solving write skew when "updating" data. These should cover most of the situations that you might encounter. If anyone has encountered other kinds of MySQL race conditions, please feel free to discuss them with me as well.

Top comments (2)

Vincent Milum Jr • May 10 '22

Really interesting concept on this one. Curious why not just go for the simpler approach? You already have a parent table with the list of rooms themselves, correct? Why not create a lock on the room's row on the parent table, do the serialized work on the child table, and then release the lock on the row on the parent table. This is a much simpler and more elegant solution, and pretty much solves all the problems discussed here.

ChunTing Wu • May 10 '22 • Edited

In the original approach, there is no parent table. The only one table, booking, without the corresponding data for the reservation. Thus, we cannot lock for nothing.

For this practical example, the room list might be stored at the MongoDB or any other data storage. After all, the room list is not the core of this system >.^