@DynamicUpdate with column-level locking in YugabyteDB and how to avoid write skew anomalies

#yugabytedb #distributed #sql #postgres

I would recommend using @DynamicUpdate for Hibernate on YugabyteDB. This will help you avoid updating unnecessary rows by updating only those that have changed. It is different from PostgreSQL because updating rows in PostgreSQL will always result in the entire row being copied again, which can cause an increase in write-ahead logging, as well as table and index bloat. In contrast, YugabyteDB storage is more efficient as it only writes the updated column values and updates only the indexes on the updated columns.

YugabyteDB offers a unique feature that distinguishes it from many other SQL databases - the ability to lock individual columns, without escalating to the entire row. This feature provides higher concurrency. However, it is important to note that the Dynamic Update logic has a side effect that you should be aware of if you don't use isolation levels correctly.

Here's an example to help clarify my explanation.

I have a table where I store an amount along with its corresponding currency.

yugabyte=# create table demo (
 id bigint primary key, currency text, amount numeric
);
CREATE TABLE

yugabyte=# insert into demo values ( 1 , 'EUR' , 100 );
INSERT 0 1

There's a business ask to modify this to either EUR 104 or CHF 100 and two users, Alice and Bob, are trying to do that at the same time.

Alice sets it to EUR 104. Without dynamic update, a simple query would be: update demo set currency='EUR' and amount=104'. However, Dynamic Update sets only the columns that have a different value from what was read:


-- Alice's session

yugabyte=# begin transaction;
BEGIN

yugabyte=*# select * from demo where id=1;

 id | currency | amount
----+----------+--------
  1 | EUR      |    100
(1 row)

yugabyte=*# -- To set EUR 104, only the amount changes

yugabyte=*# update demo set amount=104 where id=1;
UPDATE 1

yugabyte=*# select * from demo where id=1;

 id | currency | amount
----+----------+--------
  1 | EUR      |    104
(1 row)

In this scenario, the value for currency was already EUR.
Dynamic Update changes only the amount to 104.

Alice's transaction is not yet committed, while Bob attempts to set CHF 100 in a similar way:


-- Bob's session

yugabyte=# begin transaction;
BEGIN

yugabyte=*# select * from demo where id=1;

 id | currency | amount
----+----------+--------
  1 | EUR      |    100
(1 row)

yugabyte=*# -- To set CHF 100, , only the currency changes

yugabyte=*# update demo set currency='CHF' where id=1;
UPDATE 1

yugabyte=*# select * from demo where id=1;
 id | currency | amount
----+----------+--------
  1 | CHF      |    100
(1 row)

Before committing, each user sees their expected result with their own update on top of the database committed state. They don't see uncommitted changes from other transactions.

The first person to commit will write the correct state to the database.
For example, if Bob commits:


-- Bob's session

yugabyte=*# select * from demo where id=1;

 id | currency | amount
----+----------+--------
  1 | CHF      |    100
(1 row)

yugabyte=*# commit;
COMMIT
yugabyte=# select * from demo where id=1;

 id | currency | amount
----+----------+--------
  1 | CHF      |    100
(1 row)

However, when Alice commits, she creates an anomaly with an incorrect state.


-- Alice's session

yugabyte=*# select * from demo where id=1;

 id | currency | amount
----+----------+--------
  1 | EUR      |    104
(1 row)

yugabyte=*# commit;
COMMIT

yugabyte=# select * from demo where id=1;

 id | currency | amount
----+----------+--------
  1 | CHF      |    104
(1 row)

The state resulted from a mix of user intents, EUR 104 and CHF 100, due to an incomplete update that did not set the columns that already had the desired value when read. Unfortunately, those values were changed by the other transaction.

This type of anomaly is a Write Skew: two transactions read a state and then make the writing decision based on that state. They write to different rows or columns, avoiding write-write conflicts, but the initial state of those rows or columns makes their decision invalid.

To avoid the Write Skew anomaly, there are two options: using isolation levels or explicit locking. You can set the Serializable isolation level to let the database detect the read-write conflicts. Alternatively, you can lock what you have read with a SELECT FOR UPDATE statement, like with LockMode.PESSIMISTIC_READ in Hibernate.

It is important to note that PostgreSQL is not affected by the issue described in this single-row example since updating a row locks the entire row, blocking other writes. However, if you perform an update on two rows, instead of two columns, you may face a similar problem in PostgreSQL. Even without @DynamicUpdate, no update statement will be executed if one row is updated with the same values as what was read before.

Let me try to put it in simpler terms. The issue that occurred was not related to Hibernate dynamic update or YugabyteDB's column-level locking. These features were added to improve the system's performance and scalability while ensuring the SQL ACID properties. The problem exposed in this blog post was a specific case of write-skew that is not visible in other databases that always lock to the entire row. To keep write-after-read transactions consistent during race conditions, YugabyteDB provides two types of locking - explicit locking (SELECT FOR UPDATE) or implicit locking (Serializable), with two types of behavior - pessimist locking (Wait-on-Conflict) or optimistic locking (Fail-on-Conflict):

Concurrency control | YugabyteDB Docs

Details about Concurrency control in YSQL

docs.yugabyte.com

DEV Community

@DynamicUpdate with column-level locking in YugabyteDB and how to avoid write skew anomalies

Concurrency control | YugabyteDB Docs

Top comments (0)

Read next

How come there are tens of thousands of tables in a database

I have optimized my work with text-to-SQL translator improved with vector search

Faster Postgres Queries with Cloudflare Hyperdrive and Neon

Basics of RAG with pgVector and Langchain