How to fix a sequence when it goes out of sync in Postgres

#technology #postgres #tutorial #programming

Postgres is one of the most advanced and widely used open-source RDBMS (Relational Database Management Systems) in the world. It's particularly appreciated by the developer community because it supports both SQL and JSON querying, making it is both relational and non-relational compliant.

Yet, Postgres has some well-known issues, and one of the most annoying ones involves sequences. Specifically, Postgres sequences are prone to go out of sync, preventing you from inserting new rows. Postgres even returns a confusing error message when this happens.

In this article, we'll teach you about the out-of-sync issue and show you how to solve it. We will learn about sequences, see what circumstances lead to the out-of-sync issue, learn how to diagnose the problem, and finally solve it. Let's begin.

Delving into the out-of-sync sequence problem in Postgres

What is a Postgres sequence?

The official documentation explains that a sequence is nothing more than a number generator. In particular, Postgres uses a sequence to generate a progressive number, which usually represents an automatically incremented numeric primary key.

If you are familiar with MySQL, the result of having a sequence in Postgres is comparable to the AUTO_INCREMENT behavior. The main difference is that the Postgres sequences can also start from a defined value and then decrement it at each INSERT.

You can define a sequence in Postgres with the CREATE SEQUENCE statement. Similarly, the special type SERIAL initializes an auto-incremental numeric primary key using a sequence behind the scene. Generally, you should use SERIAL when creating a new table with CREATE TABLE.

When do sequences go out of sync?

Based on my experience as a senior software developer, there are three reasons why a Postgres sequence can go out of sync. Specifically, this happens when:

importing many rows with an INSERT script or restoring an extensive database;
manually setting the wrong value of a sequence with the setval() function;
inserting a new record into a table by manually specifying the id field in the INSERT query.

The last one is the most common cause and usually occurs because Postgres uses a sequence and automatically updates its value only when you omit the id field or use the DEFAULT keyword in the INSERT query.

How do I identify this issue?

Spotting this issue is straightforward. When you specify no value for the id column and manually launch an INSERT query or let your ORM (Object-Relational Mapping) do it for you, Postgres will always return a "duplicate key value violates unique constraint" error. Also, this will never happen when performing UPDATE queries.

As you can imagine, the problem lies in the sequence related to the id column that went out of sync. Specifically, the error happens when the sequence returns a value for the id column that is already in use. And this leads to the aforementioned error because an ID must be unique by definition.

How to solve the problem

It is now time to see how to address the out-of-sync problem and solve it once and for all. This problem can be addressed in a few ways. Here are two approaches.

Method 1: Single table solution

Suppose you want to fix the sequence associated with the id column of your Users table. You can achieve this by running the following query:



SELECT SETVAL('public."Users_id_seq"', COALESCE(MAX(id), 1)) FROM public."Users";

Such query will update the Users_id_seq sequence by setting its current value to the result of COALESCE(MAX(id), 1). Notice how the name of the sequences in Postgres follows this notation:



"<table-name>_<column-name>_seq"

The COALESCE function returns the first non-null value, and it is required because if Users were empty, MAX(id) would return NULL. So, by using COALESCE, you are sure that the value assigned to Users_id_seq will be MAX(id) when Users is not null, and 1 when Users is null. In both cases, that query sets the desired value.

Method 2: Fixing all your sequences with one script

If you wanted to fix all your sequences with one query, you could use the following script coming from the official Postgres Wiki:



SELECT 'SELECT SETVAL(' ||
       quote_literal(quote_ident(PGT.schemaname) || '.' || quote_ident(S.relname)) ||
       ', COALESCE(MAX(' ||quote_ident(C.attname)|| '), 1) ) FROM ' ||
       quote_ident(PGT.schemaname)|| '.'||quote_ident(T.relname)|| ';'
FROM pg_class AS S,
     pg_depend AS D,
     pg_class AS T,
     pg_attribute AS C,
     pg_tables AS PGT
WHERE S.relkind = 'S'
    AND S.oid = D.objid
    AND D.refobjid = T.oid
    AND D.refobjid = C.attrelid
    AND D.refobjsubid = C.attnum
    AND T.relname = PGT.tablename
ORDER BY S.relname;

This query returns the set of queries required to fix each of your sequences when executed. As stated in the Wiki, you should use this query as follows:

Save the query in a fix_sequences.sql file.
Run the query contained in the fix_sequences.sql file and store the result in a temp file. Then, run the queries contained in the temp file. Finally, delete the temp file. You can achieve this with the following three commands:



bash
psql -Atq -f fix_sequences.sql -o temp
psql -f temp
rm temp

How to verify that the problem was fixed

Now, all you have to do to verify that your sequence is no longer out-of-sync is to insert a new record in the same table where you initially experienced the issue.

Remember to give the id column the DEFAULT value or omit it entirely in the INSERT query. This way, Postgres will use the Users_id_seq sequence behind the scene to retrieve the correct value to give to id.

For example, run the following query, and you should no longer receive the “duplicate key value violates unique constraint” error message:



INSERT INTO "Users"("id", "name", "surname", "email", "password")
VALUES (DEFAULT, 'Jennifer', 'Jones', 'jennifer.jones@email-domain.com', 'pBHxe*cWnC2ZJKHw');

Instead, this query will insert a new record in the Users table as expected.

Conclusion

Postgres is undoubtedly a great RDBMS. At the same time, it has a few issues that can waste your time. This is especially true if you aren't aware of them, don't know how to identify them, and don't address them accordingly. In this article, we looked at the tricky out-of-sync issue. Out-of-sync sequence errors are tricky because it leads to a "duplicate key value violates unique constraint" message error, which is also associated with other problems. But that won't fool you anymore, because now you know why it occurs, how to detect it, and how to fix it!