Fernando Karchiloff

Posted on Jan 5, 2023

How to avoid messing up squash migration in Hasura

#graphql #database #hasura #migrations

Versão em Português Brasileiro

Summary

Technology versions
Background knowledge needed
Why I decided to write it
Too long; didn't read
Database example
Squash guide
Conclusion

Technology versions

Hasura v2.0.9
Hasura CLI v2.0.9
PostgreSQL 12

Background knowledge needed

Understand SQL and it's commands
Understand how migrations work
How Hasura GraphQL Engine works
How Git works

Why I decided to write it

During the development of one project at the e-commerce company I'd worked for, we had a task that would require to change the database structure, many tables and columns had to be changed to complete the task.

As Hasura GraphQL Engine offers a web page to make the structural changes at the database, I made the changes using this interface and executed the manual queries that were necessary. All of these modifications created migrations in the repository, as expected for a project with migrations enabled in Hasura.

After the modifications, it was noticeable the amount of folders with modifications that were being saved in the migration format. I decided to use the command to compress all of them into a single file, which would be more readable. That's when my mistake happened: without making basic validations beforehand, I executed the command and deleted the previous migrations. That's how I spent more than a day trying to fix a mistake that could have been avoided with a simple checklist before using the command. How can we avoid this? I will explain in the next topics.

Too long; didn't read

Make your changes using migrations.
Using Git or similar, make a commit to save your migrations.
Make sure that the up.sql and down.sql files execute as desired using hasura migrate apply and hasura migrate apply --down <N>. Do manual fixes (mistakes can happen) using the SQL tab without using migrations.
Commit your complete migrations to the version control system.
Use the command hasura migrate squash --from <num_migration> --name <name> to execute the squash, delete the old migrations (you already have them at the version control).
Mark the created migration as fulfilled at the database using hasura migrate apply --skip-execution (if the removed migrations weren't applied, don't use the flag).
Use again the command hasura migrate apply --down <N> to remove the migrations, and then use it without the flag to make sure that they're working.
Commit your changes to the version control system.
(Optional) Do optimizations if needed, for example: remove an ALTER TABLE if there's a DROP TABLE (without losing data of course).

Database example

To demonstrate how to do a squash correctly, I will use a simple database with a single table, but in a real case you can have various operations among N tables, which may lead to a larger and complex squash.

Our database contains the students table defined through this structure:

CREATE TABLE "public"."students" (
    "id" serial NOT NULL,
    "name" Text NOT NULL,
    "score" text NOT NULL,
    PRIMARY KEY ("id") 
);

It was created after we had already connected to our database created in PostgreSQL, disabled the console, and started migrations in the project. You can see more details on how to enable migrations in this section of Migrations & Metadata (CI/CD) in the documentation for the Hasura GraphQL Engine.

Let's say that this table was filled with some student data:

Somewhere along the way, we realize that the data in the score column, which cannot be text because we would like to do some arithmetic with them. However, we cannot change this data if it is being used in production! (Assume that you have thousands of important data and cannot lose them)

What to do then? In the development environment, we will create a new column to accommodate this data. We will name it score_decimal because it represents the score as a decimal number allowing arithmetic operations. Go to the students table's Modify tab and click Add new column. We will define it as a Numeric column.

Once the new column is created, we can migrate the data from the score column to score_decimal. To do this, we will use the SQL section located just below the tables in our database.

In this tab you can run SQL code directly against the selected database, marking these queries as migrations or not. Depending on the code, Hasura may be smart enough to identify if it is a structural or data change and mark it as a migration automatically, so always pay attention before running the query to see if the "This is a migration" box is enabled.

Putting the query that converts the scores in the text field, we can run it without adding it as a migration in order to verify that the query is being executed as described. After checking, we can enable the migration box and write a name for it, I used a descriptive name cast_students_score_to_decimal_migration. This is the query made to convert the data:

UPDATE students
SET score_decimal=CAST(score AS DECIMAL)
WHERE students.id = students.id;

NOTE: always be careful when making structural changes that change field categories, so do it in steps and update the view layer (front-end), if used, to change the new type without breaking the application for clients.

After running the update of the data, we can rename the columns and delete the old one. The name used for the migration was rename_columns_and_remove_old_migration.

ALTER TABLE students RENAME COLUMN "score" TO "score_text";
ALTER TABLE students RENAME COLUMN "score_decimal" TO "score";
ALTER TABLE students DROP COLUMN "score_text";

With that, we finished our modification to run our arithmetic operations, but we ended up with some separate migrations that can be joined through a squash operation, as they can be executed all together.

Squash guide

For the squash process of the migrations to occur "correctly" (there can always be an unexpected error in the middle of the process), we need to ensure that some steps are followed. I will list some of them to avoid errors in this process.

Save the made migrations in the repository
Ensure that the migrations can be made and undone without errors
Save the complete migrations
Squash the migrations
Mark the migration as executed
Save the changes
(Optional) Modify the compressed migration as desired (and save the changes)

1. Save the made migrations in the repository

To keep track of our migrations and not lose progress, we save the changes in a commit, we can then proceed to change them without the risk of losing what we have done, this was one of the mistakes I made earlier, not saving in the version control.

2. Ensure that the migrations can be made and undone without errors

With the migrations saved in the repository, we can start modifying those that have incomplete down.sql files. Many of the changes we make to the database do not always reflect in the files due to being complex changes, such as renaming columns and deleting a column in the same migration. For example, in the last migration, the down.sql file was filled out as follows:

-- Could not auto-generate a down migration.
-- Please write an appropriate down migration for the SQL below:
-- ALTER TABLE students RENAME COLUMN "score" TO "score_text";
-- ALTER TABLE students RENAME COLUMN "score_decimal" TO "score";
-- ALTER TABLE students DROP COLUMN "score_text";

For the migration to work back, that is, remove the changes, we will reverse the commands made in up.sql.

ALTER TABLE students ADD COLUMN "score_text" text;
ALTER TABLE students RENAME COLUMN "score" TO "score_decimal";
ALTER TABLE students RENAME COLUMN "score_text" TO "score";

We will do the same with the migration that creates the new column and reverse the command.

ALTER TABLE "public"."students" DROP COLUMN "score_decimal";

With that, we can use the --up and --down flags of hasura migrate apply. The command changes the database according to the .sql defined in the migrations. If you relate the files to their respective commands, it will make more sense, the command that uses --down will remove the applied migration changes, while --up will apply them (--up is applied by default if not defined). To remove only some, use:

hasura migrate apply --down <N>

In the command, N is the number of migrations that will be removed. Use hasura migrate apply --help to see more details.

Note: If you ran a migration and it failed, if you have more than one SQL command in the same file, try to fix it through the SQL tab (without creating a new one), correct it and try again.

3. Save the complete migrations

Since you have tested your migrations and they are working, save them again in a commit, now with the complete migrations.

4. Squash the migrations

With all migrations completed and working, we can squash them, that is, compress them into a single migration. To do this, we will run the command:

hasura migrate squash --from <num_migracao> --name <name>

It compress the migration files for us. In this example, I will do like this:

hasura migrate squash --from 1638852925395 --name "changing_column_type"

This command will merge all migrations starting from the prefix 1638852925395 into a single one, named "changing_column_type". I recommend you delete the previous migrations because they are already saved in the repository.

5. Mark the migration as executed

Once we have created our new compressed migration as a result of the operation, if the previous migrations were not applied to the database, we will run the command to mark it as if the migration we just generated had already been applied:

hasura migrate apply --skip-execution

Remember that this command should only be run if we actually applied the changes of the other migrations to the database, otherwise, run the command without the --skip-execution flag to apply it normally.

6. Save the changes

Once this is done, save the changes again in the repository's version control to avoid losing your changes.

7. (Optional) Modify the compressed migration as desired (and save the changes)

If your migration has many commands, evaluate the file and see if, depending on the changes made, you can simplify the executed SQL code. For example, you create a table, and then in another migration you add a column, if you have not added any data in the middle of the way, or it has not gone to production yet, directly add the column when creating the table, which will simplify the code.

Conclusion

I hope that these instructions serve as a guide so that you don't make the same mistakes I did during the project I was developing. I tried to put as many details as possible in the step-by-step instructions and the reasons behind them.

These instructions can be replicated in other projects, even if they don't follow the same technology, but if their concepts are similar, it's possible. If you don't apply them immediately, at least I hope you've learned something from what I demonstrated in this article.

Any positive criticism that can enhance this article is welcome! Share with other people who need this content!

Thank you for your attention.

Top comments (2)

Wahib Abdou • Apr 8

Thanks for this great article, really helpful... I have a question do CRUD migrations get listed in the output of the squash migration, I noticed lately that I had some insert placeholder migrations that create developers' accounts, for debugging purposes, etc but I noticed after that those migrations were not listed in the output file I guess Hasura CLI neglect the CRUD operations but I'm not sure yet

Fernando Karchiloff • Apr 9

If I'm not mistaken, the Hasura engine will comment out the operations that it cannot handle due to complex management, like CRUD operations, or ALTER TABLEs, as shown in the article. I may not have mentioned it, but you can see that it commented the code, and I manually added it again.

Sometimes this operations can be erased as you're doing a squash, because how a change was made, CRUD operations were made to allow the transition of data. When you squash it, you can remove them as they aren't needed anymore depending of what you've done.

All the migration code must be at the output, even if it's commented.

DEV Community