Exein

Posted on May 2

A tale of TimescaleDB, SQLx and testing in Rust

#security #rust #sql #testing

At Exein, we use TimescaleDB to store all kinds data throughout our services and rely on SQLx to interface with it. Throughout this article, we’ll explore how we recently changed our testing approach in relation to the database along with the problems we encountered and the solutions we came up with.

What is TimescaleDB?
Timescale is a database platform engineered to deliver speed and scale to resource-intensive workloads, which makes it great for things like time series, event, and analytics data. Timescale is built on PostgreSQL, so you have access to the entire PostgreSQL ecosystem, with a user-friendly interface that simplifies database deployment and management.

At the core of TimescaleDB sit hypertables, which are tables that automatically partition data into chunks by time along and on which policies such as chunk compression and retention can be set.

Another interesting feature of TimescaleDB is its continuous aggregates, which can be seen as a specialization of PostgreSQL materialized views, but for hypertables. Unlike materialized views, instead of refreshing the entire dataset every time, continuous aggregates store the aggregate of time based data and update according to their refresh policy or, in the case of real-time continuous aggregates, additionally aggregating recent, non-materialized data, to provide up-to-date results.

SQLx
SQLx is a popular Rust database framework which, at the time of the article, comes bundled with drivers for SQLite, MySQL and PostgreSQL. Since TimescaleDB is based on PostgreSQL, using the sqlx-postgres driver allows us to seamlessly interact with the database.

Testing
In both Continuous Integration (CI) and local development environments, it’s often necessary to test parts of our system using a live database. This process can become complicated when tests are executed concurrently or in parallel, as it may lead to collisions between the database objects being accessed, such as tables that are being read from or written to.

To avoid these issues we would run tests sequentially against a single live database, wrapping them in setup and teardownfunctions which allowed us to prevent the state from one test from spilling over to the next. While functionally sound, it takes significantly longer to run the entire test suite.

SQLx provides a neat testing mechanism through the sqlx::testprocedural macro. The macro provides a harness that runs each test on its own database, thus removing collisions between objects since each test is now isolated. It can even run migrations and even apply fixtures to get the database in the desired state before the test starts running.

Just like the compile-time query checks of SQLx, the test harness connects to the database through the same DATABASE_URL environment variable. This is generally fine, but we use some custom code to create the database connection pool and wanted to reuse that for testing as well, for consistency reasons.

Wrapper proc macros
Since sqlx::test does mostly what we want, I couldn’t help but wonder whether we can slightly adjust the way it works and still leverage it for our testing.

Rust’s attribute procedural macros have the following signature:

`#[proc_macro_attribute]
pub fn dummy(args: TokenStream, input: TokenStream) -> TokenStream {}`

People that have worked with Rust’s procedural macros might know that parsing and processing their arguments is quite involved, and replicating what sqlx::test does just to slightly change some inner workings would take a lot of boilerplate code and then effort to maintain it.

When thinking about procedural macros what comes to mind are annotations that take some code and either produce new code or alter the annotated code, or both. But what if among the modifications we make to the annotated code we could involve new procedural macros annotations? I’ll refer to these as wrapper proc macros.

While I haven’t realized it before, I actually saw quite a few examples of this out in the wild. One example that comes to mind is in serde derive macros which adds various attributes to the code it generates. The only questions remaining then is how to do this correctly?

Some information on this can be found on the rust-lang forum, specifically this post and the issues listed within it. The bottom line is that while there’s no formal documentation on this yet, work has been done towards the general intent of having macro attributes expand in an outside-in fashion, meaning that top attributes can see and even add/remove/modify bottom attributes.

The plan
Given all the information listed above, we can now envision a plan:

Write a wrapper proc macro that adds the sqlx::test annotation while also passing in all its arguments to it;
Add our custom database connection code to the generated code.

The first step of writing a wrapper proc macro is relatively straightforward if you have worked with proc macros before.

#[proc_macro_attribute]
pub fn test(args: TokenStream, input: TokenStream) -> TokenStream {    
        // We only convert this to `proc_macro2::TokenStream` to allow using    
        // it with the `quote!` macro.    
        let args = parse_macro_input!(args as proc_macro2::TokenStream);    
        let input = parse_macro_input!(input as ItemFn);    
        let ret = &input.sig.output;    
        let name = &input.sig.ident;    
        let inputs = &input.sig.inputs;    
        let body = &input.block;    
        let attrs = &input.attrs;  

        quote! {        
                #(#attrs)*        
                #[::sqlx_test(#args)]        
                async fn #name(#inputs) #ret {            
                    #body
            }    
      }    
      .into()
}

All that is required is that we destructure the input function so that we add the sqlx::test attribute at the end. Since this is a procedural macro, it will have to reside in a different crate in the workspace, say db_macros. However, this doesn’t cover the initial goal we set off to achieve, which is re-using our custom connection pool setup code in tests. But it does set the ground for that.

The next step is figuring out how to alter sqlx::test’s usage of the DATABASE_URL variable to make it work with our custom code. Perhaps expanding a sqlx::test annotated test could help with this.

Say we have this test function as an integration test in the db_internal crate:

#[sqlx::test(    
        migrator = "db_internal::MIGRATOR",    
        fixtures(        
                path = "../fixtures",        
                scripts("some_script.sql", "some_other_script.sql")    
        )
)]
async fn dummy_test(pool: PgPool) {    
        // do stuff with `pool`
}

This would expand to:

#[::core::prelude::v1::test]
fn dummy_test() {
    async fn dummy_test(pool: PgPool) {
        {
            {
                // do stuff with `pool`
            }
        }
    }
    let mut args = ::sqlx::testing::TestArgs::new(concat!(module_path!(), "::", stringify!(dummy_test)));
    args.migrator(&db_internal::MIGRATOR);
    args.fixtures(&[
        ::sqlx::testing::TestFixture {
            path: "../fixtures/some_script.sql",
            contents: include_str!("../fixtures/some_script.sql"),
        },
        ::sqlx::testing::TestFixture {
            path: "../fixtures/some_other_script.sql",
            contents: include_str!("../fixtures/some_other_script.sql"),
        },
    ]);
    let f: fn(_) -> _ = dummy;
    ::sqlx::testing::TestFn::run_test(f, args)
}

We can see that some TestArgs structure is constructed, containing migrations (or the provided migrator in our case) and fixtures. This is based on those args to sqlx::test that we really want to avoid having to parse and process ourselves.

For our mission, the TestFn trait is really the interesting part. Per the sqlx::test docs, the macro can be used on async functions accepting a connection pool, a single connection or connection options and the trait implementation for each of these function signatures drives the test setup.

Since sqlx::test relies on DATABASE_URL for tests we can try a naive implementation where we construct and set the environment variable based on our custom connection code before the test is run by injecting a line before the TestFntrait method call.

This raises another question, if we use our test wrapper proc macro to add the sqlx::test annotation, how are we going to manipulate the code expanded from the sqlx::test macro? Using other, “private”, procedural macros. We will again depend on the outside-in nature of proc macro expansion and leverage a second macro that we will add after sqlx::testso it can see the expanded code and manipulate it.

That macro will look something like this:

#[proc_macro_attribute]
pub fn __private_test(_args: TokenStream, input: TokenStream) -> TokenStream {
    let mut input = parse_macro_input!(input as ItemFn);

    // Pop the last statement
    let last = input
        .block
        .stmts
        .pop()
        .expect("sqlx::test always generates statements");

    // Generate custom statement
    let stmt = quote!{
        // custom connection code statement
    }.into();

    // Insert statement at the end of the block
    input
        .block
        .stmts
        .push(parse_macro_input!(stmt as Stmt));

    // Add the last statement back
    input.block.stmts.push(last);

    quote!(#input).into()
}

And so the test wrapper proc macro will now look something like this:

#[proc_macro_attribute]
pub fn test(args: TokenStream, input: TokenStream) -> TokenStream {
    let args = parse_macro_input!(args as proc_macro2::TokenStream);
    let input = parse_macro_input!(input as ItemFn);
    let ret = &input.sig.output;
    let name = &input.sig.ident;
    let inputs = &input.sig.inputs;
    let body = &input.block;
    let attrs = &input.attrs;

    quote! {
        #(#attrs)*
        #[::sqlx_test(#args)]
        #[::db_macros::__private_test]
        async fn #name(#inputs) #ret {
            #body
        }
    }
    .into()
}

Note the usage of the db_macros crate, which is the crate that houses our custom procedural macros.

Trying it out
With the wrapper proc macro in place, it’s time to try it out. Naturally, the local environment is the place to start. We spin up a Docker container of TimescaleDB and try to run our dummy test against it. The harness connects to the database and runs fixtures & migrations before the test function body, so it doesn’t matter that the test itself is empty.

It all seems to work, great! Time to give it a go in the GitHub CI.

failures:

---- dummy_test stdout ----
thread 'dummy_test' panicked at /home/runner/.cargo/registry/src/index.crates.io-6f17d22bba15001f/sqlx-core-0.7.4/src/testing/mod.rs:248:14:
failed to apply migrations: Execute(Database(PgDatabaseError { severity: Error, code: "40P01", message: "deadlock detected", detail: Some("Process 104 waits for AccessExclusiveLock on advisory lock [21387,1005,0,29749]; blocked by process 188.\\nProcess 188 waits for ShareLock on relation 15242 of database 21387; blocked by process 104."), hint: Some("See server log for query details."), position: None, where: None, schema: None, table: None, column: None, data_type: None, constraint: None, file: Some("deadlock.c"), line: Some(1148), routine: Some("DeadLockReport") }))
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

Uh-oh… A deadlock? From what? Trying it numerous times in both the CI and local environment always gave the same output: locally tests run fine but deadlocks occur almost every time in the CI.

Next plan: no plan
I have to admit that I was taken off guard by this. I used the sqlx::test harness with PostgreSQL numerous times without issues and the little trick we pulled off with the wrapper proc macro merely sets the DATABASE_URL environment variable at runtime so common sense heavily prevents me from considering this as the cause. I started exploring whether the issue was simple, silly and easily fixable.

Speculations and observations: Part 1
It immediately felt like this is a TimescaleDB related issue. But how different is TimescaleDB from vanilla PostgreSQL and could these differences really throw things off so much?

The answer lies in how all the neat features of TimescaleDB work: background workers. With each hypertable, continuous aggregate and policy on these objects TimescaleDB makes use of background jobs to do the necessary processing.

I strongly suspected that the deadlocks occurred between TimescaleDB background workers and the migration scripts. Out of these two, we can’t really directly alter the background workers behavior, but maybe our migrations have some incompatibility with how TimescaleDB works.

How SQLx migrations work
Without going into too much detail, analyzing the code between sqlx::test shows that migration scripts get run one by one, each in its own individual transaction. A deadlock happens between transactions, so it seemed fair to assume that the issue is between one or more migration steps and the background workers.

Getting database logs
The interesting part of the testing step in our GitHub CI goes something like this:

- name: Setup database
  run: docker compose up -d timescaledb

- name: Run tests
  uses: actions-rs/cargo@v1
  with:
    command: test
    args: --locked --workspace

Instead of getting the database logs post-mortem, I thought it would help to be able to get them closer to the events that generated them, so I combined these two steps into:

- name: Run tests
  run: |
      docker compose up timescaledb &
      cargo test --locked --workspace

Running the CI now provides the database logs along the running tests, and some interesting bits started showing up:

2024-04-01 08:10:19.415 UTC [443] ERROR:  deadlock detected
2024-04-01 08:10:19.415 UTC [443] DETAIL:  Process 443 waits for ShareLock on relation 16529 of database 24363; blocked by process 409.
        Process 409 waits for AccessExclusiveLock on advisory lock [24363,1001,0,29749]; blocked by process 443.
        Process 443: CALL _timescaledb_internal.policy_refresh_continuous_aggregate()
        Process 409: [ Migration step SQL here ]

At this point tests were still running and none failed yet. Then:

2024-04-01 08:22:53.991 UTC [104] ERROR:  deadlock detected
2024-04-01 08:22:53.991 UTC [104] DETAIL:  Process 104 waits for AccessExclusiveLock on advisory lock [21387,1005,0,29749]; blocked by process 188.
        Process 188 waits for ShareLock on relation 15242 of database 21387; blocked by process 104.
        Process 104: [ Migration step SQL here ]
        Process 188: CALL _timescaledb_internal.policy_refresh_continuous_aggregate()

This is when the first test failure could be noticed. Shortly after, when all the remaining tests either finished or were skipped, the entire CI job failed.

Speculations and observations: Part 2

The logs shed more light on what is happening. In the first log snippet, we see a deadlock taking place between a migration step and a background worker. However, this does not result in the test failure. The second log snippet shows the same thing, but the processes are reversed. This does result in a test failure.

PostgreSQL periodically checks for deadlocks and if it finds any it will stop one of the processes involved to allow the other one to continue. It looks like if the background worker gets killed, the migration step continues and the test moves on. But if the migration step process is killed (unlocking the background worker in the process) the test fails with a migration error.

Up until now all the failures happened at a given migration step. The step served to change the datatype of some columns in a hypertable (let’s use the table name data), including the column used as the time series. This requires re-creating the data table, so the step does something along the lines of:
1) Rename the data table to data_old;
2) Create a new table named data;
3) Move data from data_old to data;
4) Make data a hypertable;
5) Create indices, triggers and various TimescaleDB policies;
6) Drop some obsolete continuous aggregates based on the data_oldtable;
7) Drop the data_old.

Was there an underlying issue to how we did migrations, combining vanilla PostgreSQL and TimescaleDB specific actions in some of the bigger steps, like the one above? The logs provide a SQL snippet containing the code related to dropping those obsolete continuous aggregates. It seemed like a long-shot, but perhaps if we somehow dropped these before this step we would observe a different behavior?

Intermediary migrations

Modifying existing migrations is a no-go. SQLx keeps track of the migrations that have already run and does some integrity checks to ensure they have not changed. But could we maybe add a new step somewhere in the middle? It seems like we can!

Given a migrations folder such as:

migrations/
    20231129000000_first_step.sql
    20231202000000_second_step.sql
    20240105000000_third_step.sql

We can add a migration step in the middle by giving it a timestamp that is situated between steps.

migrations/
    20231129000000_first_step.sql
    20231202000000_second_step.sql
    20231202000001_intermediary_step.sql
    20240105000000_third_step.sql

The caveat here is how this behaves on databases that already went through the previous migrations vs test databases that start from scratch. For the former, the intermediary step actually gets run at the end, because SQLx knows that it already ran all the other migrations. For the latter, migrations will be ran in order. This means that our intermediary migration step has to be idempotent regardless of whether in runs in or out of order.

So with this in mind I added such an intermediary step that removed the obsolete continuous aggregates if they existed to see whether that impacts the deadlock occurrence. Since these objects were obsolete we have the luxury of dropping them if they exist at any point in the migration process.

The result was that deadlocks still occurred, but their frequency was reduced. If they previously happened in 80% of the runs, they would now happen in about 60%. So something seemed to have changed.

Circling back to TimescaleDB

At this point our CI test suite was running quite slowly, primarily due to the limited resources available in our GitHub runners. We were using the timescale/timescaledb-ha:pg15-latest Docker image, so I figured I could take another shot in the dark and see if there’s any difference with the latest image, timescale/timescaledb-ha:pg16. Newer versions come with bug fixes and performance increases so it was worth trying, if only for comparison’s sake.

The good news is that this further decreased the deadlock frequency to around 40%. The bad news is that the CI was still inconsistent. Furthermore, migrations would fail at other steps too (including the intermediary migration step I added above), so this rules out the issue being related to a single migration script. This in itself conveys some information; it points to a broader issue, unrelated to singular migration steps but rather to a broader issue when applying migrations containing TimescaleDB specific syntax.

Speculations and observations: Part 3

It began obvious that this was not going anywhere and I needed to shift approaches. Waiting for the CI to run was slow and time-consuming and I had too little control and access to properly understand what’s going on.

Constrained environment

Thinking about the slowness of the CI and resource contraints I thought I could simply limit the resources of a TimescaleDB Docker container and see whether I can replicate the issue locally.

Eventually, after limiting the CPU usage of the container further and further, I started getting the issue locally too when running TimescaleDB like this:

docker run --rm --name ts-container --env POSTGRES_USER=postgres --env POSTGRES_PASSWORD=postgres --memory=512M --cpus="0.125" timescale/timescaledb-ha:pg16

So when resources are low, both migrations and background workers take too long to finish and concurrency issues start occurring in the form of deadlocks. I started looking through TimescaleDB docs and forums and stumbled unpon something interesting: it looks like it was primarily designed to work on a single database per instance.

Documentation is sparse, but these are a few references:

Isolation: Database vs Schema

Since sqlx::test creates a new database for every test, performance seems to be affected due to the amount of TimescaleDB background jobs that get started. This makes the overall database server slower and more prone to deadlocks. The alternative presented in the timescale forums is isolating data through schemas rather than databases.

When isolating tests through separate databases, TimescaleDB manages state, settings and configuration through various internal schemas. Background jobs dedicated to one database get created so when creating multiple databases a set of background jobs and internal schemas get created for each of them.

If we were to isolate tests through schemas instead, we could increase performance because all the TimescaleDB related objects and background processes would be shared across all the tests, lowering the number of concurrent background workers.

Further down the rabbit hole

At this point it became clear that I’d have to either revert to the old sequential testing approach or get my hands dirty with some SQLx internals. Being the relentless individual that I am, I decided to give the latter a go. The goal still has not changed - avoiding boilerplate just to replicate the sqlx::test proc macro. It just looks like we will have to use more custom code than initially anticipated.

Schema based isolation

Fortunately for me, I was already somewhat familiar with SQLx’s internals due to the Exasol driver I wrote as a third party SQLx driver.

For PostgreSQL, the most relevent part of the code is here. With this in mind I changed some things around to rely on schemas instead of databases and even simplified some parts of the implementation as this was always meant to be for internal use only..

Additionally, a custom implementation of the TestFn trait was required and used in the __private_test proc macro. The custom TestFn impl looks something like this:

#[derive(Debug)]
pub struct CustomTestFn<Fut>(pub fn(PgPool) -> Fut)
where
    Fut: Future;

impl<Fut> TestFn for CustomTestFn<Fut>
where
    Fut: Future,
{
    type Output = Fut::Output;

    fn run_test(self, args: TestArgs) -> Self::Output {
        sqlx::test_block_on(async move {
            let (schema, pool) = setup(&args).await.expect("test setup failed");

            let res = (self.0)(pool.clone()).await;

            teardown(&args, schema, &pool)
                .await
                .expect("test teardown failed");

            res
        })
    }
}

This makes the __private_test proc macro look something like this:

#[proc_macro_attribute]
pub fn __private_test_fn(_args: TokenStream, input: TokenStream) -> TokenStream {
    let mut input = parse_macro_input!(input as ItemFn);

    let last = input
        .block
        .stmts
        .pop()
        .expect("sqlx::test always generates statements");

    // Generate and store a new statement to override the local variable
    let test_fn_stmt = quote!(let f = ::db_internal::CustomTestFn(f);).into();
    input
        .block
        .stmts
        .push(parse_macro_input!(test_fn_stmt as Stmt));

    // Add the last statement back
    input.block.stmts.push(last);

    quote!(#input).into()
}

Unfortunately, even with this change in isolation, deadlocks frequency would still be around 25%. It’s much lower compared to what we started off with but still unnacceptably high for an automated process.

Treating the symptoms

At this point it’s apparent that we are focusing on the symptoms rather than the cause. We know this is a concurrency issue and it’s happening between TimescaleDB background jobs and some of the migration scripts. We’re missing something. The small and silly kind.

I kept running test suites locally, looking through logs, trying to find some a new lead. And then I saw it. When it came to deadlock errors, they all had something in common because of schema based isolation and because I ran the test suite locally multiple times on the same Docker container. This meant that all tests would use the same TimescaleDB internal objects and that these objects would not change since the container was the same.

So let’s take another look at a log snippet:

2024-04-01 08:22:53.991 UTC [104] ERROR:  deadlock detected
2024-04-01 08:22:53.991 UTC [104] DETAIL:  Process 104 waits for AccessExclusiveLock on advisory lock [20839,1005,0,29749]; blocked by process 188.
        Process 188 waits for ShareLock on relation 17261 of database 20839; blocked by process 104.
        Process 104: [ Migration step SQL here ]
        Process 188: CALL _timescaledb_internal.policy_refresh_continuous_aggregate()

What is relation 17261? I’m no PostgreSQL guru so finding a way to figure this out required a short search. As it turns out, this actually is very easy to find out:

SELECT 17261::regclass;

Which outputs:

regclass

_timescaledb_config.bgw_job

This is always one of the objects being contended, locked by a transaction that then waits on something else and waited by a second transaction that locked the something else the first one wants!

Mitigating the cause

I could not find a lot of information about this table specifically, but right away I could tell it’s internal to TimescaleDB. Querying it returns a list of jobs that exist for the database. This led to the following revelation:

Background jobs run periodically, so background workers access this table to check whether or when they should run which may involve updating the table;
At the same time, migration scripts issue all kinds of TimescaleDB specific statements and some of them definitely interact with this table due to their nature (i.e., adding, removing or modifying background jobs).

A less abstract scenario may look as follows.

A background worker could update the _timescaledb_config.bgw_job in a transaction and then want to do something with the data table;
At the same time, a migration step would alter the data table, and also re-create some background jobs within the same transaction (which implies updating the _timescaledb_config.bgw_job table).

When running in a constrained environment and things move slow, neither of these transactions finish in due time and end up running together, locking each other. The scenario above is just an example, but the idea is always the same: deadlocks between two transactions involve two objects. To avoid deadlocks, one has to ensure that there’s never contention on both objects at the same time.

In our case, _timescaledb_config.bgw_job is always one of the objects. The other object seems to vary, because there are multiple migration steps that use TimescaleDB related syntax and it only seems to be a matter of processing power where and if a deadlock would occur.

In conclusion, a permanent solution would be restricting access to the background jobs table, this way allowing migrations to run unbothered. As a result, background workers can then also run as usual without issues since migrations are over.

This was incorporated as part of the custom impl of TestFn, with the relevant bit looking like this:

// Start a transaction to lock various TimescaleDB specific resources to 
// prevent deadlocks between migrations and TimescaleDB background workers.
let mut tx = conn.begin().await?;
tx.execute("LOCK TABLE _timescaledb_config.bgw_job IN SHARE ROW EXCLUSIVE MODE;").await?;
migrator.run(&mut *tx).await?;
tx.commit().await?;

The primary objective is to start a transaction that spans across all migration steps and first and foremost lock the background jobs table. The chosen lock allows concurrent READS on the table, but only allows one transaction to WRITE to the table.

While we could alternatively lock the background jobs table before each migration step and run each of them in its own transaction, it would only increase contention on the table because all tests would try to lock the same object for every migration step. Locking it and running all migrations at once is faster.

The long awaited victory

Running the test suite now would still pose problems sometimes, but unlike previous cases deadlocks would now happen when DROP SCHEMA is invoked. Therefore we have to add the background jobs table lock in our teardown routine as well. Many succesive test suite runs were now succesful. Given that many things here were experiments, I went ahead and removed the the intermediary migration steps and reverting the TimescaleDB Docker image used to timescale/timescaledb-ha:pg15-latest.

So what about test isolation? Well, I would’ve never seen the same relation ID in logs if not for schema based isolation and that’s what led me to try locking the _timescaledb_config.bgw_job table. This increased debuggability as well as the reduced number of background jobs were good enough reasons to keep using this approach.

After a couple of weeks, though, deadlocks between some TimescaleDB internals started happening every now and then when dropping test schemas. The frequency was very low in CI; so low that a subsequent run was pretty much guaranteed to succeed. Also, regardless of how few resources I’d give to a TimescaleDB Docker container it would never happen locally. Therefore, we decided to switch back to database test isolation instead hoping that it bypasses some, if not all, of the TimescaleDB internals involved here.

The problem seems to be caused by friction on some advisory locks when dropping a schema, which also deletes background jobs, and some of the background jobs are already running. I reported it to the TimescaleDB team here and not only did they reply quickly but they even shared ideas to further test in our quest to getting consistent test suite runs in CI. We’ll experiment with them and see what comes out of it.

Conclusion

This brings us to the end of our journey. It’s interesting how something so seemingly simple can raise such convoluted problems.

It all started from not wanting to rely on the DATABASE_URL environment variable and aiming to make tests concurrent and we ended up writing wrapper proc macros, creating our own TestFn SQLx test harness implementation and debugging deadlocks on TimescaleDB internals. The result? Dramatically reduced testing time due to running tests concurrently!

I hope you enjoyed reading this article and that the ideas contained within will come in handy at some point. If instead you wanted to find out more about IoT devices cybersecurity, feel free to check out our project on GITHUB :)