Oleg Kondrakhanov

How to synchronize Strapi cron tasks

Hello and let's get straight to the point.

Strapi is a great headless CMS. Its cron module can also be very useful in certain cases, for example, for regularly fetching data from a third-party API. But there is a little problem.

A little problem

Everything works fine if we stick to a single-process configuration, i.e. a single database and a single Strapi app instance using it. Today, however, we use containers and orchestration tools, so infrastructure can be scaled quite easily and multiple application instances can be created in the blink of an eye. The code should be written with this in mind.

Imagine we run 3 Strapi instances as a website back-end. 3 instances mean 3 separate cron tasks running at the same time. Do we really need all 3 of them? And, more importantly, should we expect any bugs to creep in here?

Here is a real-world case as an example. We needed to add internationalization for our website and that requirement also included translation of CMS-stored content. We chose Lokalise.com as a localization platform as it allows involving translators from outside the company staff without granting them access to a CMS itself. The plan was:

  1. English (default language) content is stored directly in the Strapi database so content managers can edit it via the admin panel just as they used to.
  2. After content is edited, Strapi uploads the changes to Lokalise.com so translators can work on it.
  3. A Strapi cron task fetches translated content on a regular basis and stores it in a special Locale model.
  4. A Strapi middleware checks the requests' query parameters and substitutes text content using the Locale model if a non-default language was requested.

So the cron module looked something like this:
/config/functions/cron.js

const { updateLocales } = require("../../lib/locale");

module.exports = {
  "*/10 * * * *": () => {
    updateLocales();
  }
}

After we deployed all this to a staging environment, I checked the logs and found that instead of one cron task launching every 10 minutes there were three. What's more, two of them were throwing exceptions because the Lokalise.com API doesn't allow simultaneous requests with the same API token.
The answer was simple: we got three cron tasks because there were three Strapi application instances in the environment.

So now I needed to synchronize several cron tasks so that only one of them is executed. And no, I didn't plan to give up the Strapi cron module entirely and replace it with system cron or something similar. Strapi cron still has access to the built-in strapi object and its services, controllers and models, which is a nice benefit.

Solution

In a nutshell, we'll be using a special Lock model and blocking access to it while a task is in progress.

A Lock model

First, let's create this model. It is pretty simple: there is only one text field, Task, which holds the name of the task we would like to acquire a lock for. Here is the Strapi model config; all routes are default.

/api/lock/models/lock.settings.json

{
  "kind": "collectionType",
  "collectionName": "locks",
  "info": {
    "name": "Lock",
    "description": ""
  },
  "options": {
    "increments": true,
    "timestamps": true,
    "draftAndPublish": true
  },
  "attributes": {
    "Task": {
      "type": "string",
      "unique": true
    }
  }
}

Acquiring the lock

The next part is a bit tricky. Our database is PostgreSQL, so we use knex, the query builder behind Strapi's database connector, directly to write the locking code. Luckily, Strapi provides a convenient interface to it as strapi.connections.default.

I extracted the function to a standalone module.

/lib/lock.js

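const lockTask = async (taskName, task) => {
  // Strapi exposes the knex instance of the default connection
  const knex = strapi.connections.default;
  // Return the handler's result so callers can tell whether the task ran
  return knex.transaction(async (t) => {
    try {
      // Lock the task's row; NOWAIT fails at once if another instance holds it
      const response = await knex("locks")
        .where({ Task: taskName })
        .select("*")
        .transacting(t)
        .forUpdate()
        .noWait();

      // First run: the row does not exist yet, so create it
      if (!response.length) {
        await t.insert({ Task: taskName }).into("locks");
      }

      await task();

      return true;
    } catch (err) {
      // The lock was not acquired, or the task itself failed
      return false;
    }
  });
};

module.exports = {
  lockTask,
};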

This lockTask function has only two arguments. The first is the name of the task to acquire a lock for; it corresponds to the Task field of the Lock Strapi model. The second, task, is an async function that is called if the lock is acquired.
At the beginning we get the knex object:

const knex = strapi.connections.default;

Then we call knex.transaction to begin a transaction and pass a transaction handler function as its only argument.
The locking job happens here

const response = await knex("locks")
  .where({ Task: taskName }).select("*")
  .transacting(t)
  .forUpdate()
  .noWait();

We are trying to select the locks table row with a specific Task value. Calling transacting(t) signifies that the query should be part of transaction t (the knex documentation on transactions covers this in more detail). We also specify the forUpdate clause so that no other similar query can lock the same row while the transaction is in progress. From the PostgreSQL docs:

FOR UPDATE causes the rows retrieved by the SELECT statement to be locked as though for update. This prevents them from being modified or deleted by other transactions until the current transaction ends. That is, other transactions that attempt UPDATE, DELETE, or SELECT FOR UPDATE of these rows will be blocked until the current transaction ends.

And finally we add the noWait option so the query does not wait for other transactions to finish:

With NOWAIT, the statement reports an error, rather than waiting, if a selected row cannot be locked immediately.

To sum up, only one Strapi app instance is able to get past this query, i.e. obtain the lock. All the others go straight to the catch block.
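To see this behaviour without a database, here is a minimal in-memory simulation. It is only an illustration: the Set stands in for the row lock, and the names heldLocks, lockTaskSimulated and runDemo are made up for this sketch and are not part of Strapi or knex.

```javascript
// In-memory stand-in for a row lock taken with FOR UPDATE NOWAIT.
// Illustration only: nothing here talks to PostgreSQL.
const heldLocks = new Set();

const lockTaskSimulated = async (taskName, task) => {
  if (heldLocks.has(taskName)) {
    // NOWAIT behaviour: give up immediately instead of queueing behind the holder
    return false;
  }
  heldLocks.add(taskName);
  try {
    await task();
    return true;
  } finally {
    // A row lock is released when the transaction ends
    heldLocks.delete(taskName);
  }
};

// Three "instances" fire the same cron task at the same moment
const runDemo = async () => {
  let runs = 0;
  const task = async () => {
    runs += 1;
    // Pretend the task takes a little while
    await new Promise((resolve) => setTimeout(resolve, 10));
  };
  const results = await Promise.all([
    lockTaskSimulated("locales", task),
    lockTaskSimulated("locales", task),
    lockTaskSimulated("locales", task),
  ]);
  return { runs, results };
};
```

Calling runDemo() resolves with runs equal to 1 and results equal to [true, false, false]: the first caller takes the lock and runs the task, the other two are turned away at once.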

The first time we lock a task, there is no corresponding Lock record yet, so it must be created:

  if (!response.length) {
    await t.insert({ Task: taskName }).into("locks");
  }

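However, as there is no actual row to lock the first time, all Strapi app instances would be able to execute this insert query. That's why the Task field of the Lock model is declared as unique: only one insert can succeed, and the other instances fail with a unique-constraint violation and end up in the catch block.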
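The catch block treats every error the same way, so a failure inside the task looks just like a busy lock. As a rough sketch, the two could be told apart by the PostgreSQL SQLSTATE code, assuming the driver (node-postgres) exposes it as err.code. The helper names isLockContention and handleLockError are hypothetical and not part of Strapi or knex.

```javascript
// PostgreSQL SQLSTATE codes relevant to the locking query
const LOCK_NOT_AVAILABLE = "55P03"; // SELECT ... FOR UPDATE NOWAIT hit a row that is already locked
const UNIQUE_VIOLATION = "23505"; // another instance inserted the same Task value first

// Hypothetical helper: true when the error only means "another instance holds the lock".
// Assumption: the database error carries its SQLSTATE in `err.code`.
const isLockContention = (err) =>
  Boolean(err) && [LOCK_NOT_AVAILABLE, UNIQUE_VIOLATION].includes(err.code);

// Hypothetical body for a catch block: stay quiet on contention, report everything else
const handleLockError = (err, log = console.error) => {
  if (!isLockContention(err)) {
    log(err); // a real failure, e.g. the task itself threw
  }
  return false; // either way the caller learns the task did not complete
};
```

One caveat: if the task itself triggers a unique-constraint violation on some other table, this check would misread it as contention, so a stricter version might inspect the error further.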

Now it's time for the task itself to be processed:

 await task();

And that's all.

Wrapping cron tasks ...

Now we just need to wrap our cron task with the locking function:
/config/functions/cron.js

const { updateLocales } = require("../../lib/locale");
const { lockTask } = require("../../lib/lock");

module.exports = {
  "*/10 * * * *": () => {
    lockTask("locales", updateLocales);
  }
}
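With more than one scheduled task, the wrapping can be generated instead of repeated by hand. The sketch below is one possible way to do it; buildLockedCron is a hypothetical helper, and the locking function is passed in as an argument so the snippet stays self-contained.

```javascript
// Hypothetical helper: builds a cron config in which every task runs under its own lock.
// `lockTask` is injected; in the project it would be the function exported from lib/lock.js.
const buildLockedCron = (lockTask, tasks) => {
  const cron = {};
  for (const [schedule, { name, run }] of Object.entries(tasks)) {
    // Each schedule gets a handler that tries to take the lock named `name` before running
    cron[schedule] = () => lockTask(name, run);
  }
  return cron;
};

// Possible use in /config/functions/cron.js (assumption, mirrors the example above):
// module.exports = buildLockedCron(lockTask, {
//   "*/10 * * * *": { name: "locales", run: updateLocales },
// });
```

Each entry keeps its own lock name, so unrelated tasks do not block each other.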

... and non-cron tasks

This approach might also be useful if you use the Strapi bootstrap function and want to perform some work only once.
/config/functions/bootstrap.js

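const { lockTask } = require("../../lib/lock");

module.exports = async () => {
  // Only the instance that obtains the "bootstrap" lock runs the preparation steps
  await lockTask("bootstrap", async () => {
    await somePreparationFunction();
    await andAnotherFunction();
    await andYetAnotherFunction();
  });
};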

After these fixes were deployed to the staging environment, I checked the logs once again, and they showed that only one application instance was performing the actual task. Just as planned.

Top comments (3)

Sergey Pravosud

Great article, Oleg! I had the same problem with running a single task when several application instances were running in k8s. I would like to share my experience. If you are using an orchestrator like k8s, you can just use a k8s cron job that invokes a Strapi API method which performs the logic for your task. What do you think about it?

Oleg Kondrakhanov

Hi, Sergey. We have only just started to move to k8s, so we hadn't used it actively at the time this article was published. But I still think it's easier to use the Strapi cron module, as it already has all the necessary access to your models and controllers rather than just an external API call.
But I'd be glad to hear about your experience anyway.

Alle Aldine

Superb! I faced the same situation and it helped me a lot.