Minion is a high-performance job queue designed for use with Mojolicious. In this article I will introduce you to one of the new features in the 10.17 release.
The Problem
Recently, in one of my work projects at SUSE, I ran into a job queue congestion issue. We have a lot of very slow background jobs that perform various maintenance tasks, such as cleaning up old files on disk that are no longer needed. Some of these jobs can take over an hour to finish, and since they are not particularly time critical, they run at very low priority.
Now, due to some bad luck, we had a large backlog of these one-hour jobs. And at some point there were no higher priority jobs left in the queue. So the Minion worker ended up processing nothing but one-hour jobs at full capacity.
This wouldn't have been a problem if time-critical, high-priority jobs hadn't come in right at that moment. With no capacity left, they had to wait the full hour. And since those high-priority jobs were created by users of the web service, the users of course assumed something was broken when they didn't receive a result in a timely manner. We ended up with a whole bunch of bug reports.
A Fast Lane
Of course this is not a particularly uncommon problem. Larger web services would simply use multiple Minion workers with different named queues for various service levels, probably with separate hardware for paying customers and those on a free plan. But this is a smaller service that some users also install locally on their own machines, so one Minion worker is more than enough under normal circumstances.
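To make that approach concrete, here is a small sketch of how named queues can separate service levels. The application class, task names and the "maintenance" queue name are made up for illustration; only the enqueue API itself comes from Minion.

use Mojo::Server;

# Hypothetical application class, just to get at the Minion instance
my $app = Mojo::Server->new->build_app('MyApp');

# Slow maintenance work goes into its own named queue
$app->minion->enqueue(cleanup_old_files => [] => {queue => 'maintenance'});

# Time critical, user facing work stays in the default queue
$app->minion->enqueue(generate_report => [123] => {queue => 'default'});

A worker started with minion worker -q default would then never get stuck behind the slow maintenance jobs, while a second worker takes care of the maintenance queue.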
What we needed was a simpler solution. The prefork web server that ships with Mojolicious has the concept of spare processes: under heavy load it is allowed to spawn a few extra processes for request handling at peak times, while normally it only keeps a smaller number of processes preforked.
The same concept can also be applied to the Minion worker. We simply set aside a small number of jobs from our maximum capacity, to be used only for high-priority jobs. So instead of just a --jobs 4 setting, we now also have a --spare 2 setting. At peak times there will be 6 jobs running in parallel, of which at least 2 are guaranteed to be high priority. What exactly is considered high priority can be controlled with the --spare-min-priority 5 setting, which defaults to 1, since the default priority for all jobs is 0 and lower priority jobs are usually given negative priority numbers.
$ script/myapp minion worker --jobs 4 --spare 2
[2021-03-06 13:50:22.02638] [30272] [info] Worker 30272 started
...
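On the enqueue side nothing changes; the spare slots are purely a matter of priority. Continuing the sketch from above, with its made-up task names: with a --spare-min-priority 5 setting, only jobs enqueued with priority 5 or higher may claim one of the spare slots, while with the default of 1 any job with priority 1 or higher qualifies.

# High enough priority to qualify for a spare slot with --spare-min-priority 5
$app->minion->enqueue(generate_report => [123] => {priority => 5});

# Default priority 0, competes for the 4 regular slots only
$app->minion->enqueue(cleanup_old_files => []);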
The Implementation
Since Minion relies on PostgreSQL, the actual implementation was very simple. Just two lines of Perl to spawn spare processes and a one-line change in the SQL query used to dequeue jobs.
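The Perl side boils down to something like the following sketch. This is not the actual Minion::Worker source, and the status keys and defaults used here are my assumptions, but the idea is that the worker may temporarily go over its regular concurrency by the spare count, while any job taken in a spare slot has to meet the minimum priority, which is passed on to dequeue as the new min_priority option. The SQL side is the change below, shown as a diff with the old line marked - and the new lines marked +.

# Sketch only, not the actual Minion::Worker code: decide whether another job
# may be dequeued and with which minimum priority
sub maybe_dequeue {
  my ($worker, $status, $active) = @_;    # $active = number of jobs currently running

  my $regular = $status->{jobs}  // 4;
  my $spare   = $status->{spare} // 1;
  return undef if $active >= $regular + $spare;    # all slots, including spare, are busy

  # Only enforce the minimum priority once the regular slots are used up
  my $min = $active >= $regular ? ($status->{spare_min_priority} // 1) : undef;
  return $worker->dequeue(5 => {queues => $status->{queues} // ['default'], min_priority => $min});
}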
UPDATE minion_jobs SET started = NOW(), state = 'active', worker = ?
WHERE id = (
  SELECT id FROM minion_jobs AS j
  WHERE delayed <= NOW() AND id = COALESCE(?, id) AND (parents = '{}' OR NOT EXISTS (
    SELECT 1 FROM minion_jobs WHERE id = ANY (j.parents) AND (
      state = 'active' OR (state = 'failed' AND NOT j.lax)
      OR (state = 'inactive' AND (expires IS NULL OR expires > NOW())))
-  )) AND queue = ANY (?) AND state = 'inactive' AND task = ANY (?) AND (EXPIRES IS NULL OR expires > NOW())
+  )) AND priority >= COALESCE(?, priority) AND queue = ANY (?) AND state = 'inactive' AND task = ANY (?)
+    AND (EXPIRES IS NULL OR expires > NOW())
  ORDER BY priority DESC, id
  LIMIT 1
  FOR UPDATE SKIP LOCKED
)
RETURNING id, args, retries, task;
Luckily the index we use for the ORDER BY also works for the minimum priority check, so no further optimisations were needed. We can still dequeue thousands of jobs per second. I love PostgreSQL.