Kevin Kinsey

Make Cronjobs Take Their Turn

Not every company has racks of servers to devote to a single web property. While we're seeing record numbers of sessions and pageloads on OMBE.com at present, we still haven't bitten the bullet and shelled out for load balancing or a CDN, with the attendant headaches of pushing everything from a single-server setup to something multi-server-ready.

[Image: a server farm]

So, when we began experiencing load spikes without corresponding traffic spikes, we had some investigation to do.

What we found was that we had several labor-intensive scripts running from cron(8), doing DB maintenance and stuff (such as generating static HTML from a large dataset) ... and they were all contending for system resources. Here's what our crontab file used to look like (and this is what a LOT of examples of crontab files look like):

#min    hr      day     month   wkday   command
*/5     *       *       *       *       /etc/scripts/kill_lockfile
5       4       *       *       *       /etc/scripts/dump_db
30      1       */2     *       *       /etc/scripts/make_sitemaps
0,30    *       *       *       *       /etc/scripts/make_all_filters
*/7     *       *       *       *       /etc/scripts/fix_brandnames
55,25   *       *       *       *       /etc/scripts/fix_foo
48,18   *       *       *       *       /etc/scripts/fix_bar
35,5    *       *       *       *       /etc/scripts/create_baz
#                                       ... and many more ...

So, make_sitemaps is DB-intensive, and so are dump_db, make_all_filters, and fix_brandnames, and probably some of the other "fix_this" scripts. God forbid they ever run at the same time ... but sooner or later, with a crontab(5) like this, they will, and they did. Page loads slowed to a virtual crawl ... and that's not good for your visitors.

So, what's the answer? Simplify!

Here's what our crontab(5) looks like now:

#min    hr      day     month   wkday   command
*/5     *       *       *       *       /etc/scripts/frequent
59      *       *       *       *       /etc/scripts/hourly
02      0       *       *       *       /etc/scripts/daily

Each of these scripts is a wrapper that calls all the jobs we want to do daily, hourly, or frequently in sequence. So, none of the jobs run concurrently.
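
For illustration, here's the general shape of one of those wrappers. This is only a sketch of daily (not our actual script) written in plain PHP, and it assumes the old job scripts stay right where they were; the lockfile business comes in a moment:

// daily: run the heavyweight jobs one after another, never at the same time
$jobs = [
    '/etc/scripts/dump_db',
    '/etc/scripts/make_sitemaps',
];

foreach ($jobs as $job) {
    passthru($job); // each job finishes before the next one starts
}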

Now, how do we keep frequent from running when hourly is running? We create and hold a lockfile in /tmp.

Here's the top of daily --- PLEASE NOTE: the examples here are written as simple PHP, but there's nothing PHP-specific about the idea; you'll want to implement it in your language of choice.

$lockfile = '/tmp/cronlock';

// wait our turn: sleep until nobody else holds the lock
while (file_exists($lockfile)) {
    sleep(15);
}

// claim the lock, and record which wrapper holds it
file_put_contents($lockfile, "daily");

At the end of daily we erase our lockfile.
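
In the PHP-flavored version, that's just:

// let the other wrappers take their turn
unlink($lockfile);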

hourly looks much the same, although we put a little insurance at the top in case our hourly jobs take ... a full hour:

// if the lockfile says "hourly", we'd be waiting on an instance of ourself;
// no point in queueing up behind it, so just quit
if (file_exists($lockfile) && trim(file_get_contents($lockfile)) == "hourly") {
    exit;
}

// otherwise, check for the lockfile and sleep while it exists
while (file_exists($lockfile)) {
    sleep(60); // wait for the lockfile to disappear; check every minute
}

frequent is a little different; since it runs several times an hour, we don't sleep if the lockfile's present. We just exit ... after all, cron will run our script again in 5 minutes!

if (file_exists($lockfile)) {
    exit();
}
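
Putting it all together, a complete frequent might look something like this (the job list here is just an example, borrowed from the old crontab):

$lockfile = '/tmp/cronlock';

// someone else is working; cron will call us again in 5 minutes anyway
if (file_exists($lockfile)) {
    exit();
}

// our turn: claim the lock, run the quick jobs in sequence, release the lock
file_put_contents($lockfile, "frequent");

foreach (['/etc/scripts/fix_brandnames',
          '/etc/scripts/fix_foo',
          '/etc/scripts/fix_bar'] as $job) {
    passthru($job);
}

unlink($lockfile);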

So, now we have all our maintenance scripts playing nice with each other, running in sequence and taking turns. The server's load average may still spike, but it will be because of traffic, not because of our maintenance jobs.

Let me know if this is helpful, and have a great WEYTI!
