DEV Community

Alexey

PM2 + Express + NextJS (with GitHub source): zero downtime deploys

This article builds on a previous article of mine, which introduced a basic Express+NextJS setup that enabled hosting both a React-based front end and API on one service - reducing distributed system hassles.

This article moves that setup closer to production. The key feature is zero-downtime deploys via PM2 - but it also introduces logging via log4js, and initializes PM2 in a way that would be compatible with setting up database connections and other async configuration.

And since this setup is production-ready, I've hosted it as a demo on an EC2 instance in AWS: https://nextjs-express.alexey-dc.com/

The source code

Like the previous template, I open sourced this under the MIT license - so you're free to use it for commercial and closed-source projects, and I would of course appreciate attribution.

https://github.com/alexey-dc/pm2_nextjs_express_template

The details for launching can be found in the README.md.

It inherits the same basic setup/pages, but has more sophisticated configuration on launch - that works with PM2. I'll dive into a few details here.

Zero downtime deploys

Two of the most common strategies for deploys without downtime are blue-green deployments and rolling deployments.

PM2 enables rolling deployments on a single machine.

This is possible because PM2's cluster mode runs multiple processes with the same server code, which can then be replaced one by one.

Here's an example of a sequence of commands that can achieve a rolling update with PM2:

# Launch 2 instances of a server defined under index.js (-i 2)
pm2 start index.js --name pm2_nextjs_express -i 2
# Perform rolling update with the latest code:
# First kill and replace the first instance, then the second
pm2 reload pm2_nextjs_express

Graceful PM2 setup

Here's how the template actually launches:

pm2 start index.js --name pm2_nextjs_express --wait-ready --kill-timeout 3000 -i 2

There are 2 additional flags: --wait-ready and --kill-timeout - they allow graceful booting and cleaning up.
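
The same launch settings can also be kept in a PM2 ecosystem file instead of on the command line. This is only a sketch: the file name ecosystem.config.js and its contents are my assumption and are not part of the template, but the option names are PM2's documented equivalents of the flags above.

```javascript
// ecosystem.config.js (hypothetical file, not included in the template)
const pm2Config = {
  apps: [
    {
      name: "pm2_nextjs_express",
      script: "index.js",
      instances: 2, // same as -i 2
      exec_mode: "cluster", // cluster mode is what makes rolling reloads possible
      wait_ready: true, // same as --wait-ready
      kill_timeout: 3000, // same as --kill-timeout 3000 (milliseconds)
    },
  ],
}

module.exports = pm2Config
```

With a file like this, `pm2 start ecosystem.config.js` would launch both instances, and `pm2 reload pm2_nextjs_express` would still perform the rolling update.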

Let's take a look at some key bits from index.js - which works with those flags. I've slightly modified the code here to focus on the points being made, but you can always read the real source code.

Graceful setup

We let PM2 know that we've completed setup by calling process.send('ready') after all the configuration:

const begin = async () => {
//  ...
  const server = new Server(process.env.EXPRESS_PORT)
  await server.start()
  /*
    Let pm2 know the app is ready
    https://pm2.keymetrics.io/docs/usage/signals-clean-restart/
  */
  if (process.send) {
    process.send('ready')
  }
//  ...
}
begin()
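
The Server class itself isn't shown above, so here is a minimal sketch of what an awaitable start() and stop() could look like. It is a hypothetical stand-in built on Node's plain http module, not the template's actual class (which wraps Express and NextJS). The point is only that start() resolves after the port is bound, so the 'ready' message is sent at the right moment.

```javascript
const http = require("http")

// Hypothetical stand-in for the template's Server class:
// wraps Node's callback-based http server in promises.
class Server {
  constructor(port) {
    this.port = port
    this.httpServer = http.createServer((req, res) => {
      res.end("ok")
    })
  }

  // Resolves once the port is bound, so it is safe to signal 'ready' afterwards
  start() {
    return new Promise((resolve, reject) => {
      this.httpServer.once("error", reject)
      this.httpServer.listen(this.port, () => {
        this.httpServer.removeListener("error", reject)
        resolve()
      })
    })
  }

  // Stops accepting new connections; resolves when open ones have finished
  stop() {
    return new Promise((resolve, reject) => {
      this.httpServer.close((err) => (err ? reject(err) : resolve()))
    })
  }
}
```

If the port is already taken, start() rejects instead of resolving, so process.send('ready') is never reached.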

Graceful teardown

During shutdown, PM2 sends a SIGINT signal and expects us to call process.exit(). It waits for --kill-timeout (3000ms in our case) and then sends a SIGKILL.

So to respect that lifecycle and perform cleanup, we listen for the SIGINT signal, perform cleanup, and exit:

  process.on('SIGINT', async () => {
    try {
      await server.stop()
      process.exit(0)
    } catch {
      process.exit(1)
    }
  })
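
Since PM2 sends SIGKILL once the kill timeout has passed, cleanup that hangs gets cut off without a chance to log anything. One possible guard is to race the cleanup against a deadline slightly shorter than --kill-timeout. The withTimeout helper below is hypothetical and not part of the template.

```javascript
// Hypothetical helper: rejects if `promise` has not settled within `ms` milliseconds
const withTimeout = (promise, ms) => {
  let timer
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`Timed out after ${ms}ms`)), ms)
  })
  // Whichever settles first wins; the timer is always cleared afterwards
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer))
}
```

In the SIGINT handler this could be used as `await withTimeout(server.stop(), 2500)`, so a stuck shutdown lands in the catch branch and exits with code 1 before PM2 has to force-kill the process.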

Logging

Since PM2's cluster mode runs multiple processes, logging can be challenging. This is why I've included a sample integration of PM2+Log4js.

That does not work out of the box - but log4js explicitly supports a {pm2: true} flag in its configuration.

The log4js docs mention that pm2-intercom is necessary to support this. Using that as-is gives an error due to the process.send('ready') message we send, however:

  4|pm2-intercom  | Error: ID, DATA or TOPIC field is missing

Luckily, there is a fork of pm2-intercom that explicitly addresses this issue: https://www.npmjs.com/package/pm2-graceful-intercom

I've also documented this in detail in the log configuration included with the project.
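
For orientation, a log4js configuration with the pm2 flag might look roughly like the sketch below. The appender and category names are illustrative assumptions and not copied from the template. Only the pm2: true flag comes from the log4js docs discussed above.

```javascript
// Sketch of a log4js configuration object for PM2 cluster mode.
// The appender and category names are illustrative assumptions.
const logConfig = {
  appenders: {
    out: { type: "stdout" },
  },
  categories: {
    // log4js requires a "default" category
    default: { appenders: ["out"], level: "info" },
    app: { appenders: ["out"], level: "info" },
  },
  // Tells log4js it is running under PM2 (needs the intercom module installed)
  pm2: true,
}

module.exports = logConfig
```

The object would be passed to `log4js.configure(logConfig)`, after which `log4js.getLogger("app")` returns a logger for the "app" category.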

Debugging

I've included a setup for debugging as well.

# This will run on `pnpm debug`
pm2 start index.js --name pm2_nextjs_express_debug --wait-ready --kill-timeout 3000 --node-args='--inspect-brk'
# This will run on `pnpm stop_debug`
pm2 delete pm2_nextjs_express_debug

The --node-args='--inspect-brk' flag enables debugging via a socket connection. --inspect-brk is a standard Node.js flag. One great way to work with that debug mode is via Chrome's chrome://inspect. If you don't want to use Chrome, just see the official Node.js docs for more options.

You'll notice I don't enable cluster mode for debugging - that's because it doesn't work well.

You'll also notice I launch it under a separate name, don't offer a reload, and stop it by deleting the process from PM2 rather than stopping it, as the normal run mode does. The main reason is that breakpoints can cause issues for restarts: PM2 will print errors and refuse to boot, and you'll end up having to manually delete the process anyway.

Async configuration

One other opinionated feature I've included in this template is a global namespace for re-usable code.

The reason I did that is two-fold:

  1. There are very often globally configured resources, like database connections, that are shared all across the application - that require async setup when the application launches
  2. There is also often utility code that is shared across the application - that is useful in other contexts, e.g. the debugger (or a repl console)

There are other ways of achieving this than making a global namespace - but I thought it may be more informative to show a specific style of async setup with PM2/Express.

So here is the thinking behind what's going on.

The global backend utility namespace

I expose a global.blib namespace - which is not global.lib, specifically because this setup combines NextJS with Express: with NextJS SSR, React code runs on the backend - thus, if lib is defined on the backend and front end, there will actually be a naming conflict leading to surprising results.

All re-usable/shared backend code lives under app/blib. The logic of pulling in the library is housed under app/blib/_blib.js, so the responsibility of keeping track of files can be encapsulated in the module. Another way of achieving this would be with a package.json file - but I opted for raw JS.

One reason the raw JS is handy is because the initialization logic works well in that same _blib.js file.

Other than pulling in libraries, it also exposes async init() and async cleanup() functions.

Setting up and tearing down the library

The init and cleanup functions naturally plug into the PM2 lifecycle discussed above.

init runs before process.send('ready'):

const blib = require("./app/blib/_blib.js")
// ...
  /*
    If you don't like globals, you can always opt out of this.
    I find it easier to have consistent access across the application
    to often-invoked functionality.
  */
  global.blib = blib
  /*
    This is the only other global I like to expose - since logging is
    most common and most verbose.
  */
  global.log = blib.log
// ...
  /*
    Usually this will at least open database connections.
    In the sample code, a simple in-memory store is initialized instead.
  */
  await blib.init()
  const server = new Server(process.env.EXPRESS_PORT)
  await server.start()
  if (process.send) {
    process.send('ready')
  }
// ...

and cleanup is done in the SIGINT handler:

  process.on('SIGINT', async () => {
    try {
      await server.stop()
      await blib.cleanup()
      process.exit(0)
    } catch {
      log.app.error("Something went wrong during shutdown")
      process.exit(1)
    }
  })
