Liran Tal for Snyk

Posted on Sep 19, 2022 • Originally published at snyk.io on Sep 14, 2022

10 best practices to containerize Node.js web applications with Docker

#containers #security #docker #node

Editor's note- September 14, 2022: Check out our new and improved cheat sheet for containerizing Node.js web applications with Docker!

Are you looking for best practices on how to build Node.js Docker images for your web applications? Then you’ve come to the right place!

The following article provides production-grade guidelines for building optimized and secure Node.js Docker images. You’ll find it helpful regardless of the Node.js application you aim to build. This article will be helpful for you if:

your aim is to build a frontend application using server-side rendering (SSR) Node.js capabilities for React.
you’re looking for advice on how to properly build a Node.js Docker image for your microservices, running Fastify, NestJS or other application frameworks.

Why did we write this guide on containerizing Node.js Docker web applications?

It might feel like yet another article on how to build Docker images for Node.js applications but many examples we’ve seen in blogs are very simplistic and solely aim to guide you on the basics of having a Node.js Docker image running an application, without thoughtful consideration of security and best practices for building Node.js Docker images.

We are going to learn how to containerize Node.js web applications step by step, starting with a simple and working Dockerfile, understanding the pitfalls and insecurities with every Dockerfile directive, and then fixing it. Download the cheatsheet here.

A simple Node.js Docker image

Most blog articles we’ve seen start and finish along the lines of the following basic Dockerfile instructions for building Node.js Docker images:

FROM node
WORKDIR /usr/src/app
COPY . /usr/src/app
RUN npm install
CMD "npm" "start"

Copy that to a file named Dockerfile, then build and run it.

$ docker build . -t nodejs-tutorial
$ docker run -p 3000:3000 nodejs-tutorial

It’s simple, and it works.

The only problem? It is full of mistakes and bad practices for building Node.js Docker images. Avoid the above by all means.

Let’s begin improving this Dockerfile so we can build optimized Node.js web applications with Docker.

You can follow along with this tutorial by cloning this repository.

Follow these 10 steps to build optimized Node.js web applications with Docker:

1. Use explicit and deterministic Docker base image tags

It may seem to be an obvious choice to build your image based on the node Docker image, but what are you actually pulling in when you build the image? Docker images are always referenced by tags, and when you don’t specify a tag the default, :latest tag is used.

So, in fact, by specifying the following in your Dockerfile, you always build the latest version of the Docker image that has been built by the Node.js Docker working group :

FROM node

The shortcomings of building based on the default node image are as follows:

Docker image builds are inconsistent. Just like we’re using lockfiles to get a deterministic npm install behavior every time we install npm packages, we’d also like to get deterministic docker image builds. If we build the image from node—which effectively means the node:latest tag—then every build will pull a newly built Docker image of node. We don’t want to introduce this sort of non-deterministic behavior.
The node Docker image is based on a full-fledged operating system, full of libraries and tools that you may or may not need to run your Node.js web application. This has two downsides. Firstly a bigger image means a bigger download size which, besides increasing the storage requirement, means more time to download and re-build the image. Secondly, it means you’re potentially introducing security vulnerabilities, that may exist in all of these libraries and tools, into the image.

In fact, the node Docker image is quite big and includes hundreds of security vulnerabilities of different types and severities. If you’re using it, then by default your starting point is going to be a baseline of 642 security vulnerabilities, and hundreds of megabytes of image data that is downloaded on every pull and build.

The recommendations for building better Docker images are:

Use small Docker images — this will translate to a smaller software footprint on the Docker image reducing the potential vulnerability vectors, and a smaller size, which will speed up the image build process
Use the Docker image digest, which is the static SHA256 hash of the image. This ensures that you are getting deterministic Docker image builds from the base image.

We’ve written a comprehensive article on how to choose the best Node.js

Docker image. The article details the reasons for why choosing an up to date Debian’s slim distribution with a long-term support Node.js runtime version is the ideal choice.

The recommended Node.js Docker image to use would be:

FROM node:16.17.0-bullseye-slim

This Node.js Docker image tag uses a specific version of the Node.js runtime (16.17.0) which maps to the current latest Long Term Support. It uses the bullseye image variant which is the current stable Debian 11 version with a far enough end-of-life date. And finally it uses the slim image variant to specify a smaller software footprint of the operating system which results in less than 200MB of image size, including the Node.js runtime and tooling.

That said, one of the common uninformed practices you’ll see is tutorials or guides citing the following Docker instruction for a base image:

FROM node:alpine

These articles cite the use of Node.js Alpine Docker image, but is that really ideal? They do so mostly because it is attributed for the Node.js Alpine Docker image having a smaller software footprint, however, it substantially differs in other traits and that makes it a non-optimal production base image for Node.js application runtimes.

What is Node Alpine?

Node.js Alpine is an unofficial Docker container image build that is maintained by the Node.js Docker team. The Node.js image bundles the Alpine operating system which is powered by the minimal busybox software tooling and the musl C library implementation. These two Node.js Alpine image characteristics contribute to the Docker image being unofficially supported by the Node.js team. Furthermore, many security vulnerabilities scanners can’t easily detect software artifacts or runtimes on Node.js Alpine images, which is counterproductive to efforts to secure your container images.

Regardless of using the Node.js Alpine image tag, using a base image directive in the form of a word alias could still pull new builds of that tag because Docker image tags are mutable. We can find the SHA256 hash for it in the Docker Hub for this Node.js tag, or by running the following command once we pulled this image locally, and locate the Digest field in the output:

$ docker pull node:16.17.0-bullseye-slim
5b1423465504: Already exists
2f232a362cd9: Already exists
aa653d801310: Already exists
25750f98abe8: Already exists
476cb0003ed3: Already exists
Digest: sha256:18ae6567b623f8c1caada3fefcc8746f8e84ad5c832abd909e129f6b13df25b4
Status: Downloaded newer image for node:16.17.0-bullseye-slim
docker.io/library/node:16.17.0-bullseye-slim

Another way to find the SHA256 hash is by running the following command:

$ docker images --digests
REPOSITORY TAG DIGEST IMAGE ID CREATED SIZE
node 16.17.0-bullseye-slim sha256:18ae6567b623f8c1caada3fefcc8746f8e84ad5c832abd909e129f6b13df25b4 f8e42f13e99d 6 days ago 183MB

Now we can update the Dockerfile for this Node.js Docker image as follows:

FROM node@sha256:18ae6567b623f8c1caada3fefcc8746f8e84ad5c832abd909e129f6b13df25b4
WORKDIR /usr/src/app
COPY . /usr/src/app
RUN npm install
CMD "npm" "start"

However, the Dockerfile above, only specifies the Node.js Docker image name without an image tag which creates ambiguity for which exact image tag is being used—it’s not readable, hard to maintain and doesn’t create a good developer experience.

Let’s fix it by updating the Dockerfile, providing the full base image tag for the Node.js version that corresponds to that SHA256 hash:

FROM node:16.17.0-bullseye-slim@sha256:18ae6567b623f8c1caada3fefcc8746f8e84ad5c832abd909e129f6b13df25b4
WORKDIR /usr/src/app
COPY . /usr/src/app
RUN npm install
CMD "npm" "start"

Using the Docker image digest ensures a deterministic image but it could be confusing or counterproductive for some image scanning tools who may not know how to interpret this. For that reason, using an explicit Node.js runtime version such as 16.17.0 is preferred. Even if theoretically it is mutable and can be overridden, in practice, if it needs to receive security or other updates they will be pushed to a new version such as 16.17.1 so it is safe enough to assume deterministic builds.

Therefore, our final proposed Dockerfile at this stage is the following:

FROM node:16.17.0-bullseye-slim
WORKDIR /usr/src/app
COPY . /usr/src/app
RUN npm install
CMD "npm" "start"

2. Install only production dependencies in the Node.js Docker image

The following Dockerfile directive installs all dependencies in the container, including devDependencies, which aren’t needed for a functional application to work. It adds an unneeded security risk from packages used as development dependencies, as well as inflating the image size unnecessarily.

RUN npm install

If you followed my previous guide on 10 npm security best practices then you know that you want to enforce deterministic builds with npm ci. This prevents surprises in a continuous integration (CI) flow because it halts if any deviations from the lockfile are made.

In the case of building a Docker image for production we want to ensure that we only install production dependencies in a deterministic way, and this brings us to the following recommendation for the best practice for installing npm dependencies in a container image:

RUN npm ci --only=production

The updated Dockerfile contents in this stage are as follows:

FROM node:16.17.0-bullseye-slim
WORKDIR /usr/src/app
COPY . /usr/src/app
RUN npm ci --only=production
CMD "npm" "start"

Read our article about software dependencies to learn more.

3. Optimize Node.js tooling for production

When you build your Node.js Docker image for production, you want to ensure that all frameworks and libraries are using the optimal settings for performance and security.

This brings us to add the following Dockerfile directive:

**ENV NODE\_ENV production**

At first glance, this looks redundant, since we already specified only production dependencies in the npm install phase—so why is this necessary?

Developers mostly associate the NODE_ENV=production environment variable setting with the installation of production-related dependencies, however, this setting also has other effects which we need to be aware of.

Some frameworks and libraries may only turn on the optimized configuration that is suited to production if that NODE_ENV environment variable is set to production. Putting aside our opinion on whether this is a good or bad practice for frameworks to take, it is important to know this.

As an example, the Express documentation outlines the importance of setting this environment variable for enabling performance and security related optimizations:

The performance impact of the NODE_ENV variable could be very significant.

The kind folks at Dynatrace have put together a blog post which details the drastic effects of omitting NODE_ENV in your Express applications.

Many of the other libraries that you are relying on may also expect this variable to be set, so we should set this in our Dockerfile.

The updated Dockerfile should now read as follows with the NODE_ENV environment variable setting baked in:

FROM node:16.17.0-bullseye-slim
ENV NODE_ENV production
WORKDIR /usr/src/app
COPY . /usr/src/app
RUN npm ci --only=production
CMD "npm" "start"

4. Don’t run containers as root

The principle of least privilege is a security principle from the early days of Unix and we should always follow this when we’re running our containerized Node.js web applications.

The threat assessment is pretty straight-forward—if an attacker is able to compromise the web application in a way that allows for command injection or directory path traversal, then these will be invoked with the user who owns the application process. If that process happens to be root then they can do virtually everything within the container, including attempting a container escape or privilege escalation. Why would we want to risk it? You’re right, we don’t.

Repeat after me: “friends don’t let friends run containers as root!”

The official node Docker image, as well as its variants like alpine, include a least-privileged user of the same name: node. However, it’s not enough to just run the process as node. For example, the following is not ideal for application function:

USER node
CMD "npm" "start"

The reason for that is the USER Dockerfile directive only ensures that the process is owned by the node user. What about all the files we copied earlier with the COPY instruction? They are owned by root. That’s how Docker works by default.

The complete and proper way of dropping privileges is as follows, also showing our up to date Dockerfile practices up to this point:

FROM node:16.17.0-bullseye-slim
ENV NODE_ENV production
WORKDIR /usr/src/app
COPY --chown=node:node . /usr/src/app
RUN npm ci --only=production
USER node
CMD "npm" "start"

5. Safely terminate Node.js Docker web applications

One of the most common mistakes I see with blogs and articles about containerizing Node.js applications when running in Docker containers is the way that they invoke the process. All of the following and their variants are bad patterns you should avoid:

CMD “npm” “start”
CMD [“yarn”, “start”]
CMD “node” “server.js”
CMD “start-app.sh”

Let’s dig in! I’ll walk you through each of the flawed invoking processes and explain why they should be avoided.

The following concerns are key in order to understanding the context for properly running and terminating Node.js Docker applications:

An orchestration engine, such as Docker Swarm, Kubernetes, or even just the Docker engine itself, needs a way to send signals to the process in the container. Mostly, these are signals to terminate an application, such as SIGTERM and SIGKILL.
The process may run indirectly, and if that happens then it’s not always guaranteed that it will receive these signals.
The Linux kernel treats processes that run as process ID 1 (PID) differently than any other process ID.

Equipped with that knowledge, let’s begin investigating the ways of invoking the process for a container, starting off with the example from the Dockerfile we’re building:

CMD "npm" "start"

The caveat here is two fold. Firstly, we’re indirectly running the node application by directly invoking the npm client. Who’s to say that the npm CLI forwards all events to the node runtime? It actually doesn’t, and we can easily test that.

Make sure that in your Node.js application you set an event handler for the SIGHUP signal which logs to the console every time you’re sending an event. A simple code example should look as follows:

function handle(signal) {
   console.log(`*^!@4=> Received event: ${signal}`)
}
process.on('SIGHUP', handle)

Then run the container, and once it’s up specifically send it the SIGHUP signal using thedocker CLI and the special --signal command line flag:

$ docker kill --signal=SIGHUP elastic_archimedes

Nothing happened, right? That’s because the npm client doesn’t forward any signals to the node process that it spawned.

The other caveat has to do with the different ways in which way you can specify the CMD directive in the Dockerfile. There are two ways, and they are not the same:

the shellform notation, in which the container spawns a shell interpreter that wraps the process. In such cases, the shell may not properly forward signals to your process.
the execform notation, which directly spawns a process without wrapping it in a shell. It is specified using the JSON array notation, such as: CMD [“npm”, “start”]. Any signals sent to the container are directly sent to the process.

Based on that knowledge, we want to improve our Dockerfile process execution directive as follows:

CMD ["node", "server.js"]

We are now invoking the node process directly, ensuring that it receives all of the signals sent to it, without it being wrapped in a shell interpreter.

However, this introduces another pitfall.

When processes run as PID 1 they effectively take on some of the responsibilities of an init system, which is typically responsible for initializing an operating system and processes. The kernel treats PID 1 in a different way than it treats other process identifiers. This special treatment from the kernel means that the handling of a SIGTERM signal to a running process won’t invoke a default fallback behavior of killing the process if the process doesn’t already set a handler for it.

To quote the Node.js Docker working group recommendation on this: “Node.js was not designed to run as PID 1 which leads to unexpected behaviour when running inside of Docker. For example, a Node.js process running as PID 1 will not respond to SIGINT (CTRL-C) and similar signals”.

The way to go about it then is to use a tool that will act like an init process, in that it is invoked with PID 1, then spawns our Node.js application as another process whilst ensuring that all signals are proxied to that Node.js process. If possible, we’d like a small as possible tooling footprint for doing so to not risk having security vulnerabilities added to our container image.

One such tool that we use at Snyk is dumb-init because it is statically linked and has a small footprint. Here’s how we’ll set it up:

RUN apt-get update && apt-get install -y --no-install-recommends dumb-init
CMD ["dumb-init", "node", "server.js"]

This brings us to the following up to date Dockerfile. You’ll notice that we placed the dumb-init package install right after the image declaration, so we can take advantage of Docker’s caching of layers:

FROM node:16.17.0-bullseye-slim
RUN RUN apt-get update && apt-get install -y --no-install-recommends dumb-init
ENV NODE_ENV production
WORKDIR /usr/src/app
COPY --chown=node:node . .
RUN npm ci --only=production
USER node
CMD ["dumb-init", "node", "server.js"]

When we use Docker’s RUN instruction to add software like we did with RUN apt-get update && apt-get install then we leave behind some information on the Docker image. To clean up after this command, we can extend it as follows and keep a slimmer Docker image:

FROM node:16.17.0-bullseye-slim
RUN apt-get update && apt-get install -y --no-install-recommend dumb-init
ENV NODE_ENV production
WORKDIR /usr/src/app
COPY --chown=node:node . .
RUN npm ci --only=production
USER node
CMD ["dumb-init", "node", "server.js"]

Tip: it’s even better to install the dumb-init tool in an earlier build stage image, and then copy the resulting /usr/bin/dumb-init file to the final container image to keep that image clean. We’ll learn more about multistage Docker builds later on this guide.

Good to know: docker kill and docker stop commands only send signals to the container process with PID 1. If you’re running a shell script that runs your Node.js application, then take note that a shell instance—such as /bin/sh, for example—doesn’t forward signals to child processes, which means your app will never get a SIGTERM.

Secure your containerized Node.js web applications

6. Graceful shutdown for your Node.js web applications

If we’re already discussing process signals that terminate applications, let’s make sure we’re shutting them down properly and gracefully without disrupting users.

When a Node.js application receives an interrupt signal, also known as SIGINT, or CTRL+C, it will cause an abrupt process kill, unless any event handlers were set of course to handle it in a different behavior. This means that connected clients to a web application will be immediately disconnected. Now, imagine hundreds of Node.js web containers orchestrated by Kubernetes, going up and down as needs arise to scale or manage errors. Not the greatest user experience.

You can easily simulate this problem. Here’s a stock Fastify web application example, with an inherent delayed response of 60 seconds for an endpoint:

fastify.get('/delayed', async (request, reply) => {
 const SECONDS_DELAY = 60000
 await new Promise(resolve => {
     setTimeout(() => resolve(), SECONDS_DELAY)
 })
 return { hello: 'delayed world' }
})

const start = async () => {
 try {
   await fastify.listen(PORT, HOST)
   console.log(`*^!@4=> Process id: ${process.pid}`)
 } catch (err) {
   fastify.log.error(err)
   process.exit(1)
 }
}

start()

Run this application and once it’s running send a simple HTTP request to this endpoint:

$ time curl http://localhost:3000/delayed

Hit CTRL+C in the running Node.js console window and you’ll see that the curl request exited abruptly. This simulates the same experience your users would receive when containers tear down.

To provide a better experience, we can do the following:

Set an event handler for the various termination signals like SIGINT and SIGTERM.
The handler waits for clean up operations like database connections, ongoing HTTP requests and others.
The handler then terminates the Node.js process.

Specifically with Fastify, we can have our handler call on fastify.close() which returns a promise that we will await, and Fastify will also take care to respond to every new connection with the HTTP status code 503 to signal that the application is unavailable.

Let’s add our event handler:

async function closeGracefully(signal) {
   console.log(`*^!@4=> Received signal to terminate: ${signal}`)

   await fastify.close()
   // await db.close() if we have a db connection in this app
   // await other things we should cleanup nicely
   process.kill(process.pid, signal);
}
process.once('SIGINT', closeGracefully)
process.once('SIGTERM', closeGracefully)

Admittedly, this is more of a generic web application concern than Dockerfile related, but is even more important in orchestrated environments.

7. Find and fix security vulnerabilities in your Node.js docker image

Remember how we discussed the importance of small Docker base images for our Node.js applications? Let’s put this test into practice.

I’m going to use the Snyk CLI to test our Docker image. You can sign up for a free Snyk account here.

$ npm install -g snyk
$ snyk auth
$ snyk container test node:16.17.0-bullseye-slim --file=Dockerfile

The first command installs the Snyk CLI, followed by a quick sign-in flow from the command line to fetch an API key, and then we can test the container for any security issues. Here is the result:

Organization: lirantal
Package manager: deb
Target file: Dockerfile
Project name: docker-image|node
Docker image: node:16.17.0-bullseye-slim
Platform: linux/arm64
Base image: node:lts-bullseye-slim
Licenses: enabled

Tested 97 dependencies for known issues, found 44 issues.

According to our scan, you are currently using the most secure version of the selected base image

Snyk detected 97 operating system dependencies, including our Node.js runtime executable, and did not find any vulnerable versions of the runtime. However, 44 security vulnerabilities do exist for some of the software within the container image. 43 of these dependencies are low severity issues, and 1 is a critical zlib library related vulnerability:

✗ Low severity vulnerability found in apt/libapt-pkg6.0
  Description: Improper Verification of Cryptographic Signature
  Info: https://snyk.io/vuln/SNYK-DEBIAN11-APT-522585
  Introduced through: apt/libapt-pkg6.0@2.2.4, apt@2.2.4
  From: apt/libapt-pkg6.0@2.2.4
  From: apt@2.2.4 > apt/libapt-pkg6.0@2.2.4
  From: apt@2.2.4
  Image layer: Introduced by your base image (node:lts-bullseye-slim)

✗ Critical severity vulnerability found in zlib/zlib1g
  Description: Out-of-bounds Write
  Info: https://snyk.io/vuln/SNYK-DEBIAN11-ZLIB-2976151
  Introduced through: meta-common-packages@meta
  From: meta-common-packages@meta > zlib/zlib1g@1:1.2.11.dfsg-2+deb11u1
  Image layer: Introduced by your base image (node:lts-bullseye-slim)
  Fixed in: 1:1.2.11.dfsg-2+deb11u2

How to fix Docker image vulnerabilities?

One effective and quick way to keep up with secure software in your Docker image is to rebuild the Docker image. You would depend on the upstream Docker base image you use to fetch these updates for you. Another way is to explicitly install OS system updates for packages, including security fixes.

With the official Node.js Docker image, the team may be slower to respond with image updates and so rebuilding the Node.js Docker image 16.17.0-bullseye-slim or lts-bullseye-slim will not be effective. The other option is to manage your own base image with up-to-date software from Debian. In our Dockerfile we can do this as follows:

RUN apt-get update && apt-get upgrade -y

Let’s run the Snyk security scan after building the Node.js Docker image with that newly added RUN instruction:

✗ Low severity vulnerability found in apt/libapt-pkg6.0
  Description: Improper Verification of Cryptographic Signature
  Info: https://snyk.io/vuln/SNYK-DEBIAN11-APT-522585
  Introduced through: apt/libapt-pkg6.0@2.2.4, apt@2.2.4
  From: apt/libapt-pkg6.0@2.2.4
  From: apt@2.2.4 > apt/libapt-pkg6.0@2.2.4
  From: apt@2.2.4
  Image layer: Introduced by your base image (node:16.17.0-bullseye-slim)
…
Tested 98 dependencies for known issues, found 43 issues.
According to our scan, you are currently using the most secure version of the selected base image

It resulted in one more OS dependency added (98 vs 97 before), but now all 43 security vulnerabilities that impact this Node.js Docker image are of low severity, and we’ve remediated the critical zlib security vulnerability. That’s a great win for us!

What would happen if we had used the FROM node base image directive?

Even better, let’s assume you had used a more specific Node.js Docker base image, such as this:

FROM node:14.2.0-slim

…

✗ High severity vulnerability found in node
  Description: Memory Corruption
  Info: https://snyk.io/vuln/SNYK-UPSTREAM-NODE-570870
  Introduced through: node@14.2.0
  From: node@14.2.0
  Introduced by your base image (node:14.2.0-slim)
  Fixed in: 14.4.0

✗ High severity vulnerability found in node
  Description: Denial of Service (DoS)
  Info: https://snyk.io/vuln/SNYK-UPSTREAM-NODE-674659
  Introduced through: node@14.2.0
  From: node@14.2.0
  Introduced by your base image (node:14.2.0-slim)
  Fixed in: 14.11.0

Organization: snyk-demo-567
Package manager: deb
Target file: Dockerfile
Project name: docker-image|node
Docker image: node:14.2.0-slim
Platform: linux/amd64
Base image: node:14.2.0-slim

Tested 78 dependencies for known issues, found 82 issues.

Base Image Vulnerabilities Severity
node:14.2.0-slim 82 23 high, 11 medium, 48 low

Recommendations for base image upgrade:

Minor upgrades
Base Image Vulnerabilities Severity
node:14.15.1-slim 71 17 high, 7 medium, 47 low

Major upgrades
Base Image Vulnerabilities Severity
node:15.4.0-slim 71 17 high, 7 medium, 47 low

Alternative image types
Base Image Vulnerabilities Severity
node:14.15.1-buster-slim 55 12 high, 4 medium, 39 low
node:14.15.3-stretch-slim 71 17 high, 7 medium, 47 low

While it seems that a specific Node.js runtime version such as FROM node:14.2.0-slim is good enough because you specified a specific version (the 14.2.0) and also the use of a small container image (due to the slim image tag), Snyk is able to find security vulnerabilities from 2 primary sources:

The Node.js runtime itself—did you notice the two leading security vulnerabilities in the report above? These are publicly known security issues in the Node.js runtime. The immediate fix to these would be to upgrade to a newer Node.js version, which Snyk tells you about and also tells you which version fixed it—14.11.0, as you can see in the output.
Tooling and libraries installed in this debian base image, such as glibc, bzip2, gcc, perl, bash, tar, libcrypt and others. While these vulnerable versions in the container may not pose an immediate threat, why have them if we’re not using them?

The best part of this Snyk CLI report? Snyk also recommends other base images to switch to, so you don’t have to figure this out yourself. Finding alternative images could be very time consuming, so Snyk saves you that trouble.

My recommendation at this stage is as follows:

If you are managing your Docker images in a registry, such as Docker Hub or Artifactory, you can easily import them into Snyk so that the platform finds these vulnerabilities for you. This will also give you recommendation advice in the Snyk UI as well as monitoring your Docker images on an ongoing basis for newly discovered security vulnerabilities.
Use the Snyk CLI in your CI automation. The CLI is very flexible and that’s exactly why we created it—so you can apply it to any custom workflows you have. We also have Snyk for GitHub Actions, if you fancy those.

For more ways to manage vulnerabilities in container images, check out our Container Security guide.

8. Use multi-stage builds

Multi-stage builds are a great way to move from a simple, yet potentially erroneous Dockerfile, into separated steps of building a Docker image, so we can avoid leaking sensitive information. Not only that, but we can also use a bigger Docker base image to install our dependencies, compile any native npm packages if needed, and then copy all these artifacts into a small production base image, like our alpine example.

Prevent sensitive information leak

The use-case here to avoid sensitive information leakage is more common than you think.

If you’re building Docker images for work, there’s a high chance that you also maintain private npm packages. If that’s the case, then you probably needed to find some way to make that secret NPM_TOKEN available to the npm install.

Here’s an example for what I’m talking about:

FROM node:16.17.0-bullseye-slim

RUN apt-get update && apt-get install -y --no-install-recommend dumb-init
ENV NODE_ENV production
ENV NPM_TOKEN 1234
WORKDIR /usr/src/app
COPY --chown=node:node . .
#RUN npm ci --only=production
RUN echo "//registry.npmjs.org/:_authToken=$NPM_TOKEN" > .npmrc && \
   npm ci --only=production
USER node
CMD ["dumb-init", "node", "server.js"]

Doing this, however, leaves the .npmrc file with the secret npm token inside the Docker image. You could attempt to improve it by deleting it afterwards, like this:

RUN echo "//registry.npmjs.org/:_authToken=$NPM_TOKEN" > .npmrc && \
   npm ci --only=production
RUN rm -rf .npmrc

However, now the .npmrc file is available in a different layer of the Docker image. If this Docker image is public, or someone is able to access it somehow, then your token is compromised. A better improvement would be as follows:

RUN echo "//registry.npmjs.org/:_authToken=$NPM_TOKEN" > .npmrc && \
   npm ci --only=production; \
   rm -rf .npmrc

The problem now is that the Dockerfile itself needs to be treated as a secret asset, because it contains the secret npm token inside it.

Luckily, Docker supports a way to pass arguments into the build process:

ARG NPM_TOKEN
RUN echo "//registry.npmjs.org/:_authToken=$NPM_TOKEN" > .npmrc && \
   npm ci --only=production; \
   rm -rf .npmrc

And then we build it as follows:

**$ docker build . -t nodejs-tutorial --build-arg NPM\_TOKEN=1234**

I know you were thinking that we’re all done at this point but, sorry to disappoint.

That’s how it is with security—sometimes the obvious things are yet just another pitfall.

What’s the problem now, you ponder? Build arguments passed like that to Docker are kept in the history log. Let’s see with our own eyes. Run this command:

**$ docker history nodejs-tutorial**

which prints the following:

IMAGE CREATED CREATED BY SIZE COMMENT
b4c2c78acaba About a minute ago CMD ["dumb-init" "node" "server.js"] 0B buildkit.dockerfile.v0
<missing> About a minute ago USER node 0B buildkit.dockerfile.v0
<missing> About a minute ago RUN |1 NPM_TOKEN=1234 /bin/sh -c echo "//reg… 5.71MB buildkit.dockerfile.v0
<missing> About a minute ago ARG NPM_TOKEN 0B buildkit.dockerfile.v0
<missing> About a minute ago COPY . . # buildkit 15.3kB buildkit.dockerfile.v0
<missing> About a minute ago WORKDIR /usr/src/app 0B buildkit.dockerfile.v0
<missing> About a minute ago ENV NODE_ENV=production 0B buildkit.dockerfile.v0

Did you spot the secret npm token there? That’s what I mean.

There’s a great way to manage secrets for the container image, but this is the time to introduce multi-stage builds as a mitigation for this issue, as well as showing how we can build minimal images.

Introducing multi-stage builds for Node.js Docker images

Just like that principle in software development of Separation of Concerns, we’ll apply the same ideas in order to build our Node.js Docker images. We’ll have one image that we use to build everything that we need for the Node.js application to run, which in a Node.js world, means installing npm packages, and compiling native npm modules if necessary. That will be our first stage.

The second Docker image, representing the second stage of the Docker build, will be the production Docker image. This second and last stage is the image that we actually optimize for and publish to a registry, if we have one. That first image that we’ll refer to as the build image, gets discarded and is left as a dangling image in the Docker host that built it, until it gets cleaned.

Here is the update to our Dockerfile that represents our progress so far, but separated into two stages:

# --------------> The build image
FROM node:latest AS build
RUN apt-get update && apt-get install -y --no-install-recommend dumb-init
ARG NPM_TOKEN
WORKDIR /usr/src/app
COPY package*.json /usr/src/app/
RUN echo "//registry.npmjs.org/:_authToken=$NPM_TOKEN" > .npmrc && \
   npm ci --only=production && \
   rm -f .npmrc

# --------------> The production image
FROM node:16.17.0-bullseye-slim

ENV NODE_ENV production
COPY --from=build /usr/bin/dumb-init /usr/bin/dumb-init
USER node
WORKDIR /usr/src/app
COPY --chown=node:node --from=build /usr/src/app/node_modules /usr/src/app/node_modules
COPY --chown=node:node . /usr/src/app
CMD ["dumb-init", "node", "server.js"]

As you can see, I chose a bigger image for the build stage because I might need tooling like gcc (the GNU Compiler Collection) to compile native npm packages, or for other needs.

In the second stage, there’s a special notation for the COPY directive that copies the node_modules/ folder from the build Docker image into this new production base image.

Also, now, do you see that NPM_TOKEN passed as build argument to the build intermediary Docker image? It’s not visible anymore in the docker history nodejs-tutorial command output because it doesn’t exist in our production docker image.

9. Keeping unnecessary files out of your Node.js Docker images

You have a .gitignore file to avoid polluting the git repository with unnecessary files, and potentially sensitive files too, right? The same applies to Docker images.

What is a Docker ignore file?

Docker has a .dockerignore which will ensure it skips sending any glob pattern matches inside it to the Docker daemon. Here is a list of files to give you an idea of what you might be putting into your Docker image that we’d ideally want to avoid:

– .dockerignore

– node_modules

– npm-debug.log

– Dockerfile

– .git

– .gitignore

As you can see, the node_modules/ is actually quite important to skip because if we hadn’t ignored it, then the simplistic Dockerfile version that we started with would have caused the local node_modules/ folder to be copied over to the container as-is.

FROM node:16.17.0-bullseye-slim
WORKDIR /usr/src/app
COPY . /usr/src/app
RUN npm install
CMD "npm" "start"

In fact, it’s even more important to have a .dockerignore file when you are practicing multi-stage Docker builds. To refresh your memory on how the 2nd stage Docker build looks like:

# --------------> The production image
FROM node:16.17.0-bullseye-slim

ENV NODE_ENV production
COPY --from=build /usr/bin/dumb-init /usr/bin/dumb-init
USER node
WORKDIR /usr/src/app
COPY --chown=node:node --from=build /usr/src/app/node_modules /usr/src/app/node_modules
COPY --chown=node:node . /usr/src/app
CMD ["dumb-init", "node", "server.js"]

The importance of having a .dockerignore is that when we do a COPY . /usr/src/app from the 2nd Dockerfile stage, we’re also copying over any local node_modules/ to the Docker image. That’s a big no-no as we may be copying over modified source code inside node_modules/.

On top of that, since we’re using the wild-card COPY .we may also be copying into the Docker image sensitive files that include credentials or local configuration.

The take-away here for a .dockerignore file is:

Skip potentially modified copies of node_modules/ in the Docker image.
Saves you from secrets exposure such as credentials in the contents of .env or aws.json files making their way into the Node.js Docker image.
It helps speed up Docker builds because it ignores files that would have otherwise caused a cache invalidation. For example, if a log file was modified, or a local environment configuration file, all would’ve caused the Docker image cache to invalidate at that layer of copying over the local directory.

10. Mounting secrets into the Docker build image

One thing to note about the .dockerignore file is that it is an all or nothing approach and can’t be turned on or off per build stages in a Docker multi-stage build.

Why is it important? Ideally, we would want to use the .npmrc file in the build stage, as we may need it because it includes a secret npm token to access private npm packages. Perhaps it also needs a specific proxy or registry configuration to pull packages from.

This means that it makes sense to have the .npmrc file available to the build stage—however, we don’t need it at all in the second stage for the production image, nor do we want it there as it may include sensitive information, like the secret npm token.

One way to mitigate this .dockerignore caveat is to mount a local file system that will be available for the build stage, but there’s a better way.

Docker supports a relatively new capability referred to as Docker secrets, and is a natural fit for the case we need with .npmrc. Here is how it works:

When we run the docker build command we will specify command line arguments that define a new secret ID and reference a file as the source of the secret.
In the Dockerfile, we will add flags to the RUN directive to install the production npm, which mounts the file referred by the secret ID into the target location—the local directory .npmrc file which is where we want it available.
The .npmrc file is mounted as a secret and is never copied into the Docker image.
Lastly, let’s not forget to add the .npmrc file to the contents of the .dockerignore file so it doesn’t make it into the image at all, for either the build nor production images.

Let’s see how all of it works together. First the updated .dockerignore file:

.dockerignore
node_modules
npm-debug.log
Dockerfile
.git
.gitignore
.npmrc

Then, the complete Dockerfile, with the updated RUN directive to install npm packages while specifying the .npmrc mount point:

# --------------> The build image
FROM node:latest AS build
RUN apt-get update && apt-get install -y --no-install-recommend dumb-init
WORKDIR /usr/src/app
COPY package*.json /usr/src/app/
RUN --mount=type=secret,mode=0644,id=npmrc,target=/usr/src/app/.npmrc npm ci --only=production

# --------------> The production image
FROM node:16.17.0-bullseye-slim

ENV NODE_ENV production
COPY --from=build /usr/bin/dumb-init /usr/bin/dumb-init
USER node
WORKDIR /usr/src/app
COPY --chown=node:node --from=build /usr/src/app/node_modules /usr/src/app/node_modules
COPY --chown=node:node . /usr/src/app
CMD ["dumb-init", "node", "server.js"]

And finally, the command that builds the Node.js Docker image:

$ docker build . -t nodejs-tutorial --secret id=npmrc,src=.npmrc

Note: Secrets are a new feature in Docker and if you’re using an older version, you might need to enable it Buildkit as follows:

$ DOCKER_BUILDKIT=1 docker build . -t nodejs-tutorial --build-arg NPM_TOKEN=1234 --secret id=npmrc,src=.npmrc

Summary

You made it all the way to create an optimized Node.js Docker base image. Great job!

That last step wraps up this entire guide on containerizing Node.js Docker web applications, taking into consideration performance and security related optimizations to ensure we’re building production-grade Node.js Docker images!

Follow-up resources that I highly encourage you to review:

Once you build secure and performant Docker base images for your Node.js applications – find and fix your container vulnerabilities with a free Snyk account.

DEV Community