DEV Community

Natalia Vayngolts for Otomato

How to optimize production Docker images running Node.js with Yarn

Node.js projects usually have many dependencies. Building the project produces a large number of redundant files, which can become a serious problem when the application is shipped as a Docker image.

Most of these files are not needed to run the application; they just take up space. Cached packages and dev dependencies, for instance, add modules that are required only during development.

Sometimes this inessential data amounts to hundreds of megabytes, which makes Docker images harder to work with. The bigger the image, the more storage it uses, and the slower it is to build and deploy.

"@nestjs/cli": "^8.2.4",
"@nestjs/common": "^8.4.4",
"@nestjs/core": "^8.4.4",
"@nestjs/jwt": "^8.0.0",
"@nestjs/passport": "^8.2.1",
"@nestjs/platform-express": "^8.4.4",
"@nestjs/serve-static": "^2.2.2",
"@nestjs/swagger": "^5.2.0",
"@nestjs/typeorm": "^8.0.3",
"@sentry/node": "^7.0.0",
"@types/cookie-parser": "^1.4.3",
"bcryptjs": "^2.4.3",
"body-parser": "^1.19.2",
"bull": "^4.7.0",
"class-transformer": "^0.5.1",
"class-validator": "^0.13.2",
"cookie-parser": "^1.4.6",
"cross-env": "^7.0.3",
"dayjs": "^1.11.3",
"dotenv": "^16.0.0",
"express-basic-auth": "^1.2.1",
"flagsmith-nodejs": "^1.1.1",
"jsonwebtoken": "^8.5.1",
"passport": "^0.5.2",
"passport-apple": "^2.0.1",
"passport-facebook": "^3.0.0",
"passport-google-oauth20": "^2.0.0",
"passport-http": "^0.3.0",
"passport-jwt": "^4.0.0",
"passport-local": "^1.0.0",
"pg": "^8.7.3",
"pg-connection-string": "^2.5.0",
"redis": "^4.0.4",
"reflect-metadata": "^0.1.13",
"rimraf": "^3.0.2",
"rxjs": "^7.2.0",
"swagger-ui-express": "^4.3.0",
"typeorm": "0.2",
"uuid": "^8.3.2"

The example_1 image is built without any optimization. Its size on disk is about 1 GB.

Original 1 GB image

Uploading it to a registry transfers about 900 MB.

Original 1 GB image to a registry

Contents of the Dockerfile:

FROM node:16.15-alpine
USER node
RUN mkdir -p /home/node/app
WORKDIR /home/node/app
COPY --chown=node . .
RUN yarn install
CMD ["yarn", "start"]

Let’s run the image and check what’s inside the container:

docker run -it --rm example_1 sh

Once in the shell, go to the home directory and check the actual size of each subdirectory:

~ $ du -d 1 -h
8.0K    ./.yarn
594.3M  ./app
560.9M  ./.cache
1.1G    .

According to the Yarn website,

Yarn stores every package in a global cache in your user directory on the file system.

As one can see, the .cache directory holds backed-up packages for offline access and takes about 560 MB. On closer inspection, the folders turn out to contain the sources of npm dependencies:

cached npm dependencies

Counting the lines printed by ls -la shows roughly 970 entries in total:

~/.cache/yarn/v6 $ ls -la | wc -l
970

A dependency directory may contain something like this:

dependency sources

The cache folder can be emptied with the yarn cache clean command.

Slight changes to the RUN instruction in the Dockerfile

FROM node:16.15-alpine
USER node
RUN mkdir -p /home/node/app
WORKDIR /home/node/app
COPY --chown=node . .
RUN yarn install && yarn cache clean
CMD ["yarn", "start"]

lead to a significant change in the size of the image (example_2):

updated image with cache cleaned

As can be seen, the .cache folder is almost empty:

~ $ du -d 1 -h
8.0K    ./.yarn
594.3M  ./app
12.0K   ./.cache
594.3M  .
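It may help to spell out why the clean-up is chained into the same RUN instruction. Each RUN instruction produces its own image layer, and files deleted by a later instruction still take up space in the earlier layer. The fragment below is only a sketch contrasting the two forms; it is not a complete Dockerfile, and only one of the two variants would be used.

```dockerfile
# Variant A (does not shrink the image): the Yarn cache is written into the
# layer created by the first RUN; the second RUN only hides it in a new layer.
RUN yarn install
RUN yarn cache clean

# Variant B (the form used above): the cache is created and removed within
# a single layer, so it never ends up in the image.
RUN yarn install && yarn cache clean
```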

The image can be made even smaller by installing only the production Node.js dependencies, leaving out dev modules that are needed only for development and testing. To do so, add the --production flag to the yarn install command:

FROM node:16.15-alpine
USER node
RUN mkdir -p /home/node/app
WORKDIR /home/node/app
COPY --chown=node . .
RUN yarn install --production && yarn cache clean
CMD ["yarn", "start"]

As a result, the example_3 image is less than half the size of the original example_1.

updated image with production dependencies

The app folder with production dependencies installed takes 469 MB instead of 594 MB now.

~ $ du -d 1 -h
8.0K    ./.yarn
469.0M  ./app
12.0K   ./.cache
469.1M  .
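As a possible alternative to the explicit flag, Yarn 1 (classic) also skips devDependencies when the NODE_ENV environment variable is set to production. The sketch below assumes that the base image ships Yarn 1 and that the application behaves correctly with NODE_ENV=production at runtime, because the ENV value persists into the running container.

```dockerfile
FROM node:16.15-alpine
# Assumption: the application tolerates NODE_ENV=production at runtime too,
# since ENV stays set in the running container.
ENV NODE_ENV=production
USER node
RUN mkdir -p /home/node/app
WORKDIR /home/node/app
COPY --chown=node . .
# With NODE_ENV=production, Yarn 1 leaves out devDependencies
# even without the --production flag.
RUN yarn install && yarn cache clean
CMD ["yarn", "start"]
```

The explicit --production flag keeps the intent visible in the Dockerfile itself, which is probably why it is the more common choice.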

Another option is a multi-stage build that copies only the required artifacts from the stage where the build was made.

FROM node:16.15-alpine AS builder

USER node

RUN mkdir -p /home/node/app

WORKDIR /home/node/app

COPY --chown=node . .
# Building the production-ready application code - alias to 'nest build'
RUN yarn install --production && yarn build

FROM node:16.15-alpine

USER node

WORKDIR /home/node/app

COPY --from=builder --chown=node /home/node/app/node_modules ./node_modules
# Copying the production-ready application code, so it's one of few required artifacts
COPY --from=builder --chown=node /home/node/app/dist ./dist
COPY --from=builder --chown=node /home/node/app/public ./public
COPY --from=builder --chown=node /home/node/app/package.json .

CMD [ "yarn", "start" ]

The application in this example is built with NestJS, a framework for building efficient and scalable Node.js applications with TypeScript.

The example_4 image has almost the same size as the example_3 one:

multi-stage built image

And finally, uploading it to a registry transfers only about 350 MB:

multi-stage built image upload

Thus, the image size is reduced by more than half, from 1 GB to about 460 MB. The application takes less storage and less time to deploy.

Discussion (2)

Mark Davydov

Great article for those who are trying to make their image slimmer :)
I'll add a couple of things:

  1. RUN mkdir -p /home/node/app: you don't really need it, WORKDIR creates the dir if it doesn't exist (that's an extra layer).
  2. You don't want a plain yarn install: it can update your packages to the latest versions your semantic version ranges permit, so you don't really know what goes in there, and it also updates your yarn.lock file, which you don't maintain for nothing. Usually it is better to use yarn install --frozen-lockfile.
    Also, with newer versions of Yarn, use yarn install --immutable --immutable-cache --check-cache, as explained here: yarnpkg.com/cli/install

  3. You probably don't want to use yarn start in your production containers: it can mess up the SIGTERM and SIGKILL signals that Kubernetes or Docker Swarm (or any other orchestration tool) will send to your container.
    For more info read here: snyk.io/blog/10-best-practices-to-... , number 5.

Also, I would suggest using the
github.com/wagoodman/dive tool to dive into your layers and understand where the big MBs come from.
slim.ai/ can also help you with that.
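
Suggestions 2 and 3 from the comment above could be applied to the article's multi-stage Dockerfile roughly as follows. This is only a sketch under assumptions: it presumes a yarn.lock file is committed next to package.json, that the image ships Yarn 1 (where --frozen-lockfile is available), and that the build emits its entry point at dist/main.js, a path not stated in the article. The mkdir line is kept so that the directory is created by the node user; the sketch does not assume that WORKDIR alone would give that user write access.

```dockerfile
FROM node:16.15-alpine AS builder
USER node
# Kept from the article so the directory is created by the node user.
RUN mkdir -p /home/node/app
WORKDIR /home/node/app
COPY --chown=node . .
# --frozen-lockfile makes the install fail instead of rewriting yarn.lock
# when the lockfile and package.json are out of sync.
RUN yarn install --production --frozen-lockfile && yarn build

FROM node:16.15-alpine
USER node
WORKDIR /home/node/app
COPY --from=builder --chown=node /home/node/app/node_modules ./node_modules
COPY --from=builder --chown=node /home/node/app/dist ./dist
COPY --from=builder --chown=node /home/node/app/public ./public
COPY --from=builder --chown=node /home/node/app/package.json .
# Assumption: dist/main.js is the compiled entry point. Starting node directly
# means stop signals are not relayed through a yarn wrapper process.
CMD ["node", "dist/main.js"]
```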

Sebastian Gieselmann

Thank you for your time. Useful tips / impulses for production OCI images 👌