TL;DR
You can use reproducible-containers/buildkit-cache-dance to reuse cached files in your Docker build and an action like tespkg/actions-cache or actions/cache to persist the cache externally. See the GitHub repo for an example setup.
Jump to the Implementation section to see how everything is set up.
Introduction
Problem Statement
We all know how massive the dependency tree for NPM modules can get. And while PNPM provides significant performance improvements for local development, the same can't always be said for CI pipelines. For instance, in GitHub Actions, it's not uncommon for each build to install the same NPM modules every time since the jobs start with a clean state.
This problem is compounded when we take into account native modules that rely on node-gyp to be compiled from source, which only increases the time it takes to install the modules.
It's fairly straightforward to cache files in a standard CI pipeline, but it gets more complicated when the modules are installed as part of a Docker build.
Background
In my scenario, the primary problem with downloading and compiling the NPM modules on every build wasn't the time wasted. The problem was that some native (compiled) dependencies didn't play well with the ARM-based CPUs we used, which we had chosen because they're cheaper to run on AWS than their x64 counterparts. The result was flaky builds that failed around 30% of the time, and the only workaround was to manually re-run the build until it succeeded.
Caching the modules nearly eliminated the build failures. The caching approach worked well since the NPM modules only changed once or twice a week for the project.
Other attempted approaches
- Docker layer caching - required too much disk space and was very volatile
- Base image with NPM modules - complex to implement since a new Docker image would have to be created every time the NPM modules change
Limitations
There are likely better ways to solve this problem, but given the size of the team and the urgency of the problem, we needed a solution that was relatively simple to implement, required minimal maintenance, and could be implemented sooner rather than later.
Deep Dive
As stated in the introduction, caching files in a standard CI pipeline is fairly straightforward. However, it's not as straightforward to do within Docker due to its limitations.
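For comparison, here is a minimal sketch of what caching looks like when the install runs directly on the runner instead of inside Docker. The job name, Node.js version, and PNPM version are assumptions chosen for illustration, not values from the project described here.

```yaml
name: Test
on:
  push:
jobs:
  test:
    runs-on: ubuntu-22.04
    steps:
      - uses: actions/checkout@v4
      # Installs PNPM on the runner (version is an assumption)
      - uses: pnpm/action-setup@v4
        with:
          version: 9
      # "cache: pnpm" saves and restores the PNPM store between runs,
      # keyed on the lockfile (Node.js version is an assumption)
      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: pnpm
      - run: pnpm install --frozen-lockfile
```

This works because the PNPM store lives on the runner's filesystem, where the cache step can read and write it. Once the install moves into a Docker build, the store is no longer visible to the runner.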
Docker limitations
As of writing (May 2024), Docker supports exporting the layer cache to external storage, but not cache mounts; the contents of a cache mount are only accessible from inside the build. So if, say, we install NPM modules into a cache mount, there's no native way to access the generated files from outside of Docker.
For instance, RUN --mount=type=cache,target=/pnpm_cache,rw will correctly cache the files in /pnpm_cache and re-use them between builds on the same machine. However, any state or files generated on a worker are cleared between runs in GitHub Actions, rendering the cache useless for this scenario.
The currently proposed solution is to allow Docker to bind the cache directory in the build to a directory on the host. This way the cache could be persisted externally. However, this issue has been open for almost 4 years (since May 27, 2020) with no clear answer as to whether it'll be implemented any time soon.
This is where the reproducible-containers/buildkit-cache-dance GitHub Action comes to the rescue! This Action extracts the files from the Docker build so they can be persisted to external storage like S3, and it is the approach recommended in the official Docker documentation.
Solution
The solution is to use the reproducible-containers/buildkit-cache-dance GitHub Action to extract / inject the cache generated by the Docker build and then use tespkg/actions-cache to save the cache in S3.
Workflow
[Cache miss scenario]
- After running a Docker build, reproducible-containers/buildkit-cache-dance extracts the files from the mounted directory and copies them to a directory on the host machine so they can be accessed outside of the context of Docker
- tespkg/actions-cache uploads the cache to S3. The cached files are compressed and end up much smaller (10-20% of the original size) than the extracted files. In my experience, ~3GB of cache data for PNPM is compressed to less than 300MB.
[Cache hit scenario]
- tespkg/actions-cache downloads the cache from S3 and extracts the contents into the provided directory
- reproducible-containers/buildkit-cache-dance grabs the files from the provided directory and injects them into the Docker build
Implementation
Requirements
- AWS S3 bucket
- AWS IAM user with access to the created bucket
- Dockerfile to build your project
- GitHub Action to build the Docker image
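If you don't have the bucket and IAM user yet, the following is a rough sketch of how they could be declared with AWS CloudFormation. The resource names, the 30-day expiration, and the exact list of S3 actions are assumptions; check what your cache action actually needs before relying on it. The access key for the user still has to be created separately and stored in your repository's variables and secrets.

```yaml
AWSTemplateFormatVersion: "2010-09-09"
Description: Sketch of an S3 bucket and IAM user for CI cache storage (names are hypothetical)
Resources:
  CacheBucket:
    Type: AWS::S3::Bucket
    Properties:
      LifecycleConfiguration:
        Rules:
          # Assumption: stale cache entries can be dropped after 30 days
          - Id: ExpireOldCacheEntries
            Status: Enabled
            ExpirationInDays: 30
  CacheUser:
    Type: AWS::IAM::User
    Properties:
      Policies:
        - PolicyName: ci-cache-access
          PolicyDocument:
            Version: "2012-10-17"
            Statement:
              # Bucket-level permissions (assumed set)
              - Effect: Allow
                Action:
                  - s3:ListBucket
                  - s3:GetBucketLocation
                Resource: !GetAtt CacheBucket.Arn
              # Object-level permissions (assumed set)
              - Effect: Allow
                Action:
                  - s3:GetObject
                  - s3:PutObject
                  - s3:DeleteObject
                Resource: !Sub "${CacheBucket.Arn}/*"
```

Scoping the policy to a single bucket keeps the CI credentials from touching anything else in the account.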
Example setup
Below is a simple setup for caching to S3. I've also set up a GitHub repository with the full setup.
GitHub workflow
---
name: Build
on:
  push:
jobs:
  Build:
    runs-on: ubuntu-22.04
    steps:
      - uses: actions/checkout@v4
      - uses: docker/setup-buildx-action@v3
      - uses: docker/metadata-action@v5
        id: meta
        with:
          images: Build
      - name: Cache (S3)
        uses: tespkg/actions-cache@v1
        id: cache
        with:
          bucket: ${{ vars.CACHE_BUCKET }}
          accessKey: ${{ vars.AWS_ACCESS_KEY }}
          secretKey: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          ## Fallback to GitHub cache if saving / restoring from S3 fails
          use-fallback: true
          path: |
            pnpm
          key: pnpm-cache-${{ hashFiles('pnpm-lock.yaml') }}
          restore-keys: |
            pnpm-cache-
      - name: inject cache into docker
        uses: reproducible-containers/buildkit-cache-dance@v3.1.0
        with:
          cache-map: |
            {
              "pnpm": "/pnpm"
            }
          # Skip extraction if cache was hit to avoid unnecessary I/O. This can take minutes for projects with a lot of dependencies.
          skip-extraction: ${{ steps.cache.outputs.cache-hit }}
      - name: Build
        uses: docker/build-push-action@v5
        with:
          context: .
          cache-from: type=gha
          cache-to: type=gha,mode=max
          file: Dockerfile
          push: false
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
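If you'd rather not manage an S3 bucket, the S3 step in the workflow above could be swapped for actions/cache, as mentioned in the TL;DR. This is a sketch of that replacement step only, assuming the same pnpm directory and cache key; the step name is hypothetical and I haven't benchmarked this variant.

```yaml
      # Drop-in replacement for the "Cache (S3)" step; keeps the same id so that
      # steps.cache.outputs.cache-hit still resolves in the cache-dance step
      - name: Cache (GitHub)
        uses: actions/cache@v4
        id: cache
        with:
          path: pnpm
          key: pnpm-cache-${{ hashFiles('pnpm-lock.yaml') }}
          restore-keys: |
            pnpm-cache-
```

Note that with actions/cache, cache-hit is only true on an exact match of key. When an older cache is restored through restore-keys, cache-hit is false, so the extraction step still runs and the refreshed cache is saved under the new key.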
Dockerfile
Note that the target directory used when mounting the cache is the same as the directory specified in the cache-map provided to reproducible-containers/buildkit-cache-dance in the Workflow definition, since that's where the cache is injected to and extracted from.
FROM node:21-slim AS base
ENV PNPM_HOME="/pnpm"
RUN corepack enable
# Copy your application code
WORKDIR /app
COPY . .
# Log for troubleshooting. There should be files in the directory when there's a cache hit
RUN --mount=type=cache,target=${PNPM_HOME} echo "PNPM contents before install: $(ls -la ${PNPM_HOME})"
# This is where the magic happens! The cache has been mounted to `$PNPM_HOME` so it can be accessed during the build
RUN --mount=type=cache,target=${PNPM_HOME} \
    pnpm config set store-dir ${PNPM_HOME} && \
    pnpm install --frozen-lockfile --prefer-offline
# Another log for troubleshooting. This should never be empty since the NPM modules were installed before running this line
RUN --mount=type=cache,target=${PNPM_HOME} echo "PNPM contents after install: $(ls -la ${PNPM_HOME})"
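# Use the same image family as the build stage so natively compiled modules match the runtime (glibc vs. musl)
FROM node:21-slim AS prod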
WORKDIR /app
COPY --from=base /app/node_modules /app/node_modules
COPY --from=base /app .
CMD ["npm", "start"]
Cache in action
Cache miss example
Cache hit example
File saved to S3
Conclusion
Although caching NPM modules inside the Docker build worked for my use case, it might not be the best option for you. Because of the time it takes to download the cache, inject the files into the Docker image, and extract the files from the Docker image, this caching approach will likely not yield any performance improvements over just installing the modules without using a cache.
However, if you're looking to solve build failures due to something like compilation failures or NPM rate-limit issues, then caching is a viable solution.
Caveat
Like anything in software development, this approach is subject to becoming outdated. So look up the latest information on "preserving cache mounts in Docker" in case this has changed.