This post describes how to implement dependency manager caching for production Docker builds.
Context
Most guides and posts on the subject suggest leveraging Docker's image layer caching when building images.
It looks a little like this:
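FROM ruby
# Run the following steps from the directory the Gemfile is copied to
WORKDIR /app
# The trailing slash is required when the wildcard matches more than one file
COPY Gemfile* /app/
RUN bundle install
COPY . /app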
As long as there are no changes to the Gemfile, Docker reuses the cached layer for the bundle install command and quickly skips over this step.
The Dockerfile in the example is valid and works. However, there is a performance penalty when gems are added to or removed from the Gemfile: no matter how small the change, the entire cache for the step is invalidated. Downloading and installing all gems every time something changes can be especially tedious on slower machines.
I'm using bundler as an example but this problem exists for most dependency managers.
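To illustrate that the same pattern (and the same penalty) shows up elsewhere, here is a rough Yarn equivalent of the Bundler example. It is only a sketch: it assumes a project with a package.json and a yarn.lock in its root, and it uses the official node image, which ships with Yarn.

```dockerfile
# Hypothetical Node.js counterpart of the Bundler example above
FROM node
WORKDIR /app
# Copy only the dependency manifests so this layer changes only when they do
COPY package.json yarn.lock /app/
# Cached as a whole: any change to the manifests re-runs the full install
RUN yarn install
# Changes to application code do not invalidate the install layer above
COPY . /app
```

Adding a single package to package.json invalidates the yarn install layer, so every package is downloaded and installed again.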
Docker experimental feature
After some digging around, it appears there is an experimental Docker feature for this problem: cache mounts for RUN instructions (RUN --mount=type=cache).
It mounts a persistent directory at a given target, and that directory is attached again on each docker build. This is perfect for dependency managers, as they can reuse their already downloaded gems, node_modules or whatever.
This Stack Overflow answer also explains very well how it works.
As this is an experimental feature, it needs to be enabled for both the Docker daemon and the client. Search online for how to enable it on your system.
An extra line at the top of the Dockerfile is also needed to indicate that you are using experimental features.
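As a minimal sketch of where that extra line goes, the following Dockerfile does nothing useful except write to a cache mount. The /demo-cache path and the myapp tag are placeholders, and the DOCKER_BUILDKIT=1 environment variable is one way to switch on BuildKit for the client; your setup may need something different.

```dockerfile
# syntax = docker/dockerfile:experimental
# The directive above has to be the very first line of the file; placed
# after another comment or instruction it is treated as a plain comment.
#
# Assumption: BuildKit is switched on for the client, for example with
#   DOCKER_BUILDKIT=1 docker build -t myapp .
# (myapp is a placeholder tag)
FROM alpine
# /demo-cache is a hypothetical path; the directory persists between builds
RUN --mount=type=cache,target=/demo-cache \
    date >> /demo-cache/builds.log \
    && cat /demo-cache/builds.log
```

Each time this RUN step actually executes (that is, when its layer is not served from the layer cache), it appends a line to builds.log, so earlier entries showing up in the output indicate that the cache mount is working.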
Examples
Bundler
# syntax = docker/dockerfile:experimental
...
RUN --mount=target=/app/tmp/bundle,type=cache \
    bundle install --deployment --path /app/tmp/bundle --without development test \
    && cp -r tmp/bundle/ vendor
...
An interesting thing to note about this example is that Bundler installs all gems into /app/tmp/bundle, which is the cache mount, and the result is then copied to vendor.
This is because the cache is no longer available after the RUN instruction has finished, so anything you want to retain from the cache has to be copied from the cache into the image.
Our Rails application would not be able to run without its gems.
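The copy step can be seen in isolation with a small throwaway Dockerfile. This is a sketch under assumptions: /demo-cache and /kept are hypothetical paths, and alpine is used only because it is small.

```dockerfile
# syntax = docker/dockerfile:experimental
FROM alpine
# Files written into the cache mount survive between builds,
# but they are not stored in the image layer
RUN --mount=type=cache,target=/demo-cache \
    echo "cached" > /demo-cache/marker \
    && mkdir -p /kept \
    && cp /demo-cache/marker /kept/marker
# No mount on this step, so it only sees what was copied out of the cache
RUN test -f /kept/marker && ! test -f /demo-cache/marker
```

The second RUN succeeds only if the copied file is present and the file inside the cache mount is not, which mirrors why the gems have to be copied to vendor.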
Yarn
# syntax = docker/dockerfile:experimental
...
# Install yarn packages
RUN --mount=target=/app/node_modules,type=cache \
    yarn install
# Compile assets
RUN --mount=target=/app/node_modules,type=cache \
    bin/rails webpacker:compile
...
In this example we can run bin/rails webpacker:compile using the mounted node_modules cache because we don't need node_modules to run the app.
The nice thing about this is that only the compiled assets end up in the resulting image and not the node_modules themselves, which reduces the size of the final image.
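If your app does need node_modules at runtime, one possible variation is to put Yarn's download cache in the cache mount instead of node_modules. This is a sketch, not something from the setup above: /yarn-cache is a hypothetical path, and it relies on the --cache-folder option of Yarn 1 (classic).

```dockerfile
# syntax = docker/dockerfile:experimental
FROM node
WORKDIR /app
COPY package.json yarn.lock /app/
# Only Yarn's download cache lives in the mount (hypothetical path);
# node_modules is written to /app and therefore stays in the image
RUN --mount=type=cache,target=/yarn-cache \
    yarn install --cache-folder /yarn-cache
COPY . /app
```

Packages still have to be unpacked into node_modules on every install, but they should not need to be downloaded again, and no copy step is required afterwards.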
Working Dockerfile
# syntax = docker/dockerfile:experimental
# Stage: Builder
FROM ruby:2.6.5-alpine as Builder
ENV RAILS_ENV production
ENV RACK_ENV production
ENV NODE_ENV production
ENV GEM_HOME=/app/vendor/bundle/ruby/2.6.0
ENV GEM_PATH=$GEM_HOME:$GEM_PATH
ENV PATH=$GEM_HOME/bin:$PATH
ENV BUNDLE_APP_CONFIG=.bundle
RUN apk add --update --no-cache \
    build-base \
    postgresql-client \
    git \
    nodejs \
    yarn \
    tzdata
WORKDIR /app
# Add the Rails app
ADD . /app
# Install gems
RUN gem install bundler
RUN --mount=target=/app/tmp/bundle,type=cache \
    bundle install -j "$(getconf _NPROCESSORS_ONLN)" --retry 3 --deployment --path /app/tmp/bundle --without development test \
    && cp -r tmp/bundle/ vendor
RUN bundle config --local path vendor/bundle
# Install yarn packages
RUN --mount=target=/app/node_modules,type=cache \
    yarn install
# Compile assets
RUN --mount=target=/app/node_modules,type=cache \
    bin/rails webpacker:compile
# Stage: Final
FROM ruby:2.6.5-alpine
RUN apk add --update --no-cache \
    postgresql-client \
    tzdata
# Copy app with local gems and compiled assets from former build stage
COPY --from=Builder /app /app
ENV GEM_HOME=/app/vendor/bundle/ruby/2.6.0
ENV GEM_PATH=$GEM_HOME:$GEM_PATH
ENV PATH=$GEM_HOME/bin:$PATH
ENV BUNDLE_APP_CONFIG=.bundle
WORKDIR /app
# Expose Puma port
EXPOSE 3000
# Start up
CMD bundle exec puma -C config/puma.rb
Conclusion
This solution works well for speeding up Docker builds that use dependency managers, without having to resort to (slow) workarounds.
The downside, though, is that experimental features need to be enabled on the daemon and client, which will probably not always be possible.
I am by no means a Docker expert so if you have any thoughts / comments I'd love to hear them!