The key to building Docker images quickly, across CI providers like Google Cloud Build, is to make use of the previous build's layer cache. There is plenty of theory and best practice around structuring a `Dockerfile` to get as many cache hits as possible during a build. But in a CI environment, the layer cache also needs to be available to the build for that work to pay off.

In this post, we are going to focus on how to build a Docker image as quickly as possible in Cloud Build by leveraging layer caching. We will benchmark build performance with caching using the `docker` executor, the `kaniko` executor, and our own `depot` service.
## Building Docker images in Cloud Build
Getting an image built inside of Cloud Build can be done with a single step in a `cloudbuild.yml` file. Here is an example where we are building a Node application that has the following `Dockerfile`:
```dockerfile
FROM node:16 AS build
WORKDIR /app
COPY package.json yarn.lock tsconfig.json ./
COPY src/ ./src/
RUN yarn install --immutable
RUN yarn build

FROM node:16
WORKDIR /app
COPY --from=build /app/node_modules /app/node_modules
COPY --from=build /app/dist /app/dist
ENV NODE_ENV production
CMD ["node", "--enable-source-maps", "./dist/index.js"]
```
To build this image, we add a `cloudbuild.yml` file to the root of the repository with the following contents:
```yaml
steps:
  - name: gcr.io/cloud-builders/docker
    args:
      - build
      - -t
      - us-west1-docker.pkg.dev/$PROJECT_ID/depot-demo/demo:$COMMIT_SHA
      - -t
      - us-west1-docker.pkg.dev/$PROJECT_ID/depot-demo/demo:latest
      - .
images:
  - us-west1-docker.pkg.dev/$PROJECT_ID/depot-demo/demo:$COMMIT_SHA
  - us-west1-docker.pkg.dev/$PROJECT_ID/depot-demo/demo:latest
```
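If you aren't running this from a build trigger, a minimal sketch of submitting the build manually with the `gcloud` CLI (assuming the config is saved as `cloudbuild.yml` at the repo root):

```bash
# Submit the current directory as the build context to Cloud Build.
# --config is needed because the default filename is cloudbuild.yaml.
# $COMMIT_SHA is only populated automatically for triggered builds,
# so we pass a value explicitly for a manual run.
gcloud builds submit \
  --config cloudbuild.yml \
  --substitutions COMMIT_SHA="$(git rev-parse HEAD)" \
  .
```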
This configuration tells Cloud Build to build an image using the `docker` builder image with two different tags, one for the commit SHA and the other for `latest`. The `images` block then tells the build to push the resulting images to Artifact Registry. Running the build, we see the total build takes 1 minute and 40 seconds, with the image build portion taking ~78 seconds.

If you run the build a second time, you will notice that the image build again takes approximately 78 seconds. Why? Because we aren't doing anything to make use of the previous build's cache. We can add that by updating our `cloudbuild.yml` file to the following:
```yaml
steps:
  - name: gcr.io/cloud-builders/docker
    entrypoint: bash
    args:
      - -c
      - docker pull us-west1-docker.pkg.dev/$PROJECT_ID/depot-demo/demo:latest || exit 0
  - name: gcr.io/cloud-builders/docker
    args:
      - build
      - -t
      - us-west1-docker.pkg.dev/$PROJECT_ID/depot-demo/demo:$COMMIT_SHA
      - -t
      - us-west1-docker.pkg.dev/$PROJECT_ID/depot-demo/demo:latest
      - --cache-from
      - us-west1-docker.pkg.dev/$PROJECT_ID/depot-demo/demo:latest
      - .
images:
  - us-west1-docker.pkg.dev/$PROJECT_ID/depot-demo/demo:$COMMIT_SHA
  - us-west1-docker.pkg.dev/$PROJECT_ID/depot-demo/demo:latest
```
This is the easiest way to leverage the build cache of a previous build in Cloud Build. In the first step, we pull down the `latest` tag of the image we are building so that we can use it in the `--cache-from` flag in the second step. Because we tag every new image with the `latest` tag, this is what allows each build to utilize the layer cache from the previous one.
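For reference, those two steps are roughly equivalent to running the following `docker` commands yourself (the `IMAGE` variable is introduced here just for readability):

```bash
IMAGE=us-west1-docker.pkg.dev/$PROJECT_ID/depot-demo/demo

# Warm the local daemon with the previous image; don't fail on the
# very first build, when the latest tag doesn't exist yet.
docker pull $IMAGE:latest || true

# Build using the pulled image as a cache source.
docker build \
  -t $IMAGE:$COMMIT_SHA \
  -t $IMAGE:latest \
  --cache-from $IMAGE:latest \
  .
```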
So, what are the results now? The first build took about 78 seconds to build the image, and the second build takes ~38 seconds. A nice improvement, but if you look closely, something doesn't look right.

Did you spot it? We got the `docker build` portion down to 38 seconds, but the entire build still took ~78 seconds. Why is that? Well, it's because pulling the `latest` tag to use in `--cache-from` takes time: the image has to be transferred from the registry to your build before it can be used for caching. In this case, that took 25 seconds and negated any benefit we would have seen from the layer cache in total build time.
## Building Docker images in Cloud Build with Kaniko
`kaniko` is a tool that allows you to build container images inside Kubernetes without the need for the Docker daemon. Effectively, it allows you to build Docker images without `docker build`.

We can change our `cloudbuild.yml` file to use `kaniko` instead of the `docker` builder image. With the kaniko executor in Cloud Build, we can specify a `--cache` flag that stores our Docker layer cache in Container Registry. Here is the updated `cloudbuild.yml` file:
```yaml
steps:
  - name: gcr.io/kaniko-project/executor:latest
    args:
      - --destination=gcr.io/$PROJECT_ID/depot-demo/demo
      - --cache=true
      - --cache-ttl=24h
```
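By default, kaniko infers a cache repository from the destination (a `cache` repo alongside the image). If you'd rather control where the layer cache lives, kaniko also accepts a `--cache-repo` flag; here is a sketch with a hypothetical cache path:

```yaml
steps:
  - name: gcr.io/kaniko-project/executor:latest
    args:
      - --destination=gcr.io/$PROJECT_ID/depot-demo/demo
      - --cache=true
      - --cache-ttl=24h
      # Hypothetical repository for cached layers; any registry
      # path the build can push to works here.
      - --cache-repo=gcr.io/$PROJECT_ID/depot-demo/demo-cache
```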
If we run a build with this configuration, the entire build takes 2 minutes and 30 seconds, with the image build portion taking 2 minutes and 19 seconds of that. That's not ideal, but maybe build performance will be better for the next build, because we can make use of the layer cache via kaniko. Let's run the build again and see what happens.

On the second run, the image build is now ~69 seconds and the entire build is 79 seconds. That's an improvement over the previous run because we get to make use of caching, but we aren't seeing any improvement over our Docker builder approach. In fact, the total time is effectively the same and the image build is slower. To recap, here are the results we have seen so far:
|  | total time | image build time |
| --- | --- | --- |
| with docker builder (no cache) | 100s | 78s |
| with docker builder (with cache) | 78s | 38s |
| with kaniko builder (no cache) | 150s | 139s |
| with kaniko builder (with cache) | 79s | 69s |
## Faster Docker image builds in Cloud Build with Depot
We've observed that using the Docker layer cache across builds speeds up build times significantly. But, as we saw, the current approach for doing that in Cloud Build can negate any performance gains because of network latency. The image build might take 38 seconds, but the entire build still takes a total of 78 seconds to complete, because it takes another 25 seconds to pull down the `latest` tag to use for caching.

What if we could make use of the layer cache without the network penalty? That is where Depot comes in. Depot provides remote container builders on cloud VMs. They come with more resources, 4 CPUs and 8 GB of memory, as well as a 50 GB persistent SSD cache. A large, fast, persistent disk allows the layer cache to be shared across builds automatically, without spending any time transferring it for the build.

We can use Depot to build our image in Cloud Build by using the `depot` builder image. Here is the updated `cloudbuild.yml` file:
```yaml
steps:
  - id: Build with Depot
    name: ghcr.io/depot/cli:latest
    args:
      - build
      - --project
      - <your-depot-project-id>
      - -t
      - us-west1-docker.pkg.dev/$PROJECT_ID/depot-demo/demo:$COMMIT_SHA
      - .
      - --push
    env:
      - DEPOT_TOKEN=${_DEPOT_TOKEN}
```
This configuration uses the `depot` builder image to build the image. The `--project` flag routes your build to your Depot project and the remote builders that back it, using the `DEPOT_TOKEN` environment variable to authenticate the build to your project. Note that the token used here is a project token, which can be created under your project settings.
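`_DEPOT_TOKEN` is a user-defined substitution, so it can be set on your build trigger or passed on the command line for a manual build. A minimal sketch with the `gcloud` CLI (the token value is a placeholder):

```bash
# Pass the Depot project token into the build as the _DEPOT_TOKEN substitution.
gcloud builds submit \
  --config cloudbuild.yml \
  --substitutions _DEPOT_TOKEN=<your-depot-project-token> \
  .
```

For anything beyond a demo, Cloud Build's Secret Manager integration is a better home for the token than a plain substitution.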
Running our first build with this configuration, we can see that the build is uncached and takes a total of 73 seconds to complete, with 64 seconds of that being the image build. Looking at total build time, this is already faster than any of the previous options. Let's run a second build that leverages the persistent SSD cache on the remote builders. We don't have to make any changes to use the layer cache; it's already on the remote builder.

The entire build took 28 seconds, and the image build portion was 15 seconds. Depot is over 2x faster at building Docker images inside of Google Cloud Build than any of the previous options we looked at:
|  | total time | image build time |
| --- | --- | --- |
| with docker builder (no cache) | 100s | 78s |
| with docker builder (with cache) | 78s | 38s |
| with kaniko builder (no cache) | 150s | 139s |
| with kaniko builder (with cache) | 79s | 69s |
| with depot builder (no cache) | 73s | 64s |
| with depot builder (with cache) | 28s | 15s |
## Conclusion
In this post, we have seen a variety of ways to build Docker images in Google Cloud Build. A fundamental component of making image builds fast is making use of the layer cache from previous builds.

As we saw, with `docker` we can use the layer cache by pulling down the previous image and using it as a cache source. This produced an image build that was twice as fast, but the total build time remained roughly the same because of the network penalty of pulling down the previous image.

With `kaniko` we got the ability to use the layer cache by persisting it to Container Registry. But the image build, cached or uncached, wasn't any faster than using the `docker` builder image. Kaniko caching is slower because it snapshots the filesystem after each layer, and there is still a network penalty paid to transfer the layer cache from Container Registry.

With `depot` we get the ability to use the native Docker layer cache without the network penalty. The image build is over 2x faster than the `docker` approach and almost 3x faster than the `kaniko` approach. The layer cache is persisted to a fast SSD on the remote builders, allowing subsequent builds to be faster by using it automatically.
Are you interested in trying Depot for your own projects? We offer a 14-day trial! Get started for free 🎉
## Top comments (3)
This link seems dead - dev.to/blog/fast-dockerfiles-theor... - or are there typos?
Using the cache is great for improving build time, but then it won't use the last updated base image. Do you have any solution to ensure we get those updates when available/needed?
Awesome question! It sounds like you're referring to things like `FROM ubuntu:latest`, where you want to make sure you get the latest base image. The `latest` image will still be pulled in a build if it has changed, and then the layer cache will go to work for subsequent steps.
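Here is an example `Dockerfile` (a minimal sketch, assuming a `node:16` base image like the one in the post):

```dockerfile
# The node:16 tag is re-resolved against the registry on every build, so a
# newer base image is pulled automatically whenever one has been published.
FROM node:16
WORKDIR /app

# These layers only rebuild when their inputs change; with an unchanged
# base image they come straight from the layer cache.
COPY package.json yarn.lock ./
RUN yarn install
COPY . .
CMD ["node", "index.js"]
```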
If I run that build with `depot`, the key bit at the start of the build is the `load metadata for docker.io/library/node:16` step, which is followed by a `FROM` pinned to the SHA digest of that image tag. If we change the `FROM` from `node:16` to `node:16.8.1`, you can see that the new base image gets pulled, but the layer cache still takes effect.

Essentially, the Docker layer cache is formed by the `ADD`, `RUN`, and `COPY` statements, so the latest base images are always pulled if they have changed.