DEV Community

Ivan Starkov

Use docker image tagging to speed up builds

From the beginning, we deployed a review environment for every commit of every pull request.
Initially that meant two builds/deploys, so there was no need to think about caching.
Over 4 years this has grown to 15+ builds/deploys, and the build step alone now takes 20+ minutes.

After some reading we came across docker buildx and decided to use two of the features it provides,
namely cache-from and cache-to.

The idea was to generate a checksum for every subproject and use it as the cache-from source and cache-to destination,
as described here.
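
As a rough illustration of that idea, here is a minimal sketch that derives both cache flags from one checksum. The helper name cache_flags, the "cache-" tag prefix and the choice of the registry cache backend with mode=max are assumptions made for this example, not necessarily what we ran in CI.

```shell
# Hypothetical helper: prints buildx cache flags that point at a
# checksum-specific cache reference in the registry.
#   $1 = image name, $2 = workspace checksum
cache_flags() {
  ref="$1:cache-$2"   # the "cache-" prefix is an assumption for this sketch
  printf '%s %s\n' \
    "--cache-from=type=registry,ref=${ref}" \
    "--cache-to=type=registry,ref=${ref},mode=max"
}

# Possible usage (not executed here):
#   docker buildx build . $(cache_flags "${IMAGE_NAME}" "${CHECKSUM}") \
#     --tag="${IMAGE_NAME}:${SHORT_SHA}" --push
```

With this layout every distinct checksum gets its own cache reference, so a build has to download that cache before it can reuse any layer.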

In our case it worked but did not give a significant advantage. With so many different builds, the time to download all the cache was comparable to the build time itself.

Also, since we use a yarn-based monorepo, we had to create an internal prune util to get reusable docker layers. It is not that simple and adds extra steps to the build. (You can read about prune at the amazing turborepo.)

Our builds are very simple, fairly fast, and consist of a single step. There are just a lot of them, so caching did not give us the significant advantage we expected.

We decided not to use the cache, but to take advantage of the fact that a developer only works on one or two projects in one pull request. So we don't need to rebuild everything, but only the projects that have changed.

And docker lets us do everything we need:

  • A single docker image can have multiple tags.
  • We can add a tag to a docker image without pulling it, and gcloud even has a ready-made command for this: gcloud container images add-tag.
  • We already had a workspace checksum util from our previous iteration with cache layers.
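
Our checksum util is internal and not shown in this post. As a minimal sketch of what such a util might do, the hypothetical function below hashes the relative path and content of every file in a workspace directory, skipping node_modules. It assumes sha256sum plus find and sort with NUL-separator support (GNU tools), and it ignores dependencies on other workspaces, which a real monorepo util would have to include.

```shell
# Hypothetical stand-in for a workspace checksum util (not our real one).
#   $1 = workspace directory
# Prints the first 16 hex characters of a hash over all file paths and contents.
workspace_checksum() {
  (
    cd "$1" || exit 1
    # Relative paths keep the result independent of where the repo is checked out
    find . -type f ! -path '*/node_modules/*' -print0 \
      | LC_ALL=C sort -z \
      | xargs -0 sha256sum \
      | sha256sum \
      | cut -c1-16
  )
}
```

The fixed sort order keeps the result stable between runs, so an unchanged workspace always maps to the same tag.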

Now, in pseudocode, our build process for each subproject (yarn workspace) on every commit looks like this:

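# Checksum of the workspace sources (generate-workspace-checksum is our internal util)
CHECKSUM=$(generate-workspace-checksum "${WORKSPACE_NAME}")
# If an image for this checksum already exists, only add the commit tag to it
if gcloud container images describe "${IMAGE_NAME}:${CHECKSUM}"; then
  gcloud container images add-tag --quiet "${IMAGE_NAME}:${CHECKSUM}" "${IMAGE_NAME}:${SHORT_SHA}"
else
  # Otherwise build the image and push it with both tags
  # (BUILD_ARGS is left unquoted on purpose so it splits into separate flags)
  docker buildx build . \
    --file=universal.Dockerfile \
    --tag="${IMAGE_NAME}:${SHORT_SHA}" \
    --tag="${IMAGE_NAME}:${CHECKSUM}" \
    --progress=plain ${BUILD_ARGS} \
    --push
fi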

We generate the workspace checksum and check whether an image tagged with that checksum exists. If it does, we add the commit sha tag to the existing image, which is a really fast operation.
If the image does not exist, we rebuild the workspace and tag the image with two tags: the checksum and the commit sha.

Deploys were already fast and ran in parallel, so in our case it doesn't matter much whether we do 2 or 15 deploys.

The trick above allowed us to significantly reduce build times, sometimes to seconds (for example, when only the documentation has changed) instead of 20+ minutes for every commit.

For projects with multi-step builds, various dependency installs, etc., the solution above might not work, and caching would be the better choice. For us, plain image retagging works best: it removes the need for external KV storage (sha => checksum), significantly reduces build times, and simplifies the builds.
