DEV Community

Valentin Sawadski (he/him)

Posted on

Do you use caching in your CI/CD pipelines?

According to this study, only ~20% of GitHub Actions use caching to speed up builds! 🤯

Only 1729 out of 7962 projects use caching

That seems incredibly low and wasteful to me, so I was wondering: how many of you don't use caching in your projects, and what are the reasons why not?

Discussion (12)

Austin S. Hemmelgarn

I do in some cases, but not in others.

If it's something like caching dependencies, and the dependency management tooling includes strict versioning (for example, using npm and making use of package-lock.json instead of just package.json), then yes, I generally do use caching.
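A minimal sketch of why a lockfile makes this safe, assuming the cache is keyed on the lockfile's content: any change to the pinned versions produces a new key, so a stale cache is never restored. The function name `cache_key` and the `npm` prefix are hypothetical, not part of any CI service's API.

```python
import hashlib
from pathlib import Path


def cache_key(lockfile: Path, prefix: str = "npm") -> str:
    """Return a cache key that changes whenever the lockfile content changes."""
    # Hash the exact bytes of the lockfile; identical pins give an identical key.
    digest = hashlib.sha256(lockfile.read_bytes()).hexdigest()
    # A shortened digest keeps the key readable (the length is an arbitrary choice).
    return f"{prefix}-{digest[:16]}"
```

Keying on package.json alone would not give the same guarantee, because a version range there can resolve to different packages over time.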

OTOH, if there is no way to determine automatically that the cache should be dropped, I tend to avoid caching. Without automatic cache invalidation, you end up in a situation where you can't be sure that the CI is actually testing the correct thing, or that any failures are true failures. This is a known and well-established issue with caching in build environments (for example, the standard advice when you run into a build issue while using ccache has always been to manually drop the cache and rebuild).

Valentin Sawadski (he/him) Author

To be honest I was only thinking of npm and Docker build caches, things where I never really ran into any problems with invalid caches. But I see your point.

What is the main reason you invest in caching? Speed or Cost?

Austin S. Hemmelgarn

To be honest I was only thinking of npm and Docker build caches, things where I never really ran into any problems with invalid caches. But I see your point.

Actually, Docker build caching is one potential source of issues, because you can only safely use caches if all parts of your build process (and all underlying images) are deterministic and independent of all external resources. For example, if you are building something off of an Ubuntu base image and running apt-get update && apt-get upgrade -y as part of the build, you can't safely use Docker's built-in cache because of how it handles invalidation. Put differently, if building that same Docker image with a clean cache at two different points in time can give you two different images, you can't safely use caching.
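The invalidation problem can be illustrated with a toy model. This is a sketch, not Docker's actual implementation: it assumes a cache that keys a RUN step only on the parent layer and the instruction text. The `LayerCache` class, the `mirror` dictionary, and the version numbers are all hypothetical.

```python
import hashlib


class LayerCache:
    """Toy instruction-keyed build cache (an illustration, not Docker's code)."""

    def __init__(self):
        self._layers = {}

    def build(self, parent_id, instruction, run):
        # The key depends only on the parent layer and the instruction text,
        # not on what the instruction would download if it ran today.
        key = hashlib.sha256(f"{parent_id}\n{instruction}".encode()).hexdigest()
        if key not in self._layers:
            self._layers[key] = run()
        return self._layers[key]


# Hypothetical stand-in for a package mirror whose contents change over time.
mirror = {"openssl": "3.0.1"}


def apt_upgrade():
    # Snapshot of whatever the mirror serves at the moment the step runs.
    return dict(mirror)


cache = LayerCache()
instruction = "RUN apt-get update && apt-get upgrade -y"
first = cache.build("ubuntu-base", instruction, apt_upgrade)

mirror["openssl"] = "3.0.2"  # upstream publishes an update

cached = cache.build("ubuntu-base", instruction, apt_upgrade)  # cache hit, old packages
clean = LayerCache().build("ubuntu-base", instruction, apt_upgrade)  # fresh build, new packages
```

In this model the cached build keeps the old package set while a clean build picks up the new one, which is the divergence described above.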

What is the main reason you invest in caching? Speed or Cost?

Speed primarily, because the difference can be huge, and none of the build infrastructure I use charges for time.

Valentin Sawadski (he/him) Author

True, you should always use explicit versioning in your Dockerfile, to have reproducible builds.

But even without it, caching should work and speed things up, and the build should keep working for as long as you keep the cache. At least I don't see a reason why caching unversioned commands should break the build.

I do see the point that, if the cache gets dropped for whatever reason, you may end up with a different build, as different versions of the dependencies may get installed (which you probably don't want if you aim for reproducible builds).

Austin S. Hemmelgarn

The flip side though is that sometimes you want (or even need) to always be using the latest versions of dependencies. For example, where I work we use Docker as part of our process of building native DEB/RPM packages for various distros (because it lets us make the process trivially portable), and in that case, we always want to be building against whatever the latest versions of our dependencies are so that the resulting package installs correctly.

In such a situation, caching the Docker build results can cause that requirement for tracking the latest dependency versions to be violated.

Valentin Sawadski (he/him) Author

I personally set up caching only for pipelines that take "too long" (more than 15 minutes).

I think it's mostly because the next feature is always more important than tweaking the CI/CD.

But thinking about it, this feels like "skipping TDD". In the end, the benefits in terms of cost savings and developer productivity seem to outweigh the couple of minutes it takes to set up caching and other improvements (I'm looking at you, Dockerfile 👀).

codewander

Some caches can be slow to upload and download from, resulting in slower overall build time. Also, each caching step doubles the number of conditional branches that you need to verify after making a change to your CI script.

I would try caching in each case and measure difference in build time.

I have been leaning more towards having developers run tests locally, produce a text file with a hash based on the source files of the last test run, commit the text file into the repo, and have the PR builds just verify that the last test run matches the source code. Then have the PR merge build do a full build as a last-minute check of reproducibility.
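One way this "proof of last test run" idea could look, as a rough sketch under stated assumptions: the helper names, the `**/*.py` pattern, and the proof file are all hypothetical, and nothing here prevents someone from editing the proof file by hand.

```python
import hashlib
from pathlib import Path


def source_hash(root: Path, pattern: str = "**/*.py") -> str:
    """Hash the relative paths and contents of all matching files, in sorted order."""
    h = hashlib.sha256()
    for path in sorted(root.glob(pattern)):
        if path.is_file():
            # Include the path so that renaming a file also changes the hash.
            h.update(path.relative_to(root).as_posix().encode())
            h.update(b"\0")
            h.update(path.read_bytes())
            h.update(b"\0")
    return h.hexdigest()


def record_test_run(root: Path, proof: Path) -> None:
    """Run locally after the tests pass; the proof file is then committed."""
    proof.write_text(source_hash(root) + "\n")


def verify_test_run(root: Path, proof: Path) -> bool:
    """Run in the PR build: True if the sources match the recorded test run."""
    return proof.exists() and proof.read_text().strip() == source_hash(root)
```

The PR build then only needs to call `verify_test_run`, which is cheap compared with running the test suite.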

Valentin Sawadski (he/him) Author

Yeah, I've noticed that getting and updating the cache can take 1-2 minutes, depending on the size of the cache. But given a fairly large list of dependencies, it's almost always worth it.
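A back-of-the-envelope way to check the "worth it" claim, as a sketch with made-up numbers rather than measurements: assume a cache hit costs only the restore time, and a miss costs a full install plus the time to save the new cache.

```python
def net_saving_seconds(install_s, restore_s, save_s, hit_rate):
    """Expected seconds saved per build by caching; negative means caching is slower."""
    # Hit: only restore the cache. Miss: install from scratch, then upload the cache.
    expected_with_cache = hit_rate * restore_s + (1 - hit_rate) * (install_s + save_s)
    return install_s - expected_with_cache


# Hypothetical inputs: a 5-minute install, 1 minute each to restore and save.
large_project = net_saving_seconds(install_s=300, restore_s=60, save_s=60, hit_rate=0.75)

# Hypothetical inputs: the install is no slower than restoring the cache.
small_project = net_saving_seconds(install_s=60, restore_s=60, save_s=60, hit_rate=0.5)
```

With these assumed inputs, the first case saves time on average and the second loses time, which is why measuring per project is still worthwhile.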

As for changes in CI scripts: can't you set up the cache to revalidate when it detects changes in key files, such as package.json?

The idea of running tests locally and uploading proof is interesting. But I'd be worried about having uncontrollable environments and side effects. Centralized CI allows for a "source of truth". Or how do you manage your developer environments?

codewander

Centralized CI allows for a "source of truth". Or how do you manage your developer environments?

I suspect most people use CI just to ensure all devs are running the tests... For that case, I would just use the check I mentioned, along with scripts that encourage running tests locally with minimum friction. As mentioned, I still would use CI as a final check after PR review is complete, but not on each change in a PR.

The next level is people trying to ensure the tests run from a clean state. You can also encourage this with scripts to clean local environment state. Generally, most languages have some sort of sandbox or dependency management, so builds are relatively isolated and reproducible.

The next level would be people ensuring the build works on multiple operating systems or on a blessed operating system. At that point, CI can help if devs use macOS and the blessed operating system is Linux.

A final level is when people want to speed up their integration tests by splitting them and running them in parallel. For this, I would use CI.

Valentin Sawadski (he/him) Author

As mentioned, I still would use CI as a final check after PR review is complete, but not on each change in a PR.

For me it's the other way around: I want automatic tests to pass before I review the PR, because developer time is more valuable than CI time.

You can also encourage this with scripts to clean local environment state.

Sure, you can clean the git state and the build directory; however, developers could still be running different compiler versions or different hardware architectures (Intel vs. Apple Silicon), which makes it very hard, if not impossible, to have reproducible builds and tests. That is why I prefer having a fast CI setup as gatekeeper and never release anything that has not been built by the CI server.

codewander

having a fast CI setup as a gatekeeper

I just vote for fast local tests (I would probably wrap git push with an alias that ran unit tests && git push instead and ask everyone to use the alias), along with a final check before merging. If you keep thinking through all the scenarios that make CI tests slower than local dev, you will ultimately notice that there are multiple scenarios where CI will always become the bottleneck in your code workflow. One scenario is when you modify the version of one of your external dependencies. The tests will run quickly locally, but CI will have a busted cache and need to fetch all the dependencies again and possibly upload the new set to a cache. This will happen on a regular basis. Developers will complain that CI can become slow sometimes.
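The "unit tests && git push" alias could be approximated by a small wrapper like the following sketch. The function name `run_tests_then_push` is hypothetical, and the commands are passed in by the caller because the right test runner depends on the project.

```python
import subprocess


def run_tests_then_push(test_cmd, push_cmd):
    """Run the tests; run the push only if they exit with status 0.

    Returns the exit code of whichever command ran last.
    """
    tests = subprocess.run(test_cmd)
    if tests.returncode != 0:
        # Same short-circuit behavior as `tests && git push` in a shell.
        return tests.returncode
    return subprocess.run(push_cmd).returncode
```

A team could call it with something like `["pytest"]` and `["git", "push"]` (both assumed here) from a script that the git alias points to.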

codewander

re validate when it detects changes in key files

I just meant that you have to verify this logic works as you expect, which usually means running the CI build twice to try out both branches (cache hit and cache miss).