DEV Community

Matt Pocock

How to cache node_modules in GitHub Actions with Yarn

The Problem

I run a small team working on a growing monorepo. On every commit, CI checks run against the entire codebase in GitHub Actions. The checks were taking ~8 minutes to complete, and we wanted them to run faster.

We use Yarn workspaces to manage dependencies, so a single yarn install at the root is enough to install the dependencies for all clients.

Trouble is, this yarn install was taking ~4.5 minutes on CI. On my local machine, where node_modules is already on disk, it can take as little as 5 seconds. I wanted the CI install to get closer to that.

The first thing I tried

GitHub Actions recommends that you cache Yarn's cache directory. You end up with two steps that look like this:

- name: Get yarn cache directory path
  id: yarn-cache-dir-path
  # Write the step output via $GITHUB_OUTPUT; the older ::set-output command is deprecated
  run: echo "dir=$(yarn cache dir)" >> "$GITHUB_OUTPUT"

- uses: actions/cache@v4
  id: yarn-cache # use this to check for `cache-hit` (`steps.yarn-cache.outputs.cache-hit != 'true'`)
  with:
    path: ${{ steps.yarn-cache-dir-path.outputs.dir }}
    key: ${{ runner.os }}-yarn-${{ hashFiles('**/yarn.lock') }}
    restore-keys: |
      ${{ runner.os }}-yarn-

The first step reads Yarn's cache directory path and exposes it as a step output. The second step looks for a cache entry matching the key and, if it finds one, restores it into that directory.

This sped things up a little, but it didn’t reach the heights I was hoping for.

The Solution

Instead of caching the yarn cache, you should cache your node_modules.

- uses: actions/cache@v4
  with:
    path: '**/node_modules'
    key: ${{ runner.os }}-modules-${{ hashFiles('**/yarn.lock') }}

This caches all of your node_modules folders throughout your repository, and busts the cache every time a yarn.lock file changes.

This works for our monorepo, and it should work for single-folder projects too.

This took our install step from ~4.5 minutes to ~30 seconds.
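
One way to confirm that the cache is actually being restored is to give the cache step an id and print its cache-hit output before installing. This is a minimal sketch, not part of the original workflow: the id modules-cache and the reporting step are illustrative names, and actions/cache@v4 is assumed as the current major version.

```yaml
- uses: actions/cache@v4
  id: modules-cache # hypothetical id, used only to read this step's outputs below
  with:
    path: '**/node_modules'
    key: ${{ runner.os }}-modules-${{ hashFiles('**/yarn.lock') }}

# cache-hit is 'true' only when the exact key above was found and restored
- name: Report cache status
  run: echo "node_modules cache hit: ${{ steps.modules-cache.outputs.cache-hit }}"
```

If this step never reports a hit on repeated runs with an unchanged yarn.lock, the cache is probably not being saved at the end of the job.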

The Full Snippet

name: Automated Tests and Linting

on: [push]

jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: actions/cache@v4
        with:
          path: '**/node_modules'
          key: ${{ runner.os }}-modules-${{ hashFiles('**/yarn.lock') }}

      - name: Install packages
        run: yarn install

      - name: Autogenerate GraphQL
        run: yarn codegen

      - name: Run Typescript Checks
        run: yarn lint

      - name: Run Tests
        run: yarn test:ci

Discussion (9)

Nathan • Edited

Great idea.

After implementing your second solution, I noticed the dependency installation was not getting any shorter.
I ran the CI a few times with no changes. The cache step took 0-1s, and the subsequent yarn install step always took as long as it used to.

Maybe I'm reaching the limits of cache space? Unlikely as my dependency installation time is much smaller than your 4 minutes.

However, your first solution has worked for me, though as you mentioned the savings are minor.

Is there anything special you also needed to do alongside solution #2 to make it work?

Thank you,

Chad Lavimoniere

You all should refer to this example:

github.com/actions/cache/blob/main...

You're still getting a yarn install because you never check whether the cache hits. Honestly I don't know how the OP ever got this to work in the first place.

The whole point here is you need to create the cache, then see if it hits, and use whether or not it hits to determine whether or not you should take a given action.
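
The pattern described here can be sketched roughly as follows. It is an illustration under assumptions rather than the workflow from the post: the step id modules-cache is hypothetical, and actions/cache@v4 is assumed as the current major version.

```yaml
- uses: actions/cache@v4
  id: modules-cache # hypothetical id; the install step below refers to it
  with:
    path: '**/node_modules'
    key: ${{ runner.os }}-modules-${{ hashFiles('**/yarn.lock') }}

- name: Install packages
  # Run the install only when the exact key was not restored
  if: steps.modules-cache.outputs.cache-hit != 'true'
  run: yarn install
```

Skipping the install entirely is only safe when later steps need nothing from yarn install beyond the cached node_modules folders.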

Matt Pocock Author • Edited

Yes, it's possible you're reaching the limit of your cache space. There is a final step at the end of each GitHub Actions job where it saves the cache. Is this step passing?

Matt Pocock Author

Also, is your repo a monorepo or a single project repo?

Vashchuk Maksim

On a second run I got a message:

Cache restored from key: node_modules-7895c53553e57f00e1fd69b577ee8f44d76199b1c231a85c936c5886d7416416

but the next step (yarn install) still takes the same time (almost 2 minutes), and it looks like it installs everything from scratch.

Why could this happen?

Walter Zalazar • Edited

The post is missing the if condition, like this:

- name: Install dependencies
  if: steps.yarn-cache.outputs.cache-hit != 'true'
  run: yarn install

Elie

This item may help people the most:
stackoverflow.com/a/62244232/2602771

Bruno Fernandes • Edited

Thank you.

Just to add that -- unlike the other responses -- caching the node_modules folder as illustrated seems sufficient for me.
I find that i) the cache is restored ii) yarn install completes rapidly.
Example: github.com/bfdes/bfdes.in/runs/131...

Markus Maga

Recommend taking a look at github.com/bahmutov/npm-install :) tried to make this effortless!