DEV Community

Matt Pocock

How to cache node_modules in GitHub Actions with Yarn

The Problem

I run a small team working on a growing monorepo. On every commit, GitHub Actions runs CI checks against the entire codebase. The checks were taking ~8 minutes to complete, and we wanted them to run faster.

We use Yarn workspaces to manage dependencies, so a single yarn install at the root is enough to install the dependencies for all clients.

Trouble is, this yarn install was taking ~4.5 minutes in CI. On my local machine, where node_modules is already installed, it can take as little as 5 seconds. I wanted to speed up the CI.

The first thing I tried

GitHub Actions recommends that you cache Yarn's cache directory. That gives you two steps that look like this:

- name: Get yarn cache directory path
  id: yarn-cache-dir-path
  run: echo "dir=$(yarn cache dir)" >> "$GITHUB_OUTPUT"

- uses: actions/cache@v4
  id: yarn-cache # use this to check for `cache-hit` (`steps.yarn-cache.outputs.cache-hit != 'true'`)
  with:
    path: ${{ steps.yarn-cache-dir-path.outputs.dir }}
    key: ${{ runner.os }}-yarn-${{ hashFiles('**/yarn.lock') }}
    restore-keys: |
      ${{ runner.os }}-yarn-

The first step looks up the Yarn cache directory path and saves it as a step output. The second step looks for a matching cache entry and restores it to that path.

This sped things up a little, but it didn’t reach the heights I was hoping for.

The Solution

Instead of caching the yarn cache, you should cache your node_modules.

- uses: actions/cache@v4
  with:
    path: '**/node_modules'
    key: ${{ runner.os }}-modules-${{ hashFiles('**/yarn.lock') }}

This caches all of your node_modules folders throughout your repository, and busts the cache every time a yarn.lock file changes.

This works for our monorepo, and it should work for single-folder projects too.

This took our install step from ~4.5 minutes to ~30 seconds.

The Full Snippet

name: Automated Tests and Linting

on: [push]

jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v1

      - uses: actions/cache@v4
        with:
          path: '**/node_modules'
          key: ${{ runner.os }}-modules-${{ hashFiles('**/yarn.lock') }}

      - name: Install packages
        run: yarn install

      - name: Autogenerate GraphQL
        run: yarn codegen

      - name: Run Typescript Checks
        run: yarn lint

      - name: Run Tests
        run: yarn test:ci

Top comments (17)

Vamsi Ampolu • Edited

I think we can just use the setup-node action:

- name: Use Node.js ${{ matrix.node-version }}
  uses: actions/setup-node@v2
  with:
    node-version: ${{ matrix.node-version }}
    cache: 'yarn'

The docs say that it uses the cache action underneath and caches the yarn dependencies. See github.com/actions/setup-node/blob...

Sergey Khval

it doesn't cache node_modules, only global cache data
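
To make that distinction concrete, here is a rough sketch of how the two caches could be layered in one job: setup-node restores Yarn's global cache, while a separate actions/cache step restores the node_modules folders. The action versions and the Node version below are assumptions for illustration, not something taken from the post.

```yaml
steps:
  - uses: actions/checkout@v4

  # Restores Yarn's global package cache (downloaded packages), not node_modules
  - uses: actions/setup-node@v4
    with:
      node-version: 20 # assumed version, adjust to your project
      cache: 'yarn'

  # Separately restores the installed node_modules folders, as in the post
  - uses: actions/cache@v4
    with:
      path: '**/node_modules'
      key: ${{ runner.os }}-modules-${{ hashFiles('**/yarn.lock') }}

  - name: Install packages
    run: yarn install
```

With both in place, yarn install can reuse node_modules on an exact key match and fall back to the global cache when the lockfile has changed.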

Fabiana Asara • Edited

I just want to add a little note to this. According to the cache action docs, caching node_modules is not recommended.

I'm saying this because I'm trying to speed up our workflow too and I haven't solved this problem yet other than caching npm.

Thijs de Vries

I've seen people add the node version as part of the key to mitigate the node version not matching. If you do that and don't use npm ci you are probably safe.
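
A minimal sketch of that idea, assuming Node.js is already set up on the runner: one step records the output of node --version, and the cache key includes it, so a Node upgrade produces a new key instead of restoring modules built for the old version. The step id node-version and the key layout are hypothetical names chosen for this example.

```yaml
# Record the exact Node.js version so it can become part of the cache key
- name: Get Node.js version
  id: node-version # hypothetical id, used in the key below
  run: echo "version=$(node --version)" >> "$GITHUB_OUTPUT"

- uses: actions/cache@v4
  with:
    path: '**/node_modules'
    # Key changes when the OS, the Node.js version, or any yarn.lock changes
    key: ${{ runner.os }}-node-${{ steps.node-version.outputs.version }}-modules-${{ hashFiles('**/yarn.lock') }}
```

If the job already uses a version matrix, using matrix.node-version in the key would likely work just as well and saves the extra step.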

Nathan • Edited

Great idea.

After implementing your second solution, I noticed the dependency installation was not being shortened.
I ran the CI a few times with no changes. The cache step was taking 0-1s and the subsequent yarn install step always took the same amount of time it used to.

Maybe I'm reaching the limits of cache space? Unlikely as my dependency installation time is much smaller than your 4 minutes.

However your first solution has worked for me, but as you mentioned the savings are minor.

Is there anything special you also needed to do alongside solution #2 to make it work?

Thank you,

Chad Lavimoniere

You all should refer to this example:

github.com/actions/cache/blob/main...

you're still getting a yarn install because you're never checking whether the cache hits. Honestly I don't know how the OP ever got this to work in the first place.

The whole point here is you need to create the cache, then see if it hits, and use whether or not it hits to determine whether or not you should take a given action.
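
The pattern described here could be sketched roughly as follows. The cache step needs an id so that a later step can read its cache-hit output; the id modules-cache is a hypothetical name, and the action version is an assumption.

```yaml
- uses: actions/cache@v4
  id: modules-cache # hypothetical id, referenced by the `if` below
  with:
    path: '**/node_modules'
    key: ${{ runner.os }}-modules-${{ hashFiles('**/yarn.lock') }}

# cache-hit is 'true' only when the key matched exactly,
# so the install is skipped only for an unchanged yarn.lock
- name: Install packages
  if: steps.modules-cache.outputs.cache-hit != 'true'
  run: yarn install
```

One trade-off to keep in mind: when the install step is skipped, anything yarn install would normally trigger, such as postinstall scripts, does not run either.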

Matt Pocock • Edited

Yes, it's possible you're reaching the end of your cached space. There is a final step at the end of each GH action where it saves the cache. Is this step passing?

Matt Pocock

Also, is your repo a monorepo or a single project repo?

Vashchuk Maksim

On a second run I got a message:

Cache restored from key: node_modules-7895c53553e57f00e1fd69b577ee8f44d76199b1c231a85c936c5886d7416416

but the next step (yarn install) still takes the same time (almost 2 minutes) and looks like it installs everything from scratch.

Why could this happen?

Walter Zalazar • Edited

The post is missing the if condition, like this:

- name: Install dependencies
  if: steps.yarn-cache.outputs.cache-hit != 'true'
  run: yarn install
 
Md Irshad

Yes, the docs also mention doing it like this.

Sam Hulick

@mpocock1 Could you please update your post with this important step?

Bruno Fernandes • Edited

Thank you.

Just to add that -- unlike the other responses -- caching the node_modules folder as illustrated seems sufficient for me.
I find that i) the cache is restored ii) yarn install completes rapidly.
Example: github.com/bfdes/bfdes.in/runs/131...

Elie

This item may help people the most:
stackoverflow.com/a/62244232/2602771

Markus Maga

Recommend taking a look at github.com/bahmutov/npm-install :) tried to make this effortless!

SangHee Kim

I tried bahmutov/npm-install and it took much longer to retrieve the cache than actions/cache did.

So, please keep the size of node_modules in mind when choosing one. :)

Markus Maga

As far as I know bahmutov/npm-install uses @actions/cache under the hood, so the cache retrieval itself should be the same as with the GitHub Actions cache.
But the other steps might still take longer, as it doesn't cache node_modules the way this post suggests as an improvement. So if that's what you compared it to, I'd expect different results!

As also mentioned in the post the "official" recommendation is to not cache the node_modules folder itself, as that can potentially cause trouble in some scenarios. You can probably avoid those by putting for example the node version in your cache key or so.

github.com/actions/cache/blob/main...

Note: It is not recommended to cache node_modules, as it can break across Node versions and won't work with npm ci

That said I also wish it was a bit faster 😅