Build hacks - Faster Ember builds with Docker on Windows

#docker #javascript #node #ember

When I joined a team maintaining an Ember web app, I was surprised to learn that almost the whole team developed exclusively on MacBooks. The team experienced slow Ember builds on Windows, and dealing with native Node dependencies (such as node-gyp) was a frustrating experience. Microsoft has made some recent improvements to support Node-based development environments on Windows, so I set out to see what we could do to make this better.

Note: WSL2 has been announced, which resolves many of the performance pains we experienced. This post should still be relevant for those wanting to use Docker as a development container.

Just show me the code!

A working demo of the Docker setup is available on GitHub. We'll link to it throughout this article.

Why are builds so slow?

Ember's build pipeline creates a lot of temporary files, which we confirmed by using Process Monitor. Our suspicion was that the Windows NTFS filesystem has more overhead than the filesystems on other platforms, and that our main bottleneck was creating a large number of temporary files on disk and then reading them back.

An example of some of the temporary files created during a build:

Our first approach to speeding up builds was to leverage the Windows Subsystem for Linux (WSL), which simulates a Linux environment without using a VM. You can find more detail here on how the filesystem mapping works, but the important part is that the host's native filesystem (NTFS) is still used to store the underlying files.

A screenshot of local filesystem activity running builds under WSL:

We confirmed our expectation that builds would be as slow as they were on a native Windows environment, so we moved on to other options. Our next step was to get the build workspace out of NTFS entirely, which meant using some kind of VM. Docker for Windows turned out to be a great fit for this.

What we needed

An easy setup for all Windows developers on the team. The only requirements on the host should be Docker and .NET Core.
Avoid (where possible) native dependencies on the host (such as build-essential or node-sass bindings)
A running dev server in the container (ember serve in this case) that can be notified when files change, which serves built assets over HTTP
Very fast access to read and write a bunch of temporary files

Configuring the container

We settled on running the entire Ember build pipeline within Docker and using the container's Linux-based filesystem, with some scripts to sync over just the application source from the host workstation. Let's go into detail on how this was accomplished.

Tools used:

Docker exposes the application source via a shared /host-app mount. This is always in sync with the host, but it's a poor place for temporary files, since it's exposed as an SMB mount point. At container start, the source is copied from the host to /app, a directory within the container's own filesystem, and then the build process runs. It's important that the node_modules restore happens within the container and not over the shared mount, so that the build has fast access to its dependencies. Arguments passed to the Docker CLI via --build-arg can be used to control which steps run during the image build, such as an initial unit test run.
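
As a rough illustration of the copy-then-build step, here is a minimal sketch of what a container start script could look like. The function names are made up for this example, the choice of tar for the copy is an assumption, and it assumes yarn and ember-cli are already on the PATH inside the container. The demo repository may do this differently.

```shell
# Sketch only: function names are hypothetical, not taken from the demo repo.

# Copy the application source from the shared mount into the container's
# own filesystem. node_modules is skipped so that dependencies are restored
# inside the container instead of being copied over the slow mount.
sync_source() {
  src=$1
  dest=$2
  mkdir -p "$dest" || return 1
  # Stream the tree through tar; --exclude drops any node_modules directory.
  tar -C "$src" --exclude=node_modules -cf - . | tar -C "$dest" -xf -
}

# Restore packages and start the dev server from the container filesystem.
start_dev_server() {
  app_dir=$1
  cd "$app_dir" || return 1
  yarn install || return 1
  # Bind to all interfaces so the host can reach the server over HTTP.
  exec ember serve --host 0.0.0.0
}
```

A start script could then call `sync_source /host-app /app` followed by `start_dev_server /app`.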

Notifying the container of updates

Tools used:

The /host-app mount does not raise notifications when files change, so we need a way to sync over changes to the container's /app directory. We could use polling but that's slow and uses unnecessary CPU time, so instead we built a tool that simulates file change notifications from the container host. The DockerVolumeWatcher tool uses the Windows Filesystem APIs to watch for all files changed within directories that are mapped to containers via host mounts, ignoring anything listed in .dockerignore.

When a file is changed, chmod is run within the container on that file (via chmod $(stat -c %a {filepath}) {filepath}) to raise the file changed event to the container's running processes. This hack works well for this case, as it doesn't actually modify the file contents on the host. Using a tool like touch would trigger another file modification event, which we don't want here. From here, a simple mirroring tool (such as lsync) can be used to copy the changed source from /host-app to /app.
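
To make the chmod trick concrete, here is a small sketch. It assumes a `stat` that supports `-c` (GNU coreutils or BusyBox) inside the container. Both function names are hypothetical, and running the command through `docker exec` is an assumption about how a host-side tool could deliver it, not a description of DockerVolumeWatcher's internals.

```shell
# Re-apply a file's current permission bits. The mode stays the same, but
# the chmod call is still a metadata update that file watchers can observe.
raise_change_event() {
  file=$1
  mode=$(stat -c %a "$file") || return 1
  chmod "$mode" "$file"
}

# Hypothetical host-side wrapper: run the same command inside a container.
# DOCKER can be overridden (for example with "echo") for a dry run.
notify_container() {
  container=$1
  file=$2
  "${DOCKER:-docker}" exec "$container" \
    sh -c 'chmod "$(stat -c %a "$1")" "$1"' sh "$file"
}
```

Because only the existing mode is written back, neither the permissions nor the contents of the file end up different.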

Making the developer experience even better

Building containers creates a lot of artifacts, and after a few days of building new images, the Docker filesystem may run out of space. To counter this, we made a PowerShell script that runs as part of starting up the dev environment and does a few things:

Start DockerVolumeWatcher
Clean up containers and images older than 24 hours
Sanity check that the FS watcher is working by creating a file on the host and checking for its existence via docker exec

You can check out the source for the script here.
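
For readers who want the gist without opening the script, here is a rough shell equivalent of the cleanup and sanity check steps. The real script is PowerShell, so treat this as an illustration only. The function names, the marker file name, and the five-attempt retry are all assumptions made for this sketch.

```shell
# DOCKER can be overridden (for example with "echo") for a dry run.
DOCKER=${DOCKER:-docker}

# Remove stopped containers and unused images created more than 24 hours ago.
# Note that prune leaves running containers alone.
cleanup_old_artifacts() {
  "$DOCKER" container prune --force --filter "until=24h" &&
    "$DOCKER" image prune --all --force --filter "until=24h"
}

# Create a marker file in the host source directory, then check from inside
# the container that it shows up. Checking /app exercises the watcher and the
# mirror; checking /host-app would only verify the mount itself.
check_watcher() {
  host_dir=$1
  container=$2
  container_dir=${3:-/app}
  marker=".watcher-check-$$"
  : > "$host_dir/$marker" || return 1
  status=1
  attempt=0
  while [ "$attempt" -lt 5 ]; do
    if "$DOCKER" exec "$container" test -f "$container_dir/$marker"; then
      status=0
      break
    fi
    attempt=$((attempt + 1))
    sleep 1
  done
  rm -f "$host_dir/$marker"
  return "$status"
}
```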

Rough edges

This setup works well but requires a few workflow changes. Some VS Code plugins still require a recent version of Node on the host for linting support. Package updates also require attaching to the container, running yarn add <package>, and copying over the changed manifest with cp /app/package.json /host-app/package.json (same with the lockfile). Rebuilding the container after packages have been updated is also slower than a native package update, as the container starts from a fresh state. To work around this, you can create a "delta" and run package restore twice:

COPY --chown=user:user ./package-base.json ./package.json
COPY --chown=user:user ./yarn-base.lock ./yarn.lock

# Restore initial packages (cached in future container builds)
RUN yarn

COPY --chown=user:user ./package.json .
COPY --chown=user:user ./yarn.lock .

# This should be very fast, since it only restores missing packages
RUN yarn
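
The manual package update steps described before the Dockerfile snippet could also be wrapped in a small helper. This is a sketch, not something from the demo repository: `add_package` is a made-up name, and it uses `docker exec` rather than attaching to the container interactively.

```shell
# DOCKER can be overridden (for example with "echo") for a dry run.
DOCKER=${DOCKER:-docker}

# Add a package inside the container, then copy the updated manifest and
# lockfile back to the shared mount so the host sees the change.
add_package() {
  container=$1
  package=$2
  "$DOCKER" exec --workdir /app "$container" yarn add "$package" || return 1
  "$DOCKER" exec "$container" cp /app/package.json /host-app/package.json || return 1
  "$DOCKER" exec "$container" cp /app/yarn.lock /host-app/yarn.lock
}
```

With this in place, `add_package <container> <package>` would replace the three manual steps.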

Switching branches on the host also does not work very well, as hundreds of file notifications are generated at once. Sometimes the container has to be restarted to get back into a good state.

How fast is this, really

Results taken using a median after 5 passes, on an Intel Xeon E-2176M processor with 32 GB RAM and SSD.

The build was run with administrative privileges so the Ember build could use symlinks to speed up the build. More info here

Environment	Package restore	First build	Watch-mode rebuild
Windows native	67.51s	120.04s	6.017s
WSL	164.67s	208.13s	33.52s
Docker container	118.81s	70.61s	0.68s
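
If you want to collect similar numbers on your own machine, a median over several passes can be taken with a small harness like the one below. This is a sketch and not the harness used for the table above. It assumes GNU `date` (for `%N`) and `awk`, and both function names are made up.

```shell
# Print the median of the numbers on stdin (one per line).
median() {
  sort -n | awk '
    { v[NR] = $1 }
    END {
      if (NR % 2) print v[(NR + 1) / 2]
      else print (v[NR / 2] + v[NR / 2 + 1]) / 2
    }'
}

# Run a command the given number of times, printing elapsed seconds per run.
time_runs() {
  runs=$1
  shift
  i=0
  while [ "$i" -lt "$runs" ]; do
    start=$(date +%s.%N)
    "$@" >/dev/null 2>&1
    end=$(date +%s.%N)
    awk -v s="$start" -v e="$end" 'BEGIN { printf "%.3f\n", e - s }'
    i=$((i + 1))
  done
}
```

For example, `time_runs 5 ember build | median` would print the median of five build times.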

Bonus: Containers for continuous integration builds

Many CI services, such as GitHub Actions and Travis CI, support using a Dockerfile as the build recipe. If your build requires complicated setup steps, such as installing a specific version of Chrome or creating symlinks to other folders, using a Dockerfile avoids having to keep commands synchronized between CI scripts and local dev scripts.
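
One way to share a single entry point between CI and local development is a tiny wrapper around docker build. In this sketch, `build_image`, the `RUN_TESTS` build argument, and the `ember-dev` tag are all hypothetical. The Dockerfile would need a matching ARG and a step that checks it. The `CI` environment variable is set to `true` by both GitHub Actions and Travis CI.

```shell
# DOCKER can be overridden (for example with "echo") for a dry run.
DOCKER=${DOCKER:-docker}

# Build the image, enabling the unit test step only when running under CI.
build_image() {
  if [ "${CI:-}" = "true" ]; then
    run_tests=true
  else
    run_tests=false
  fi
  "$DOCKER" build --build-arg "RUN_TESTS=$run_tests" --tag ember-dev .
}
```

Locally this skips the test run for a faster build, while the CI job runs the exact same command and picks up the test step automatically.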

Thanks for reading!

This was a fun experiment to see how fast we could get local builds. We're also testing out the Remote Containers extension for VS Code, and we look forward to using WSL2 when it releases in June 2019 to see how we can simplify this setup without sacrificing speed!

If you made it this far, consider getting involved with an OSS project you use on a daily basis. Chances are they could use a hand updating documentation, tests, or fixing some bugs. The .NET Foundation project list is a good place to start if you're looking for projects that need help.

Cheers 🍻

I'm on Twitter @dustinsoftware

Thanks to Tamar Kornblum and Frank Tan for reviewing earlier drafts of this post.