Nico Clack

Posted on Sep 13, 2023 • Updated on Sep 20, 2023

Best Practices for Service Workers in 2023

#pwa #javascript #webdev #svelte

Note: This article was renamed from "Everything I Learnt From Making a Service Worker Build Plugin".

Since I've now finished my 3rd service worker build plugin, I thought I'd share what I learnt, as I had to figure out a surprising amount by myself. This post also compiles and expands on the suggestions I thought were relevant from the following:

"How to Fix the Refresh Button When Using Service Workers" by Dan Fabulich
"A guide to Service Workers - pitfalls and best practices" by Ayman Farhat
"Service Workers Tips" by Matteo Mazzarolo
"Handling updates offline-first - HTTP 203" published by Google Chrome Developers

Quick Explanations

If you're trying to make your own build plugin, you'll need to know the APIs in more detail, but here's a quick rundown just so you can understand this post...

Service workers intercept requests from the clients (tabs and windows) they control. When a service worker is first downloaded or when it's updated, it enters the "installing" stage, firing the "install" event. Once that event finishes (and doesn't throw), it becomes "waiting". If there's no existing active service worker, this one will become it*, but otherwise it'll wait until there are no clients before becoming active. This means you can have both an active and a waiting worker simultaneously. However, you can have the waiting service worker call skipWaiting, which is what some of this post is about.

*But it won't control the existing clients unless clients.claim is called.

With that all out of the way, here's everything I learnt and what I now consider to be best practices for service workers...

#0: They're Complicated

While the basics of the Service Worker API are relatively simple, things get complicated once you start trying to do things properly. Because of this, you'll likely end up making your own general service worker build plugin, rather than something specific to a particular project. This is probably a bit more sensible than making your own JavaScript framework, and you'll learn a lot, but just keep in mind that it could take months (my most recent took about 4).

If you can, use something like Workbox or, if it's suitable, my own: Versioned Worker (wink wink nudge nudge).

#1: Version Your Cache

Starting with a simpler thing: make sure you use a unique ID for each version of your app you store with the Cache API. In Versioned Worker, I have the ID follow this format:

{the prefix set in the config}-{the app version}

Alternatively, you could generate a random string at build time instead of having a version number.

Each service worker should then only serve from its own cache to avoid issues from mismatched versions. Remember that once a new service worker is installed, there'll usually still be another service worker that's active, and the new worker shouldn't interfere with it.

This approach also makes it easier to remove old caches, as you can just delete all the other caches (not the current one) that have your prefix.
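
As a rough sketch of the naming and cleanup described above (not Versioned Worker's actual code), the cleanup could look something like this. The prefix, the version and the deleteOldCaches helper are all made-up names for illustration, and the cache storage is passed in as an argument so the function can be tried outside of a service worker:

```javascript
// Hypothetical values; a build plugin would normally inject these.
const CACHE_PREFIX = "my-app";
const APP_VERSION = "42";
const CURRENT_CACHE = `${CACHE_PREFIX}-${APP_VERSION}`;

// Deletes every cache that has our prefix, except the current one.
// `cacheStorage` is the global `caches` object in a service worker.
async function deleteOldCaches(cacheStorage, prefix, currentName) {
    const names = await cacheStorage.keys();
    const stale = names.filter(
        name => name.startsWith(`${prefix}-`) && name !== currentName
    );
    await Promise.all(stale.map(name => cacheStorage.delete(name)));
    return stale; // The names that were deleted
}

// In the service worker, run it once this version has taken over:
// addEventListener("activate", event => {
//     event.waitUntil(deleteOldCaches(caches, CACHE_PREFIX, CURRENT_CACHE));
// });
```

Doing the cleanup in the "activate" listener rather than the "install" one means the previous worker has already stopped being the active one by the time its cache is removed.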

#2: Put Your Service Worker in the Right Folder

In my opinion, and possibly also Mozilla's, the importance of the "scope" argument in navigator.serviceWorker.register is generally overstated. Since you can only use it to narrow the scope of a service worker, I'd instead think about what folder* you're going to have the service worker file in. Most of the time, it should be at the root of your site, meaning it has a URL like example.com/service-worker.js. However, if some routes have separate codebases, like if you're using GitHub Pages, you probably want it to look more like this:

example.com/Text-Editor/service-worker.js, which controls tabs with URLs starting with example.com/Text-Editor
And example.com/Cool-Game/service-worker.js, which controls tabs with URLs starting with example.com/Cool-Game

*Yes, I know URLs don't really have folders, but I think it's a good way to think about this.
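
To make the "folder" idea a bit more concrete: when no scope is passed to register, the scope defaults to the script's URL resolved against "./". Here's a small sketch of that calculation. The defaultScope helper is just a name I've made up for illustration, it isn't part of any API:

```javascript
// Works out the scope a service worker gets when none is passed to
// navigator.serviceWorker.register: the "folder" containing the script.
function defaultScope(scriptUrl) {
    return new URL("./", scriptUrl).href;
}

const rootScope = defaultScope("https://example.com/service-worker.js");
const editorScope = defaultScope("https://example.com/Text-Editor/service-worker.js");
// rootScope is "https://example.com/"
// editorScope is "https://example.com/Text-Editor/"
```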

#3: Service Workers Intercept All Requests From Controlled Clients

Somewhat embarrassingly, I only recently found out that requests outside of the service worker's scope (including cross origin requests) are still intercepted. There are some restrictions when it comes to CORS, but the Service Worker API really does give you a lot of control. It turns out that the scope actually just dictates what clients are controlled. And all requests from controlled clients are intercepted.
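
If you want to see this for yourself, a sketch like the following can be used to label the requests that reach the fetch listener. The requestKind helper and its labels are hypothetical, and all three kinds will show up for a controlled client:

```javascript
// Labels a request seen by the "fetch" listener by where it's going.
// `scope` is registration.scope inside the service worker.
function requestKind(requestUrl, scope) {
    const url = new URL(requestUrl);
    if (url.origin !== new URL(scope).origin) return "cross-origin";
    return url.href.startsWith(scope) ? "in-scope" : "out-of-scope";
}

// In the service worker:
// addEventListener("fetch", fetchEvent => {
//     const { url } = fetchEvent.request;
//     console.log(requestKind(url, registration.scope), url);
// });
```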

#4: Take Advantage of Reload Opportunities

Moving on to a more complicated one: in an SPA, you might be able to sneak in a reload for an update as part of certain actions. Pretty much any in-app navigation can work, but think about how much data you'll need to transfer and how noticeable it'll be to users. The client should also check with the service worker that it's the only client, since you don't want to have 2 versions running at once. In Versioned Worker, this process looks a bit like this:

reloadOpportunity is called, optionally with a URL to navigate to as part of it and some state to give to the new version.
If there's no waiting service worker, this process stops.
It asks the service worker how many clients it has.
If there's more than 1, this process stops.
The client then sets a SessionStorage item to mark there being some state and sends the state to the waiting service worker. Alternatively, you can just store the state in SessionStorage.
The waiting service worker then holds onto the state by using ExtendableMessageEvent.waitUntil in the message listener and calls skipWaiting. This should ensure that the service worker remains running until the state is requested again in a second.
The client either receives a message from the service worker telling it to reload, or a "controllerchange" event is detected, triggering a reload.
The client reloads using the new version and notices the SessionStorage indicates there's some state.
It asks the now active service worker for the state, then uses the data to put the app back in a similar state to before.
The promise provided to the waitUntil resolves, allowing the browser to shut down the service worker again*.

*See #5.
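
The client's side of the first few steps above could be sketched like this. It's a simplified illustration rather than Versioned Worker's real implementation: the countClients callback (which would message the active service worker), the SessionStorage key and the message type are all names I've made up, and everything is passed in as arguments so the logic can be tested in isolation:

```javascript
// Returns true if an update was started, otherwise false.
// - registration: the ServiceWorkerRegistration
// - countClients: hypothetical async function that asks the active
//   service worker how many clients it has
// - storage: sessionStorage in a real page
// - state: whatever the new version needs to restore the app
async function reloadOpportunity({ registration, countClients, storage, state }) {
    const waiting = registration.waiting;
    if (!waiting) return false; // No update is waiting

    if ((await countClients()) > 1) return false; // Another client is open

    // Mark that there's some state to restore after the reload
    storage.setItem("has-update-state", "1");
    // The waiting worker is expected to hold onto the state and call skipWaiting
    waiting.postMessage({ type: "skipWaitingWithState", state });
    return true;
}
```

The reload itself isn't triggered here, as that happens once the service worker replies or the "controllerchange" event fires.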

And in MPAs, this is even easier, provided you implement #10.

#5: When the Service Worker Starts and Stops

Service workers almost always start because of an event they received (that might even be the only time?). They can be stopped anytime there aren't any ongoing ExtendableEvents, but browsers will generally let them idle for a bit while they're controlling clients. Browsers will typically kill misbehaving service workers though. For example, Chrome will kill them if either of the following happens:

The main thread of the service worker is blocked for 30 seconds or more
An ExtendableEvent runs for more than 5 minutes*

*Though, I wonder if Chrome allows that if there are some other ongoing ExtendableEvents that are currently under that limit.

Rather confusingly, a lot of resources state service workers "live" (they also have the quotes) forever, referring to how they're saved until the site data is cleared. So let me clear this up: saving in this context just means that the browser stores the necessary information to restart the service worker, sometimes involving a bytecode cache. It doesn't mean that the state of the service worker is preserved: each time it restarts, all the top-level code is re-run.

More information: Service worker FAQ

#6: Wait Until the Page has Loaded/Hydrated Before Registering

I haven't experimented too much with this one, but in Versioned Worker I do wait until hydration has finished before registering. This is so the service worker doesn't compete with the page for bandwidth and CPU time. Remember that the service worker isn't used for the very first page load anyway.
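
Here's a minimal sketch of delaying the registration. Note that it waits for the window's "load" event as a stand-in, since how you detect hydration finishing depends on your framework. The registerWhenLoaded name is made up, and the window is passed in so the function doesn't depend on globals:

```javascript
// Registers the service worker once the page has finished loading.
// Resolves to the registration, or null if service workers aren't supported.
function registerWhenLoaded(win, scriptUrl, options) {
    if (!("serviceWorker" in win.navigator)) return Promise.resolve(null);

    return new Promise((resolve, reject) => {
        const register = () => {
            win.navigator.serviceWorker.register(scriptUrl, options).then(resolve, reject);
        };

        if (win.document.readyState === "complete") register(); // Already loaded
        else win.addEventListener("load", register, { once: true });
    });
}

// Usage in a page:
// registerWhenLoaded(window, "/service-worker.js");
```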

#7: How the Service Worker Itself is Cached

The request for the service worker itself isn't intercepted by the active service worker. The same also applies to its imports (both importScripts and import) and fetches. Therefore, you should avoid storing any of these in CacheStorage, except potentially the resources it fetches.

Starting a few years ago, all the major browsers decided to ignore the cache headers for the service worker entry. Instead, the whole file is requested (unless updateViaCache is set to "all") and compared byte-by-byte.

Because of this, and also because doing so breaks things, you should avoid changing the service worker filename.

As a failsafe, the browser will also treat the HTTP cache for it as stale if it's more than 24 hours old. This also seems to apply to imports, but I haven't been able to fully verify this (the browser might just be freeing up space).

#8: Use an importScripts Wrapper

Because of how the caching works, it's a good idea to use an importScripts wrapper for your service worker to improve the efficiency of update checks. The code looks like this:

// service-worker.js
importScripts("assets/innerSw.<computed hash>.js");
/* ^ You can also use a search parameter to cache bust, like this:
  "assets/innerSw.js?v=42"
*/

// assets/innerSw.<computed hash>.js
/* <the actual service worker code> */

Then make sure you've left the updateViaCache option of navigator.serviceWorker.register as its default value of "imports". The browser will now only download service-worker.js to check for updates.

Tip: If you can, set your server to cache the inner service worker indefinitely as we've got cache busting. Note that the HTTP cache will be ignored by the browser for the outer service worker, unless updateViaCache is set to "all".

#9: Don't Require a Service Worker to be Active

Despite browser support for service workers being good for the past few years, they should still be a progressive enhancement. This is because they aren't/can't be used in the following situations:

The first load. Although if you have to, you can wait for it to install and then have it claim the client.
Hard reloads.
In Firefox private windows.
In Firefox if cookies are disabled or set to be cleared on quit.
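
For the first load case, if you really do have to wait for the service worker, a sketch of it might look like this. The waitForController helper is a made-up name, and it takes the container (navigator.serviceWorker in a real page) as an argument. Keep in mind that in the other situations listed above the promise may never resolve, so don't block anything important on it:

```javascript
// Resolves once the page has a controlling service worker.
function waitForController(container) {
    if (container.controller) return Promise.resolve(container.controller);

    return new Promise(resolve => {
        container.addEventListener(
            "controllerchange",
            () => resolve(container.controller),
            { once: true }
        );
    });
}

// In the service worker, claim the page that registered it:
// addEventListener("activate", activateEvent => {
//     activateEvent.waitUntil(clients.claim());
// });
```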

#10: How to Fix the Browser's Reload and Work Around Other Quirks

Dan Fabulich already wrote a whole article just on the first bit, but I've got some more to add.

The problem is that when you reload a webpage, the new page is loaded before the old one is discarded. Because of this, the browser doesn't swap out the active service worker with the waiting one.

The main oversight in his guide is that during the first navigation into the scope of a service worker, the client isn't in clients.matchAll. This means that opening a 2nd in-scope tab when there's a waiting service worker will cause the 1st one to refresh*!

*If the page is set to reload when the controller changes, which I do in Versioned Worker.

It also doesn't work in Firefox due to a bug that still hasn't been fixed. Thankfully, this can be worked around by allowing the page to be reloaded for an update within 300ms of it loading. This is pretty unnoticeable to users and surprisingly even means the PWA can update more smoothly in Firefox than Chrome, as Chrome takes about a second to reload the page. Removing the logic around checking for a waiting service worker from the active worker might fix this, but I haven't experimented with that yet.

The other thing is that with his code, the page can get stuck loading. At least in Chrome, there just seems to be some weirdness with calling skipWaiting when there are ongoing requests. Because of this, I suggest calling it every 100ms until the returned promise resolves.
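
A sketch of that retry loop could look like the following. The function name and the attempt limit are my own additions for illustration, and the skipWaiting function is passed in (you'd use () => self.skipWaiting() in a worker) so it can be tried outside of one:

```javascript
// Calls skipWaiting again every `intervalMs` until one of the calls resolves.
// Returns the number of calls it took.
async function skipWaitingWithRetry(skipWaiting, intervalMs = 100, maxAttempts = 50) {
    let markDone;
    const anyCallDone = new Promise(resolve => { markDone = resolve; });

    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
        // A call that rejects is ignored, as the next attempt retries anyway
        skipWaiting().then(() => markDone(true), () => {});

        let timer;
        const timedOut = new Promise(resolve => {
            timer = setTimeout(() => resolve(false), intervalMs);
        });
        const done = await Promise.race([anyCallDone, timedOut]);
        clearTimeout(timer);

        if (done) return attempt;
    }
    throw new Error("skipWaiting didn't resolve in time");
}
```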

Anyway, in Versioned Worker, this all looks like this:

When the active service worker receives a GET request with a mode of "navigate", it checks if there's a waiting worker
If there is one, it checks if there's 1 client or less with clients.matchAll()
If that condition is also met, it sends the reload page (more on that in a second)
Otherwise it sends the page as normal
If, on a non-reload page, there's a waiting service worker within 300ms of the page loading, tell the waiting service worker to potentially skip waiting
If the waiting service worker gets the message, it checks if there's 1 client or less
If that condition passes, skipWaiting is called and awaited with a timeout of 100ms
If it times out, the active service worker is told to finish and the client is told to reload
Because the active service worker was told to finish, the client is sent the reload page
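
The check the active service worker makes in the first few steps could be sketched as below. Again, this is a simplified illustration and not the real Versioned Worker code: shouldServeReloadPage is a made-up name, and the registration and clients objects are passed in rather than read from the worker's globals:

```javascript
// Decides if a request should get the reload page instead of the normal page.
// In a service worker, pass `registration` and `clients` from the global scope.
async function shouldServeReloadPage(request, registration, clients) {
    // Only navigations can be swapped for the reload page
    if (request.method !== "GET" || request.mode !== "navigate") return false;

    // There needs to be an update to switch to
    if (!registration.waiting) return false;

    // And it should only happen when no other clients would be affected
    const currentClients = await clients.matchAll();
    return currentClients.length <= 1;
}
```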

Then the code for the reload page is similar to this:

<script>
    (() => {
        onload = () => {
            navigator.serviceWorker.getRegistration().then(async registration => {
                if (! registration?.waiting) {
                    return reloadOnce();
                }

                sendConditionalSkip();
                while (true) {
                    const start = performance.now();
                    const event = await new Promise(resolve => {
                        navigator.serviceWorker.addEventListener("message", resolve, {
                            once: true
                        });
                        setTimeout(() => resolve(null), 500);
                    });

                    if (
                        event?.data?.type === "vw-skipFailed" // More than 1 client
                        || (! registration.waiting)
                    ) {
                        return reloadOnce();
                    }

                    sendConditionalSkip();

                    await new Promise(resolve => {
                        setTimeout(() => resolve(), 100 - (performance.now() - start));
                    });
                }

                function sendConditionalSkip() {
                    registration.waiting?.postMessage({
                        type: "conditionalSkipWaiting"
                    });
                }
            });
            navigator.serviceWorker.addEventListener("controllerchange", reloadOnce);


            let reloading = false;
            function reloadOnce() {
                if (reloading) return;
                reloading = true;

                location.reload();
                setInterval(() => location.reload(), 1000);
            }
        }
    })();
</script>

Note: The "conditionalSkipWaiting" message is the same skip waiting message that was mentioned earlier. However, the active service worker won't be told to finish as part of it. Also note that if the waiting service worker tells this page that there's more than 1 client, the page will reload without an update, fixing that flaw I mentioned earlier.

Due to the back/forward cache, it's also important that the reload page is sent with the Cache-Control header set to "no-store". You should also have some code to reload the page again if the user is sent back to the old page after the app was reloaded for an update. Versioned Worker's implementation of this is a bit more complicated as it has some timeouts and other logic, but here's a basic approach:

let reloading = false;
function reloadOnce() {
    if (reloading) return;
    reloading = true;

    location.reload();
    window.addEventListener("pageshow", () => location.reload());
}

Hopefully that tip was still of some use to you considering the complexity. If anyone's interested, I could see if I can make a proper tutorial on it.

#11: Don't Cache non-oks

With the longest one now done, here's a simpler one... Remember that fetch doesn't throw for error responses, so if you're using Cache.put, you could end up caching them. So, check that the ok property is true and handle things accordingly. If this is in an "install" event listener, you might want to make the promise passed to waitUntil reject, as that'll cause the install to fail. If you're using an async function for this, you can just throw inside it:

addEventListener("install", installEvent => {
    installEvent.waitUntil(
        (async () => {
            const resp = await fetch("/something");
            // ^ Network errors will throw by themselves, so no try/catch needed
            if (! resp.ok) throw new Error("Install failed");

            // ...
        })()
    );
});

#12: Passthrough Fetch Listeners Can be Fine

A passthrough fetch listener looks like this:

addEventListener("fetch", fetchEvent => {
    fetchEvent.respondWith(fetch(fetchEvent.request));
});

Many articles say to avoid doing things like this and I think it's true to some extent. Certainly if you just have that code, you might as well not register a fetch listener. However, having a passthrough listener allows you to use and/or modify the response before sending it, which can be useful. For example, by default in Versioned Worker, I always call respondWith for same origin requests, since the alternative is more confusing and usually not worth the headache.

So you can assess potential tradeoffs, here are the actual consequences of passthrough listeners like this:

Streamed responses can't be terminated properly. This is particularly problematic with video and audio, which often stop responses when the user skips.
Request priorities might be less precise or lost entirely.
There's a slight increase in the initial latency. I've typically measured 0.6ms per request when the site is served over localhost, but it might be different over the internet.

#13: The Cache API Doesn't Support Range Requests

This combined with #12 makes offline streaming of video and audio an interesting challenge I haven't attempted yet. For now, in Versioned Worker I've just decided to treat range requests as full requests if it's in the cache list. But partial content responses can be saved if you chunk them up yourself and either store them in IndexedDB or as a virtual route (like /__prefix-that-doesnt-clash__/video/1/2) in CacheStorage.

In case you're wondering, Cache.put just throws an error if the response status is 206 (partial content).
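
Since a 206 response still has its ok property set to true, the check from #11 isn't enough by itself to avoid that error. Here's a small sketch of a guard that covers both cases. The isCacheable and putIfCacheable helpers are names I've made up, and the cache is passed in as an argument:

```javascript
// Cache.put rejects partial (206) responses, even though `ok` is true for them.
function isCacheable(response) {
    return response.ok && response.status !== 206;
}

// Stores the response if possible, returning whether or not it was stored.
async function putIfCacheable(cache, request, response) {
    if (!isCacheable(response)) return false;

    // Clone it so the original can still be sent to the client
    await cache.put(request, response.clone());
    return true;
}
```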

#14: Cache Carefully

When caching resources, it's important you think about how you'll update them. While you can't accidentally stop yourself from updating your service worker, you don't want to be fighting it while you're releasing a critical fix. I use a slightly unconventional approach in Versioned Worker of generating files that tell the service worker what's changed, so I can't comment too much on the matter, but think about your durations and the effects of them.

Also, be careful when storing responses that were from the browser's HTTP cache, as they might be outdated already (but not necessarily stale yet). If you need to ensure a response is up-to-date, fetch it with the cache option set to "no-cache" to check the cache with the server or "no-store" to ignore the cache entirely.
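
In fetch, this is set through the cache option of the request. As a minimal sketch (the fetchFresh name is made up, and the fetch function can be swapped out for testing):

```javascript
// Fetches a resource, making the browser revalidate any HTTP cache entry
// with the server first. Use "no-store" instead to skip the cache entirely.
function fetchFresh(url, fetchFunction = fetch) {
    return fetchFunction(url, { cache: "no-cache" });
}

// Usage in a service worker:
// const response = await fetchFresh("/app.js");
```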

Phew, that's everything. Hopefully you found this useful. Let me know in the comments if you've got any questions. 👋
