Anton Korzunov
✂️ Code splitting - What, When and Why

  • What? Literally a "what" - a "thing", usually a component. What else could you desire?
  • When? Much harder to define, since for many of you it’s not when, but where - at a route or a component level. However, when is a very flexible question - it is when to split, as well as when to load a deferred chunk.
  • Why? The best question ever. Even if code splitting is cool, it has to have good reasons to exist. It should make sense. It should be worth it.

Let’s make the first stop here.

Why

It does not matter which reasons for code splitting you have in mind - having many bundles (for better caching), or not sending the client code it doesn’t need (according to the coverage) - there is only one true reason to use code splitting.

One, and only one, true reason - it should make things faster.

However, “faster” is not a very clear metric. Faster at what? Faster for whom? Faster because of what?
Your app is a complex thing - HTML, CSS, images, JavaScript, memory, network, CPU - everything could be a bottleneck, everything could be poorly utilised, and everything could be no problem at all.

Today there are only two real limitations - network (getting the data over the air) and CPU (using it somehow). Everything has a cost: JS has a cost and JSON has a cost. However, as said - the CPU cost of JS is much bigger than the network cost (unless you are physically unable to download the required script), and it's much harder to execute it than to download it. Let me cite The Cost Of JS again:

IMG !== JS

Let’s say it differently - all devices have roughly the same download speed in the same place; it could be good or it could be bad, and you can't control it. It also doesn’t matter which device your customer uses - it's an environment limitation, not something you can fix with a better phone.
However, you might get a better device in terms of CPU, and you probably did - but someone else might not. That means that, in terms of CPU capabilities, your users are going to use VERY different laptops or phones, and this is something you can "fix".

Long story short: it’s ok to prefetch or preload scripts (network bound), and not ok to execute them all (CPU bound).

Think about it. You don’t have to make your scripts smaller - smaller does not always mean faster - but you have to execute as little as possible. You just have to make them faster. (And yes, usually that means smaller, but not always.)

The cost of JS is the sum of smaller costs of smaller operations: one or more parse passes, and the execution itself, function by function, module by module.
You can’t bail out of the first parse, you can control the second parse, but execution is all in your hands.

The Cost of JSON proposes a simple way to make it faster - eval it. As long as the majority of us are not using CSP with eval disabled, using Webpack's devtool: "eval" might speed up JS startup time, if there is a lot of code in your bundle which you don't need to call right now.
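To make the idea tangible, here is a minimal sketch (not webpack's actual output) of the trick devtool: "eval" relies on: keeping code as a string defers its parse and execution until first use, instead of paying for it at startup.

```javascript
// Module code kept as a string - the engine does not parse or execute it
// at script startup, only when lazyModule() is first called.
const moduleSource = "({ greet: (name) => `hello, ${name}` })";

let cached;
function lazyModule() {
  // parsed and executed only on first call, then cached
  if (!cached) cached = eval(moduleSource);
  return cached;
}

console.log(lazyModule().greet("world")); // hello, world
```

The same principle, applied per module by the bundler, is what can shave off startup time when a big part of the bundle is "not called yet".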

Let me cite The Cost Of JS again, and spot the "parse" part on the provided graph - it's just a small part of everything, not the whole thing.
Spot the parse

Roughly, parse is close to 30% of the overall script "cost", so you could read it as "your scripts would be 3 times faster" (if you only parse them, not execute them). In my personal experiments I've found that parse might take around 10% of the overall time. Just 10%.

And there are different tools to measure it.

So, the goal is to not execute something you don't need to execute, yet.

When/Where

And when is all about controlling the execution flow. Like “execute a module function only when you need it”. Ok, so when do you "need it"?

If you don’t need something right now - don’t import it right now. Do it when you need it - literally the lazy execution model as it should be. Long story short - that's not how your code works (unless you are lucky).

Fun fact - you don’t even need to move “something you don’t need right now” into a separate bundle - the only thing you have to do is not depend on that stuff statically.

For example, you can use good old require, which you might call when you need it. Don’t like cjs? Well, there is a magic Webpack comment for a synchronous dynamic import - [import(/* webpackMode: "eager" */ ...)](https://webpack.js.org/api/module-methods/#magic-comments):

"eager": Generates no extra chunk. All modules are included in the current chunk and no additional network requests are made. A Promise is still returned but is already resolved. In contrast to a static import, the module isn't executed until the call to import() is made.

"good require": I've used this pattern for react-focus-on - using require helped to speed up the script by 23ms(x6 CPU slowdown applied) almost for "free", deferring code execution till "the time of use".

The same “good require” is available via inlineRequires in the Metro bundler, as well as the lazy option of the common-js Babel plugin.

And here is one more cool thing about “when” - JS is still synchronous and single-threaded, and so is your network - you don’t have to download everything first and then execute everything at once (the script defer attribute) - you'd better interleave network and CPU somehow: execute and download. Actually, Chrome already parses and compiles your JS in a background thread (aka Script Streaming), but execution will always happen in the main thread, and it will be the bottleneck.

However, it looks like Chrome 78 improves this, allowing off-main-thread compilation for all scripts, making the "parse part" almost "free" - read more about it.

Execute and download. Sounds simple, and some things from the future, like WebAssembly and preloading esm modules, would improve this moment even further. However, any JS is expected to be executed sooner or later, and has to be downloaded first, and then, in any case, executed. So a situation where you download/prefetch everything might defer the "first script" readiness and make everything even a bit slower - first you overload your network downloading stuff while your CPU is almost idle, then your network becomes idle, but your CPU goes 🔥🔥🔥. It's all about the sequence of events...

So what's the answer? Don't run long tasks, and let the browser do something. Citing The Cost Of JavaScript yet again:

Long Tasks

However, network<->CPU shenanigans are common in real production applications, as they lack any orchestration and have ideological problems with “why” and “when”. And usually also with “what”.

What?

Of course, components. What else could you split? And what’s the problem?
React provides only React.lazy, which supports components, and only components.

And that component should be loaded only via dynamic import, due to Lazy's interface - a promise with a .default - Lazy accepts only the default export of a module, and that's intentional. Even if you can construct such a promise on your own (you can), resolve it with whatever you want (easy), wrap it in whatever you need (why not) - the initial intention for the React lazy API was a tighter future integration with bundlers, so doing anything except a plain import can be considered an antipattern.
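For clarity, this sketch shows the shape React.lazy consumes - a thunk returning a promise that resolves to a module namespace object with a `default` export. The names here are hypothetical; `import('./MyComponent')` produces exactly this shape, and (as said above) hand-rolling it is discouraged with bundler-aware libraries:

```javascript
// fakeImport mimics what `import('./MyComponent')` resolves to:
// an object whose `default` field is the component itself.
const fakeImport = () =>
  Promise.resolve({ default: function MyComponent() { return 'hello'; } });

// React.lazy(fakeImport) would accept this shape and render mod.default
fakeImport().then((mod) => {
  console.log(typeof mod.default); // function
});
```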

It is also the easiest way to break loadable-components or react-universal-component - both libraries expect nothing else but the import, which they can decode to extract the "real file" from, later using that information for, well, deeper bundler integration. You have been warned - don't use artificial promises with code splitting (with any library which does not clearly allow it)!

However, this is quite an unhelpful answer to the “what” you could or should code split.

  • Components - yes, you can. All code-splitting solutions support them.
  • Libraries - yes, you can. All code-splitting solutions have support for them, sometimes built-in, sometimes as a wrapper around their API (loadable-components, react-loadable, react-imported-component).
  • Resource files, like i18n messages - they are the same “libraries”; however, almost nobody loads them in a “code splittable” way, thus losing all the benefits of SSR import usage tracking.
  • Code you don’t need straight away - yes, you can. But almost nobody, except Facebook, is doing it (using sidecars for delayed interactivity).

What is also almost always entangled with when and where, as in "what could you code split here and now?".
What? - A component. Where? - At a route level. And what then? When are you going to start loading the deferred chunk? What are you going to display while your route is loading? A full-page spinner? Nothing? Are there any options here?

There are three answers to "when are you going to start loading":

  • the first one is the most popular, and also wrong - load when the LazyComponent gets rendered. So you will have nothing to display, and might provide a worse user experience.
  • the second one is not quite common - use "HTML" prefetch. I mean <link rel='prefetch'>, to ask the browser to silently download something "you might need in the future" while it is idle. Not all code splitting solutions support it, and there are some problems with it - bundlers do not provide any extra API for this, except "magic comments" (and not all bundlers provide even that).
  • the third is my favourite - manual prefetch, or even prediction. If you know which route is likely to be fetched next (using guessjs or your own knowledge) - prefetch it after loading the current one. Or preload something behind a link when the user points at the link - you will have up to 300ms to do it, and that could be enough to load almost everything... (I hope)

There are two good examples for the third answer - one is the loadable-components documentation about prefetch, and the second one is a prefetchable React.lazy:

// `matchPath` comes from react-router; `routes` is your route config,
// whose `component` entries expose a `preload` method (as with
// react-loadable or a prefetchable React.lazy wrapper)
const findComponentForRoute = (path, routes) => {
  const matchingRoute = routes.find(route =>
    matchPath(path, {
      path: route.path,
      exact: route.exact
    })
  );
  return matchingRoute ? matchingRoute.component : null;
};

// kick off the chunk download for the route behind `path`
const preloadRouteComponent = (path) => {
  const component = findComponentForRoute(path, routes);
  if (component && component.preload) {
    component.preload();
  }
};

// start loading on hover - up to ~300ms before the actual click
<Link
  to={to}
  onMouseEnter={() => preloadRouteComponent(to)}
  {...rest}
/>

There is a hidden "bonus" in this example - you can preloadRouteComponent the "current route" before your application starts.

And there are three answers to the question "what could you use as a loading indication":

  • a spinner. Disgusting option 🤮
  • the old page. Display the old page while the new one is loading - in other words, block the transition. Easy to do with Redux-First-Router, and hard to do with React-Router.
  • your header or side navigation. I.e. some parts of your app which persist from page to page.

Surprisingly, the same effect could be achieved with less pain once you move the split point behind a route (as seen in react-loadable), or use templates for routes, or just nested routes, which will keep "common" components (like the page header) completely untouched between page transitions.

domains

However, this is still quite an unhelpful answer to the “what” you could or should code split.

There is the original problem, the code splitting paradox:

  • small apps are small enough that you can’t remove any part of them. Thus you can’t reduce the size below some level, and that level nowadays is a bit above the “recommended” size limit.
  • big apps are complex and entangled enough that, even with code splitting, you will pull in so many different pieces that the resulting amount of code will still be huge.

That’s the problem with code splitting - how to get it working “right”, and how to get something valuable out of it - not just splitting one big bundle into many smaller ones which still load the same amount of code on the client side.

Not a joke. An import here and an import there might force the bundler to create multiple chunks, but they could all be loaded together if you haven't detangled them properly. Like 5 chunks, 10kb each, all loaded, plus a 2Mb chunk of code common to all these small pieces. Probably that's not the goal.

So, yet again - What's the goal?

The goal of code splitting is (you will be surprised!) not to split, but to separate. The idea is to create independent domains which do not require anything from each other, thus do not need code from each other, thus require less code to run. Sounds simple?

Unfortunately, it's easier said than done - there are too many ways to entangle your code and lose the benefits of code splitting.

JFYI: once you import('XXX') from YYY, you give your bundler the possibility to create three (3) chunks - XXX, YYY, and everything common between XXX and YYY, in case YYY is also "dynamic". In a bad case, 99% of the code would end up in that common chunk, or might even be merged back into YYY.

Some people think that microservices, which we love so much on the backend, are the answer to this domain separation, and that is almost true. But there is no such thing as an isolated microservice - they are all talking to each other, doing something, depending on each other (or on the big fat database).

Long story short - the only way to code split is to isolate, and that's not as easy as you might think.

To be more concrete - it's actually easy - there are many techniques to do it, from Dependency Injection and dynamic import itself to just proper module system management. And I would say - the technical solutions (import, lazy and everything else) are less important than the module system setup. Less important than code domain separation, isolation and splitting.

It's all about the module system, and nothing but the module system! And there are only 3 things to talk about:

1 - how to combine separated things together, which will also answer how you might split and detangle your components.
2 - how to control this process.
3 - what to do while something deferred is loading.

1 - how to split and combine

  • import - dynamic import is your friend. An obvious case.
  • lazy - the same dynamic import, but for a component.
  • DI - if you pull some extra functionality out of one component and inject it via props from another component, you will make the first one "smaller" and detangle the use cases (like pulling "default" icons off a component would help icon-free use cases). Composition, combination and deferring side effects - this is a very powerful, and mostly non-technical, approach to handling a complex application.
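The DI point above can be sketched in a few lines. This is a deliberately simplified, non-React sketch (plain functions and strings instead of components), just to show the shape of the idea: the Button never imports an icon itself, so icon-free call sites never pull icon code into their chunk.

```javascript
// Button has no static dependency on any icon module -
// whoever needs an icon injects it via the `icon` prop.
function Button({ icon, label }) {
  return `[${icon ? icon + ' ' : ''}${label}]`;
}

// icon-free call site: no icon module ends up in this bundle
console.log(Button({ label: 'Save' }));            // [Save]

// a call site that wants an icon imports and injects it explicitly
console.log(Button({ icon: '*', label: 'Save' })); // [* Save]
```

The dependency direction is flipped: the "heavy" extra (icons, formatters, validators) is a detail of the caller, not of the shared component.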

2 - how to control this process

Multiple entry points, directory indexes, import-cost and size-limit. And since last week - bundlephobia displays per-export sizes.

You have to understand the cost of things you are using, and the cost of their combination - the cost of things you build on top.

You have to understand how big the small pieces of your application are, and what they require to live. Defining a size-limit entry for every component and feature, to understand how big something is, is the first step towards optimisation and towards your ability to reason about the sizes of bigger things.

Like - you have a Button, and it is small. And you have a Select, and it is not quite small in kb - like twice as big as the Button. Now you can compare comparable entities and reason about their size.

Like - you have your app, and it's 20Mb of JS. You're like - well, that's the modern frontend, shit happens. However, once you start breaking it down, measuring the size of every piece, you will be able to reason about what is actually big, why you need that something, what makes it bigger than it should be, and when your app could be smaller.

Using size-limit you can measure not only the size, but also the time of a component. By deferring not the loading but the execution of a chunk, you might save this extra time, even if you can't save the extra size.

...Some people think that microservices, which we love so much on the backend, are the answer to this domain separation. However, microfrontends - isolated pieces of your app, extracted into separate packages - are. It's just easier to contain, isolate and measure them. And stuff like (webpack) DllPlugin would literally double the outcome of this approach, establishing real boundaries between domains...

As a conclusion - don't forget one thing: a popular mistake is to look only at chunk sizes, or at the bundle-analyzer report. No, that's not how code splitting works - entangled things keep being entangled forever.

3 - and what when?

And then comes the most(!) important part - how to make the UI and UX better with code splitting, not worse. Including:

  • displaying something meaningful while you are fetching the deferred code
  • providing sound fallbacks and failbacks
  • assembling all the pieces together faster (than the one big thingy you had before)
  • predicting user actions and prefetching yet-missing data
  • improving the result via Server Side Rendering, Progressive Rendering and Hydration
  • and so on.

And the next step would be more technical...

Next step

Now that we are all a bit sceptical about code splitting, it's time to go deeper into the technical details...

Follow to the next article.

But before you leave

But before you leave - one quick question :)

"What code-splitting would definitely if not ruin, then make much more complex?"

CSS. It makes CSS selectors a much more fragile thing, as they are heavily affected by the order of CSS rule declarations, aka CSS Order Matters, and that order will be affected by the order in which you open pages and thus load chunks. It would be aaabsolutely unpredictable, requiring different techniques to solve.
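A minimal sketch of the mechanics (not the real CSSOM, just a simulation of the cascade rule that bites you here): with equal specificity, the rule declared later wins - and "later" now depends on which chunk's stylesheet got inserted last.

```javascript
// Simulating stylesheet insertion order: each chunk appends its rules.
const insertedRules = [];
const insertChunkStyles = (rule) => insertedRules.push(rule);

// user visits page A first (its chunk ships `.btn { color: red }`),
// then navigates to page B (its chunk ships `.btn { color: blue }`)
insertChunkStyles({ selector: '.btn', color: 'red' });  // chunk A
insertChunkStyles({ selector: '.btn', color: 'blue' }); // chunk B

// with equal specificity, the last declared rule wins the cascade
const winner = insertedRules.filter(r => r.selector === '.btn').pop();
console.log(winner.color); // blue - visit the pages in reverse order and you get red
```

The button's colour now depends on navigation history, which is exactly the unpredictability described above.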
