Shipping performant applications is not an easy task. It’s probably not your fault that the app you’re working on is slow. But why is it so difficult? How can we do better?
In this article we’ll try to understand the problem space and how new-generation frameworks might be the way forward for the industry.
Google has been trail-blazing the way we measure performance, and nowadays, Core Web Vitals (CWV) are the de facto standard for determining whether our site delivers a good user experience.
The three main metrics are:
- Largest Contentful Paint (LCP) - how fast a user can see something meaningful on your page.
- First Input Delay (FID) - how fast a user can interact with your content, e.g., can a button or drop-down be clicked/touched and work?
- Cumulative Layout Shift (CLS) - how much content moves around or changes, which disrupts the user experience.
From these measurements, a cumulative score from 0 to 100 can be derived. Generally, any score above 90 is good. You can measure these metrics yourself in your browser’s dev tools under the “Lighthouse” tab, or use a tool (by Google) like PageSpeed Insights.
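For intuition, the thresholds Google publishes for each metric can be expressed as a tiny classifier. This is a hypothetical helper, not part of any library; the threshold values are Google's published "good"/"poor" boundaries:

```javascript
// Hypothetical helper: classify raw metric values against Google's
// published Core Web Vitals thresholds.
const THRESHOLDS = {
  lcp: { good: 2500, poor: 4000 }, // milliseconds
  fid: { good: 100, poor: 300 },   // milliseconds
  cls: { good: 0.1, poor: 0.25 },  // unitless layout-shift score
};

function rateMetric(name, value) {
  const t = THRESHOLDS[name];
  if (value <= t.good) return "good";
  if (value <= t.poor) return "needs-improvement";
  return "poor";
}

console.log(rateMetric("lcp", 1800)); // "good"
console.log(rateMetric("fid", 250));  // "needs-improvement"
console.log(rateMetric("cls", 0.3));  // "poor"
```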
To get some context and data from the wild, we can use Google’s CrUX (Chrome User Experience Report) or, more easily, some data extracted by the amazing Dan Shappir in his recent article in Smashing Magazine.
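As a sketch, querying CrUX field data for an origin looks roughly like this. The endpoint and metric names follow the public CrUX API (`records:queryRecord`), but treat the exact request shape as an assumption and check the current docs; `YOUR_API_KEY` is a placeholder:

```javascript
// Sketch: build (but don't send) a CrUX API request for an origin's
// field data. Endpoint and metric identifiers follow the public
// CrUX API; verify against the current documentation before use.
function cruxRequest(origin, apiKey) {
  return {
    url: `https://chromeuxreport.googleapis.com/v1/records:queryRecord?key=${apiKey}`,
    method: "POST",
    body: JSON.stringify({
      origin,
      metrics: [
        "largest_contentful_paint",
        "first_input_delay",
        "cumulative_layout_shift",
      ],
    }),
  };
}

const req = cruxRequest("https://example.com", "YOUR_API_KEY");
// Pass `req` to fetch() or your HTTP client of choice.
```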
Dan compared performance between frameworks by looking at websites built with them, considering only those with a green CWV score, which should indicate the best user experience.
His conclusion is that you can build performant websites with most of the current frameworks, but you can also build slow ones. He also mentions that using a meta framework (or web application framework, as he calls it) is not a silver bullet for creating performant applications, despite their SSG / SSR capabilities.
TasteJS encourages building the same applications across frameworks so that developers can compare the ergonomics and performance of these solutions.
For raw data and the methodology, see our TasteJS Movies Comparison spreadsheet.
Disclaimer: Perf measurement is hard! Chances are we did something wrong! Also, these are never apples-to-apples comparisons, so think of this more as a general discussion of trade-offs than a final word on performance. Please let us know if we could improve our methodology or if you think we missed something. Do note that the versions of the different frameworks are not the latest, nor do they reflect the best stack choices. For example, the Next.js app in these tests uses Next.js 12.2.5, React 17.0.2, and Redux. This is not in line with the new and improved approaches the Vercel team has implemented in Next.js 13, which leverage React 18, React Server Components, and concurrent features, and generally reduce client-side JS.
General methodology: For the PageSpeed Insights data, we ran each test 3 times and selected the highest score of the 3. These tests are also somewhat flaky: one day they show one score, and the next they deviate by 5-10 points up or down.
It’s worth noting that the deploy targets aren’t the same, which is another variable affecting performance.
Updated: 11.23 - Some apps may have been worked on since the time of writing this post.
The kind of performance we are interested in is startup performance. How long does it take from the time I navigate to a page until I can interact with that page so that I can get the information I want? Let's start with the filmstrip to get a high-level overview of how these frameworks perform.
The above filmstrips show that the Qwik version delivers content faster (i.e., better FCP - First Contentful Paint, as can be seen in the red box) than the other versions. This is because Qwik is SSR/SSG-first (Server-Side Rendering / Static Site Generation) and specifically focuses on this use case. The text shows up on the first frame and the image shortly after. The other frameworks have a clear progression of rendering where the text content is delivered over many frames, implying client-side rendering. (It's worth noting that although Next.js includes server rendering by default, the Next.js Movie app does not seem to fully server-render its pages, which likely affects its results.) Angular Universal is an interesting outlier: it shows the text content immediately, then goes to a blank page, only to re-render the content (I believe this is because Angular does not reuse the DOM nodes during hydration).
Let's look at the corresponding Page Speed Scores.
Let's dive deeper into how the PageSpeed score is calculated. We'll define the key metrics quickly and then explain more about them:
- TBT (Total Blocking Time): measures the total amount of time between First Contentful Paint (FCP) and Time to Interactive (TTI) during which the main thread was blocked for long enough to prevent input responsiveness.
- LCP (Largest Contentful Paint): LCP measures the time from when the user initiates loading the page until the largest image or text block is rendered within the viewport.
- TTI (Time to Interactive): The amount of time it takes for the page to become fully interactive.
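To make TBT concrete, here is a sketch of how it falls out of Long Task entries (the kind a browser `PerformanceObserver` reports): any main-thread task over 50 ms is a "long task," and only the portion beyond 50 ms counts as blocking time. The task data below is made up for illustration:

```javascript
// Sketch: derive TBT from long-task entries between FCP and TTI.
// Only the portion of each task beyond the 50 ms budget is "blocking."
function totalBlockingTime(longTasks, fcp, tti) {
  return longTasks
    .filter((t) => t.start >= fcp && t.start + t.duration <= tti)
    .reduce((sum, t) => sum + Math.max(0, t.duration - 50), 0);
}

const tasks = [
  { start: 600, duration: 40 },   // under 50 ms: contributes nothing
  { start: 900, duration: 200 },  // blocks for 150 ms
  { start: 1500, duration: 120 }, // blocks for 70 ms
];

console.log(totalBlockingTime(tasks, 500, 3000)); // 220
```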
As long as the application delivers the content through SSR/SSG, and images are optimized, it should get a good LCP. In this sense, no particular framework has any advantage over any other framework as long as it supports SSR/SSG. Optimizing LCP is purely in the developer's hands, so we are going to ignore it in this discussion.
Both TBT and TTI are important. TBT captures how long the main thread is blocked by the work a framework does on startup; many frameworks break that work into smaller chunks to spread the load. Qwik has a particular advantage here because it uses resumability. Resumability is essentially free because it only requires deserializing the JSON state, which is fast.
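A toy model makes the contrast clear. This is not Qwik's actual implementation, just a sketch of the two startup strategies: with hydration the client re-runs component code to rebuild state the server already had, while with resumability the client just deserializes state the server embedded in the HTML.

```javascript
// Toy model (not Qwik's real implementation) of the two strategies.

// Hydration: re-execute every component to recover its state.
function hydrate(components) {
  const state = {};
  for (const c of components) state[c.name] = c.render(); // runs app code
  return state;
}

// Resumability: the server serialized the state; the client parses it.
function resume(serializedState) {
  return JSON.parse(serializedState); // cost is O(state), not O(app code)
}

const components = [
  { name: "header", render: () => ({ open: false }) },
  { name: "cart", render: () => ({ items: 3 }) },
];
const serialized = '{"header":{"open":false},"cart":{"items":3}}';

// Both arrive at the same state; only the work done differs.
console.log(
  JSON.stringify(hydrate(components)) === JSON.stringify(resume(serialized))
); // true
```

Crucially, `resume` costs the same no matter how much component code the app contains, while `hydrate` grows with the app.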
The low cost of resumability is clearly evident in the above graph, where Qwik has the least TBT and TTI. It is worth pointing out that the TBT/TTI cost should stay relatively constant with resumability, even as the application grows more complex, whereas with hydration, TTI increases as the application gets larger.
Now let's look at the amount of JS delivered to the browser at startup.
I will claim that your TBT/TTI is directly proportional to the amount of initial code delivered and executed in the browser. Frameworks need to be in the “lazy loading and bundling” business to have any chance of delivering less code. This is harder than it sounds because a framework needs a way to break up the code so it can be delivered in chunks, and a way to determine which chunks are and aren't needed on startup. Except for Qwik, most frameworks keep bundling out of their scope. The result is overly large initial bundles that negatively impact startup performance.
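The chunking idea can be sketched in a few lines: instead of shipping every event handler in the initial bundle, register a loader per interaction and fetch the code only when it is first needed. The module and handler names here are hypothetical; in a real app the loader would be a dynamic `import()`:

```javascript
// Sketch: lazy-load interaction handlers on first use instead of
// bundling them all into the startup JS.
const lazyHandlers = new Map();

function register(event, loader) {
  lazyHandlers.set(event, { loader, module: null });
}

async function dispatch(event, ...args) {
  const entry = lazyHandlers.get(event);
  if (!entry.module) entry.module = await entry.loader(); // first use: load the chunk
  return entry.module.handler(...args);
}

// In a real app the loader would be `() => import("./add-to-cart.js")`;
// here we fake it to keep the sketch self-contained.
register("add-to-cart", async () => ({ handler: (qty) => qty + 1 }));

dispatch("add-to-cart", 1).then((n) => console.log(n)); // 2
```

This is, in spirit, what a resumable framework does for you automatically: the initial bundle contains almost nothing, and code arrives as interactions demand it.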
It is not about writing less code (all of these apps have similar complexity) but about being intelligent about which code must be delivered and executed on startup. Again, these can be complex apps, not just static pages.
WOW, look at the difference between the fastest and the slowest: the slowest takes 25 times longer to initialize than the fastest. Even the second fastest is still ten times slower than the fastest. Once again, it shows how much cheaper resumability is than hydration. And this is for a demo app; the difference would become even more pronounced in real-world applications with more code.
I find it telling that the other solutions presented on the TasteJS site have "optimized" versions, implying that the initial solution was not ideal and that a developer spent time "optimizing" it. The Qwik applications presented have not been optimized in any way.
“Movies” is a relatively simple app that does not approach the complexity of real-world applications. As such, all frameworks should get 100/100 at this point without the developer doing any optimization. If you can't get 100/100 on a demo, there is no hope of getting anything close to 100/100 in real production applications that deal with real-world complexity.
These comparisons show that application startup performance is directly correlated to the amount of JS that the browser needs to download and execute on startup. Too much JS is the killer of application startup performance! The frameworks which do best are the ones that deliver and execute the least amount of JS.
The implication of the above is that frameworks should take it as their core responsibility to minimize the amount of JS the browser must execute on startup. It is not about developers writing less JS or manually marking specific areas for lazy loading, but about frameworks not requiring all JS to be delivered and executed upfront.
Finally, while there is always space for developer optimization, it is not something that the developer should do most of the time. The framework should just produce optimized output by default, especially on trivial demo applications.
Nevertheless, the future of JS frameworks is exciting. As we’ve seen from the data, Astro is doing some things right alongside Qwik. Other noteworthy frameworks, such as Marko and Solid, are also paving the path forward with similar traits and strong performance benchmarks. We’ve come full circle in web development - from PHP/Rails to SPAs and now back to SSR. Maybe we just need to break the cycle.