Chasing the best DOM rendering performance with the hybrids library

#webdev #javascript #webcomponents #functional

This is the third in a series of posts about core concepts of hybrids - a library for creating Web Components from plain objects and pure functions.

It's been a while since the last post of the series. Let's catch up on what we have learned so far. The first post explained how hybrids makes it possible to define web components without class and this syntax, using a truly composable structure built from pure functions. The second post described the built-in cache and change detection mechanisms, which hide redundant lifecycle methods and keep data in sync in a declarative way. If you haven't read them yet, this is the moment to make up for it.

Finally, we can focus on one of the most critical features of all UI libraries - creating and updating the DOM. I think it is no surprise that hybrids implements this feature slightly differently from most libraries and frameworks:

Render is just another factory. The foundation of hybrids is the property descriptor concept. Instead of creating a separate internal structure, the library provides the render feature as one of the built-in property factories. This brings important benefits. For example, you are not forced to use it. If the built-in solution does not match your needs, you can create a custom render factory or define a local descriptor that renders and updates the DOM. Moreover, all the specific features built for this purpose are available for other definitions. They are part of the public API.
The render factory is template engine agnostic. Most projects force users to use the template solution chosen by the authors. Sometimes it is even impossible to use another one. It might look right - after all, this is considered to be the main objective of UI libraries. However, I believe that it is always better to have a choice. Of course, the hybrids render factory works out of the box with the built-in template engine, but you can easily replace it with React, lit-html or your favorite UI library (the only constraint is that it has to create and update the DOM).
You will always have the fastest solution. Whether you decide to use the render factory or not, and whatever template engine you apply, you will still benefit from the hybrids foundations. The cache will prevent redundant calculations, while the change detection mechanism will schedule updates at the end of the next frame in the browser.

I could list many other advantages, but let's face it - rendering the DOM is all about performance! How does it apply to hybrids? Even though being the fastest rendering library was never the primary goal, hybrids has provided performant DOM rendering from the very beginning. However, recent updates in the project show that some concepts had to be polished. I would like to share with you how I got to those changes, and how they helped hybrids chase the performance of the fastest libraries.

Trigger for investigation

Last December, Vincent Ogloblinsky wrote to me about the Web Components Benchmark project. He has created two suites of tests measuring the performance of web components UI libraries, as well as some mainstream frameworks. Thank you, Vincent, for adding hybrids to your project!

If you looked at the results of the Todo List test, hybrids was somewhere in the middle. The stress test result was more disturbing (the test renders thousands of elements of a Pascal Triangle with one hundred rows). The vanilla implementation took less than 3 seconds. What about hybrids? It took more than 10 seconds! I thought that the test implementation might be wrong, but after a closer look, it became clear that some of my own assumptions were wrong.
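To get a feeling for the size of that stress test, here is a minimal sketch that builds a Pascal Triangle with one hundred rows and counts its cells. It is not the benchmark's code; the `pascalTriangle` helper is hypothetical, and it assumes one rendered element per cell.

```javascript
// Build the first `rows` rows of a Pascal Triangle.
// BigInt keeps the values exact: the middle values of the later rows
// are larger than Number.MAX_SAFE_INTEGER.
function pascalTriangle(rows) {
  const triangle = [];
  for (let r = 0; r < rows; r += 1) {
    const prev = triangle[r - 1];
    const row = [];
    for (let c = 0; c <= r; c += 1) {
      // Edge cells are 1; inner cells are the sum of the two cells above.
      row.push(c === 0 || c === r ? 1n : prev[c - 1] + prev[c]);
    }
    triangle.push(row);
  }
  return triangle;
}

const triangle = pascalTriangle(100);
const cellCount = triangle.reduce((sum, row) => sum + row.length, 0);
console.log(cellCount); // 5050
```

Under that assumption, one hundred rows means 5050 cells to create and update in a single render.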

Recursion

When I ran the Pascal Triangle test on my local machine, the first thing I noticed was an error message in the console:

Uncaught RangeError: Maximum call stack size exceeded
    at WeakMap.get (<anonymous>)
    at c (render.js:20)
    at c (render.js:30)
    at c (render.js:30)
    at c (render.js:30)
    ...

Oops... The render factory was using recursion in the update process. As long as the list of elements to render was shorter than the call stack limit of the JavaScript engine (for V8 it is about 10k), everything worked. However, for one hundred rows, it blew up. I checked, and the safe number of rows is 95. I came very close to not discovering the problem!

By the way, the score of the test was even better than it should have been, as the computation stopped before reaching the end of the queue.

The obvious solution is to replace recursion with iteration, where you hold and replace the current item in a variable instead of calling the same function at the end of each step. The same computation using iteration is also much faster than with recursion.
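The difference can be sketched with a simple linked queue. This is only an illustration, not the render factory's code; the item shape (`update`, `next`) and the function names are hypothetical.

```javascript
// Hypothetical queue item: { update: Function, next: item | null }.
function buildQueue(length, onUpdate) {
  let head = null;
  for (let i = 0; i < length; i += 1) {
    head = { update: onUpdate, next: head };
  }
  return head;
}

// Recursive flush: one stack frame per item, so a long enough queue
// can exceed the call stack limit of the engine.
function flushRecursive(item) {
  if (!item) return;
  item.update();
  flushRecursive(item.next);
}

// Iterative flush: the current item is held in a variable and replaced
// on each step, so the stack depth stays constant.
function flushIterative(item) {
  let current = item;
  while (current) {
    current.update();
    current = current.next;
  }
}

let calls = 0;
const count = () => { calls += 1; };

flushRecursive(buildQueue(1000, count)); // fine for a short queue
flushIterative(buildQueue(200000, count)); // also fine for a very long one
console.log(calls); // 201000
```

Both functions do the same work on a short queue. Only the iterative one is independent of the queue length.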

DOM Events

The second problem I discovered was the change detection mechanism. It was built on top of DOM events. I thought that using them was the right decision. After all, the library is about HTML elements, and they have built-in support for listening to and dispatching events. Why should we create a custom structure if we can use "the platform"?

However, I missed one crucial fact - dispatching events can take half of the time of the rendering process if there are many elements. Take a look at this fragment of the Chrome DevTools performance chart:

When the Pascal Triangle items are connected for the first time, they dispatch a DOM event to trigger their render process (controlled by the change detection mechanism). The render property of the element listens to this event, which eventually triggers an update of the item's DOM. Dispatching the events takes more or less the same amount of time as putting the elements in the document. However, if you look at the chart again, you can see another thing - the update process is split between several animation frames after the initial render.

Multiple calls to `requestAnimationFrame` API

At the time when I was rapidly developing hybrids, the asynchronous rendering of React Fiber was a hot topic. Rendering that does not block user input was a tempting idea, and I judged it quite easy to implement. The render factory was already using the requestAnimationFrame API to schedule the update. The only thing I had to add was splitting the work if the update lasted too long.

We always dream of 60 FPS, so without thinking twice, I set a ~16ms budget. After the threshold, the rest of the work was done in the next animation frame (within its own ~16ms budget). No blocked user input, updates in the middle of rendering... It seems to be the holy grail... but it isn't. After each animation frame, the browser has to do a lot of work - recalculate styles, compose the layout, update the layer tree, and eventually paint all of that on the screen. Simple structures of elements rarely hit the threshold. If your structure is massive, on the other hand, the sum of separate executions spread across frames will always be higher than the cost of doing the work in a single one. But without splitting, we might block user input for a long time, right?
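The budget idea can be sketched roughly like this. It is a simplified illustration, not the old render factory's code; `flushWithBudget`, its options, and `countFrames` are hypothetical names. The clock and the frame scheduler are injectable, so the sketch also runs outside a browser with a fake clock where every task "costs" 1ms.

```javascript
// Run tasks until the frame budget is spent, then continue in the next frame.
// In a browser, the defaults use performance.now() and requestAnimationFrame.
function flushWithBudget(tasks, options = {}) {
  const {
    budget = 16,
    now = () => performance.now(),
    schedule = (callback) => requestAnimationFrame(callback),
    onDone = () => {},
  } = options;

  let index = 0;
  let frames = 0;

  function frame() {
    frames += 1;
    const start = now();
    while (index < tasks.length) {
      tasks[index]();
      index += 1;
      // Budget spent and work left: hand the rest to the next frame.
      if (index < tasks.length && now() - start >= budget) {
        schedule(frame);
        return;
      }
    }
    onDone(frames);
  }

  schedule(frame);
}

// Count the frames needed for 100 tasks of 1ms each (fake clock and frames).
function countFrames(budget) {
  let clock = 0;
  const pendingFrames = [];
  const tasks = Array.from({ length: 100 }, () => () => { clock += 1; });
  let framesUsed = 0;

  flushWithBudget(tasks, {
    budget,
    now: () => clock,
    schedule: (callback) => pendingFrames.push(callback),
    onDone: (frames) => { framesUsed = frames; },
  });
  while (pendingFrames.length > 0) pendingFrames.shift()();
  return framesUsed;
}

console.log(countFrames(16)); // 7
console.log(countFrames(Infinity)); // 1
```

With the budget, the same work is spread over seven frames, and the browser repeats its style, layout and paint work after each of them. Without the threshold, everything fits in one frame.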

To make it faster just do less

The above statement seems to be an obvious truth. But the authors of some libraries claimed in the past that JavaScript is fast enough - the only problem is the DOM. However, studying performance charts of the Pascal Triangle test taught me that every variable, function call or iteration has a cost. We can't avoid some of the work, but there can be fewer function calls, data structures can be more straightforward, and iterations can be reduced or scheduled smarter.

The results

On the 29th of May, hybrids hit a new major version with significant performance improvements. The change detection mechanism has been redesigned. The scheduler, which was an internal part of the render factory, is now available for all descriptors. Also, it does not use recursion in its implementation. Instead of attaching and removing event listeners, you should use the observe method. It's called in the property scope and only if the property value has changed (it also tracks all dependencies and notifies if they change). The callback is queued with the requestAnimationFrame API but without the threshold. As a result, the render factory is now implemented within 30 lines of code. The rest is now an internal part of the library.
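The behavior described above can be approximated with a small standalone sketch. It is not the hybrids source code; `createObserver` and `invalidate` are hypothetical names. The callback shape `(host, value, lastValue)` follows how I understand the observe method from the documentation, and dependency tracking is left out and replaced with a manual `invalidate()` call. In a browser you would pass `requestAnimationFrame` as the scheduler; here a fake frame queue is used.

```javascript
// Queue observers into a single scheduled flush and call back only
// when the observed value has actually changed.
function createObserver(schedule) {
  const queue = new Set();
  let scheduled = false;

  function flush() {
    scheduled = false;
    const entries = [...queue];
    queue.clear();
    // Plain iteration: no recursion and no time threshold.
    for (const entry of entries) {
      const value = entry.get(entry.host);
      if (value !== entry.lastValue) {
        const lastValue = entry.lastValue;
        entry.lastValue = value;
        entry.callback(entry.host, value, lastValue);
      }
    }
  }

  return function observe(host, get, callback) {
    const entry = { host, get, callback, lastValue: undefined };
    // Call the returned function whenever the value might have changed.
    return function invalidate() {
      queue.add(entry);
      if (!scheduled) {
        scheduled = true;
        schedule(flush);
      }
    };
  };
}

// Usage with a fake frame queue instead of requestAnimationFrame.
const frames = [];
const observe = createObserver((callback) => frames.push(callback));
const host = { firstName: "John", lastName: "Doe" };
const log = [];

const invalidate = observe(
  host,
  (h) => `${h.firstName} ${h.lastName}`,
  (h, value, lastValue) => log.push([value, lastValue]),
);

invalidate();
invalidate(); // coalesced into the same frame
frames.shift()(); // log: [["John Doe", undefined]]

host.firstName = "Jane";
invalidate();
host.lastName = "Roe";
invalidate();
frames.shift()(); // one callback for both changes: ["Jane Roe", "John Doe"]

invalidate();
frames.shift()(); // value did not change, so no callback

console.log(log.length); // 2
```

Several changes within one frame end up as a single callback, and a flush with an unchanged value does nothing.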

If you wonder how those changes apply to the Pascal Triangle test, I am happy to say that the time dropped from 10 to 4.3 seconds. It's now less than half of the previous result! The test takes place in a throttled environment (CPU and network are slowed down), so the differences between scores are more important than the absolute times. You can check out the other results on the project home page.

The hybrids library is not yet number one in the Pascal Triangle test. However, take into account that this is a synthetic test. I would not recommend creating a UI that renders more than five thousand elements at one time. What is worth mentioning is how hybrids performs when we increase the number of rows. When we change the length property from one hundred to one hundred and one, the re-render takes 100ms in the throttled environment, while without throttling, it's less than 35ms!

On the other hand, the Todo List test is much closer to real usage. Before the changes, hybrids was somewhere in the middle, but now the results are much better - in some areas it is even close to the best in the competition!

Unleashing the power of the cache

Decoupling change detection from the DOM has one unique hidden goal, which is not directly related to performance issues. From now on, it is possible to attach the cache mechanism to objects that are not HTML elements. Why is it important? A few months ago, I started working on a new built-in factory - the store. The main goal is to create state management for asynchronous data using all of the hybrids goodies. Without the ability to apply the cache mechanism to that data, it would not be possible. As usual in hybrids, this factory won't be another clone of an existing solution. The idea is to combine fetching, storing, caching, and serving data to your elements in as seamless a way as possible. Stay tuned for more details in the coming months!

What's next?

DOM rendering will only be as fast as its weakest point. The render factory is, for now, free of performance issues, but what about the template engine? In the next post of the series, we will learn more about the features of the built-in template engine. Even though it may look similar to lit-html at first, a closer look reveals unique patterns taken from the core concepts of the library.

In the meantime, you can read more about the library in the project documentation.

🙏 How can you support the project? Give the GitHub repository a ⭐️, comment below ⬇️ and spread the news about hybrids to the world 📢!

Cover photo by Saffu on Unsplash

DEV Community