Pere Sola

Posted on Feb 10, 2021

How is a website rendered?

#css #html #beginners

Recently I was asked about the steps that take place when a website is rendered. Weeeeeeell, the DOM gets built (I suppose) and then... the stuff shows up on the screen! Right? Hmm, I am sure I can step up my game, so in good rabbit-hole fashion, I set out on a quest to research the steps.

I was told that, alongside the DOM tree, there is something called the CSS tree. So I Googled css tree vs dom tree. The first result was this Stack Overflow thread. From there, I landed here.

Constructing the DOM and CSSOM tree

DOM stands for Document Object Model, and it is an API that allows the developer to interact with the nodes (it is a tree) in the browser. To build these nodes, the browser receives the HTML over the network and runs it through a process called tokenization. The article says that tokens are groups of characters which provide a template for the DOM tree ('input' will become a token [input]). The browser then uses these tokens to determine how they relate to each other and builds the nodes. The nodes are used to build the tree. So, for instance, the article explains that:

[div][span][p][/p][img/][/span][ul][li][/li][/ul][/div]

becomes:

       [div]
       /   \
      /     \
   [span]   [ul]
    /  \       \
   /    \       \
 [p]    [img]  [li]
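
A minimal Python sketch of that idea, using the article's bracket notation rather than real HTML. The names Node, tokenize, build_tree and outline, as well as the "#document" root, are made up for illustration; a real browser parser is far more involved.

```python
from __future__ import annotations

import re
from dataclasses import dataclass, field


@dataclass
class Node:
    tag: str
    children: list[Node] = field(default_factory=list)


def tokenize(markup: str) -> list[str]:
    # Each [...] group becomes one token, e.g. "div", "/div" or "img/".
    return re.findall(r"\[([^\]]+)\]", markup)


def build_tree(tokens: list[str]) -> Node:
    root = Node("#document")  # hypothetical root that holds the top-level element
    stack = [root]            # elements that are currently open
    for token in tokens:
        if token.startswith("/"):
            # Closing tag: the current element is finished.
            stack.pop()
        elif token.endswith("/"):
            # Self-closing tag: add it, but it can never have children.
            stack[-1].children.append(Node(token[:-1]))
        else:
            # Opening tag: add it and make it the current parent.
            node = Node(token)
            stack[-1].children.append(node)
            stack.append(node)
    return root


def outline(node: Node, depth: int = 0) -> list[str]:
    # Indented text view of the tree, one line per node.
    lines = ["  " * depth + node.tag]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines


tree = build_tree(tokenize("[div][span][p][/p][img/][/span][ul][li][/li][/ul][/div]"))
print("\n".join(outline(tree)))
```

The stack of open elements is what turns a flat list of tokens into parent/child relationships: whatever sits on top of the stack is the parent of the next node.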

We now have the DOM tree, and the browser can style each DOM node. The author calls this process reflow: the CSS is converted into a structure similar to the DOM tree, called the CSSOM tree (CSS Object Model). So far so good. The author promises a follow-up post, called Constructing the CSSOM Tree. But guess what... he never wrote it! Anyway, the earlier SO thread is packed with resources, so I moved on to the next one.

Webpage rendering

The title, What Every Frontend Developer Should Know About Webpage Rendering, sounds really exciting. This is what I am looking for, so I dive in.

It first goes through a recap of the steps that take place when a browser renders a page, expanding upon what I learned in the previous article:

The DOM (Document Object Model) is formed from the HTML that is received from a server.
Styles are loaded and parsed, forming the CSSOM (CSS Object Model).
On top of the DOM and CSSOM, a rendering tree is created, which is a set of objects to be rendered (Webkit calls each of those a "renderer" or "render object", while in Gecko it's a "frame"). The render tree reflects the DOM structure except for invisible elements (like the <head> tag or elements that have display:none; set). Each text string is represented in the rendering tree as a separate renderer. Each of the rendering objects contains its corresponding DOM object (or a text block) plus the calculated styles. In other words, the render tree describes the visual representation of a DOM.
For each render tree element, its coordinates are calculated, which is called "layout". Browsers use a flow method which only requires one pass to lay out all the elements (tables require more than one pass).
Finally, this gets actually displayed in a browser window, a process called "painting".

The author then explains that a repaint happens when only the style of an element changes, so the browser just 'repaints' the element. However, if a change affects content, structure or position, something called reflow takes place. So we have a second meaning of reflow: the first author used the word for the step after the CSSOM is created. Which brings me to the 3rd resource shared in the original SO thread, this one more extensive.

Behind the scenes of modern web browsers

These are the components of a browser:

Browsers have a rendering engine, responsible for displaying content on the browser screen. Firefox uses an engine called Gecko, while Safari and Chrome use one called Webkit (that is what the article says; Chrome has since moved to Blink, a fork of Webkit).

The rendering engine gets the contents of the document requested by the user in the browser from the networking layer. The article goes through the steps, which we already know now:

The rendering engine parses the HTML document and turns the tags into DOM nodes - this is the content tree.
The same goes for the styles, which are parsed into the CSSOM tree.
Both are used to create another tree - the render tree.
The render tree goes through a layout process - giving each node the coordinates where it should appear on the screen.
The painting stage - the render tree is traversed and each node is painted using the UI backend layer (see image above).

Stage 1 - Parsing HTML

In step 1 above, the HTML parsing algorithm is key to creating the DOM nodes. The author says that this algorithm is described in detail by the HTML5 specification and has 2 stages: tokenization and tree construction.

Tokenization

Parses the input into HTML tokens: start tags, end tags, attribute names and attribute values.

Tree construction

The tokenizer recognises each token and passes it to the tree constructor before moving on to recognise the next token. And so on and so forth:

During tree construction, the DOM tree, with the Document at its root, is modified and elements are added as we go along.
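
The hand-off between tokenizer and tree constructor can be sketched with Python's standard html.parser module. This is only an illustration: HTMLParser is not the HTML5 algorithm a browser implements, and the TreeBuilder class, the dict-based nodes and the partial VOID_TAGS list are assumptions made for this example.

```python
from __future__ import annotations

from html.parser import HTMLParser

# Partial list of elements that never have a closing tag (illustrative only).
VOID_TAGS = {"img", "br", "hr", "input", "meta", "link"}


class TreeBuilder(HTMLParser):
    """Records each token it receives and grows a tree as tokens arrive."""

    def __init__(self) -> None:
        super().__init__()
        self.root = {"tag": "#document", "attrs": {}, "children": []}
        self.open_elements = [self.root]  # stack of currently open nodes
        self.tokens: list[tuple] = []

    def handle_starttag(self, tag, attrs):
        # Start tag token, with its attribute names and values.
        self.tokens.append(("start", tag, dict(attrs)))
        node = {"tag": tag, "attrs": dict(attrs), "children": []}
        self.open_elements[-1]["children"].append(node)
        if tag not in VOID_TAGS:
            self.open_elements.append(node)

    def handle_endtag(self, tag):
        # End tag token: close the current element if it matches.
        self.tokens.append(("end", tag))
        if len(self.open_elements) > 1 and self.open_elements[-1]["tag"] == tag:
            self.open_elements.pop()

    def handle_data(self, data):
        # Text between tags becomes a text node under the current element.
        text = data.strip()
        if text:
            self.tokens.append(("text", text))
            self.open_elements[-1]["children"].append(
                {"tag": "#text", "text": text, "attrs": {}, "children": []}
            )


builder = TreeBuilder()
builder.feed('<div id="main"><p>Hello</p><img src="cat.png"></div>')
builder.close()

for token in builder.tokens:
    print(token)
```

Each handle_* call plays the role of the tree constructor receiving one token, which is why the tree is already partially built while the rest of the input is still being read.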

Stage 2 - parsing CSS

It is now the turn of CSS to be parsed. The CSS file is parsed into a Stylesheet Object:
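
As a rough idea of what "parsed into a Stylesheet Object" could mean, here is a toy parser in Python. The parse_css function and the list-of-dicts shape are assumptions for illustration; it handles only flat "selector { property: value; }" rules and ignores comments, at-rules, nesting and everything else a real CSS parser must deal with.

```python
import re


def parse_css(css: str) -> list[dict]:
    """Turn flat CSS rules into a list of {selector, declarations} dicts."""
    rules = []
    # Every "selector { body }" pair is one rule.
    for selector, body in re.findall(r"([^{}]+)\{([^{}]*)\}", css):
        declarations = {}
        for declaration in body.split(";"):
            if ":" in declaration:
                prop, value = declaration.split(":", 1)
                declarations[prop.strip()] = value.strip()
        rules.append({"selector": selector.strip(), "declarations": declarations})
    return rules


stylesheet = parse_css("p { color: red; margin: 0 } .hidden { display: none; }")
for rule in stylesheet:
    print(rule["selector"], rule["declarations"])
```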

Stage 3 - render tree

The render tree is composed of visual elements in the order they will be displayed: it is the visual representation of the document. It is not a 1 to 1 relation with the DOM elements, because there may be DOM nodes that are not displayed, e.g. the head tag. Building the render tree requires calculating the visual properties of each render element to be displayed on the screen, which takes resources.
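
A small Python sketch of why the render tree is not 1 to 1 with the DOM. Everything here is simplified on purpose: nodes are plain dicts, styles are looked up by tag name only (real selector matching and the cascade are much richer), and NON_VISUAL_TAGS and build_render_tree are names invented for this example.

```python
# Tags that never produce anything on screen (illustrative, not exhaustive).
NON_VISUAL_TAGS = {"head", "title", "meta", "script"}


def build_render_tree(node: dict, styles: dict):
    """Return a render node for a DOM node, or None if it is not displayed."""
    style = styles.get(node["tag"], {})
    if node["tag"] in NON_VISUAL_TAGS or style.get("display") == "none":
        return None  # the whole subtree is left out of the render tree
    children = [build_render_tree(child, styles) for child in node.get("children", [])]
    return {
        "tag": node["tag"],
        "style": style,  # each render node carries its calculated style
        "children": [child for child in children if child is not None],
    }


dom = {
    "tag": "html",
    "children": [
        {"tag": "head", "children": [{"tag": "title"}]},
        {"tag": "body", "children": [{"tag": "h1"}, {"tag": "p"}, {"tag": "aside"}]},
    ],
}
styles = {"h1": {"font-size": "32px"}, "aside": {"display": "none"}}

render_tree = build_render_tree(dom, styles)
print(render_tree)
```

Here head and aside exist in the DOM but never make it into the render tree.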

Stage 4 - layout process

The render tree above does not have position and size. These values are calculated during the layout (or reflow) stage. Positions use a top and left coordinate system relative to the root frame, the HTML document. The root element position is 0,0 and its dimension is the viewport.
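
To see how coordinates can fall out of a single pass over the tree, here is a deliberately naive Python sketch. It assumes every box is a block that takes the full available width and stacks vertically, and that leaf boxes carry a made-up intrinsic_height value; layout and layout_document are hypothetical names, not anything a browser exposes.

```python
def layout(box: dict, x: int, y: int, width: int) -> int:
    """Assign x, y, width and height to a box and its children; return the height."""
    box["x"], box["y"], box["width"] = x, y, width
    children = box.get("children", [])
    if not children:
        # Leaf: use the (assumed) intrinsic height of its content.
        box["height"] = box.get("intrinsic_height", 0)
        return box["height"]
    cursor = y
    for child in children:
        # Each child starts where the previous one ended.
        cursor += layout(child, x, cursor, width)
    box["height"] = cursor - y
    return box["height"]


def layout_document(root: dict, viewport_width: int, viewport_height: int) -> dict:
    # The root sits at 0,0 and takes the viewport's dimensions.
    layout(root, 0, 0, viewport_width)
    root["height"] = viewport_height
    return root


page = {
    "tag": "html",
    "children": [
        {"tag": "header", "intrinsic_height": 80},
        {"tag": "main", "children": [
            {"tag": "p", "intrinsic_height": 40},
            {"tag": "p", "intrinsic_height": 60},
        ]},
    ],
}
layout_document(page, 800, 600)
for box in (page["children"][0], page["children"][1], *page["children"][1]["children"]):
    print(box["tag"], box["x"], box["y"], box["width"], box["height"])
```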

Stage 5 - painting

This is the stage where the render tree is traversed and the content is "painted" on the screen. There is an order

In response to a change, the browser may simply repaint (stage 5) if only the visual aspect of an element changed, or lay out and repaint (stages 4 and 5) if a position changed, a DOM element was added, etc.
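
That decision can be pictured as a small lookup. The Python sketch below is an assumption-heavy illustration: the two property sets are short examples rather than a complete or authoritative list, stages_to_rerun is a made-up function, and real engines decide this with far more nuance.

```python
# Example properties that only change how an element looks.
PAINT_ONLY = {"color", "background-color", "visibility", "outline-color"}


def stages_to_rerun(changed_properties: set, dom_changed: bool = False) -> list:
    """Guess which stages must run again after a change."""
    if not changed_properties and not dom_changed:
        return []
    if not dom_changed and changed_properties <= PAINT_ONLY:
        return ["paint"]  # stage 5 only
    # Anything else (size, position, added nodes, unknown properties)
    # is treated conservatively as needing layout and then paint.
    return ["layout", "paint"]  # stages 4 and 5


print(stages_to_rerun({"color"}))
print(stages_to_rerun({"width"}))
print(stages_to_rerun(set(), dom_changed=True))
```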

That's it for today. If you want to continue geeking out on the topic, next step may be Google's Web Fundamentals Docs. Happy reading!

Top comments (2)

GrahamTheDev • Feb 10 '21

Great article, filled a couple of gaps in my knowledge.

I don't think the parse through to layout process is a single task anymore, depending on the CSS (pure speculation, which is why I'm asking: you might actually be able to help me understand something that bugs me!).

See this question and answer on Stack Overflow, where there were some strange layout shifts on a significantly long page with flexbox.

My understanding is poor, but from observation it appears that the parsing of the HTML was split into 2 tasks: the browser lays out / renders the first part early, then discovers the second part and lays out / renders that, resulting in a layout shift that on paper shouldn't happen (CSS is inlined, seems to be valid etc.).

It appears to be flexbox related, just wondering if you knew why that would be as I never quite understood?

This is a page that exhibits this behaviour

I would love to know the cause of this as it bugged me that I could not find an answer on it and my best guess was not satisfactory!