The Browser render Story (3 Part Series)
Table of contents
- Overview
- From raw bytes of HTML to DOM
- From raw bytes of CSS to CSSOM
- Laying out the render tree
- The Process of rendering JS
This article explains how your browser converts HTML, CSS, and JavaScript into a working website that you can interact with.
Let’s get started.
Overview
A web browser is a piece of software that loads files from a remote server (or perhaps a local disk) and displays them to you, allowing for user interaction.
However, within a browser, there’s a piece of software that figures out what to display to you based on the files it receives. This is called the browser engine.
The browser engine is a core software component of every major browser, and different browser manufacturers call their engines by different names. The browser engine for Firefox is called Gecko, and Chrome’s is called Blink, which happens to be a fork of WebKit.
1. The process of rendering HTML
When you write some HTML, CSS, and JS, and attempt to open the HTML file in your browser, the browser reads the raw bytes of HTML from your hard disk (or network).
Note: The browser reads the raw bytes of data, not the actual characters of code you have written.
The browser receives the bytes of data, but it can’t really do anything with it; the raw bytes of data must be converted to a form it understands. This is the first step.
From raw bytes of HTML to DOM
First, the raw bytes of data are converted into characters. This conversion is done based on the character encoding of the HTML file.
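As a rough illustration of this step, here is a minimal sketch that decodes a handful of bytes into characters with the standard TextDecoder API. The byte values are a made-up input chosen for this example, and UTF-8 is assumed as the encoding; a real browser determines the encoding from HTTP headers, a meta tag, or the bytes themselves.

```javascript
// Hypothetical input: the UTF-8 bytes for the markup "<p>Hi</p>".
const bytes = new Uint8Array([0x3c, 0x70, 0x3e, 0x48, 0x69, 0x3c, 0x2f, 0x70, 0x3e]);

// TextDecoder turns bytes into characters using the named encoding.
const characters = new TextDecoder("utf-8").decode(bytes);

console.log(characters); // <p>Hi</p>
```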
Characters are great, but they aren’t the final result. These characters are further parsed into something called tokens.
Without tokenization, the characters would remain a meaningless run of text (that is, raw HTML code), and that doesn’t produce an actual website.
When you save a file with the .html extension, you signal to the browser engine to interpret the file as an HTML document. The way the browser interprets this file is by first parsing it. In the parsing process, and particularly during tokenization, every start and end HTML tag in the file is accounted for.
The parser understands each string in angle brackets (e.g., <html>, <p>) and the set of rules that apply to each of them. For example, a token that represents an anchor tag will have different properties from one that represents a paragraph tag.
Conceptually, you may see a token as some sort of data structure that contains information about a certain HTML tag. Essentially, an HTML file is broken down into small units of parsing called tokens. This is how the browser begins to understand what you’ve written.
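To make the idea of a token more concrete, here is a toy tokenizer. It is only a sketch: the tokenize function and the token shape ({ type, name } or { type, data }) are invented for this example, and it ignores attributes, comments, and all the error handling a real HTML tokenizer performs.

```javascript
// Toy tokenizer: splits markup into start-tag, end-tag, and text tokens.
// Real HTML tokenizers also handle attributes, comments, entities, and more.
function tokenize(html) {
  const tokens = [];
  const pattern = /<\/([a-z][a-z0-9]*)>|<([a-z][a-z0-9]*)>|([^<]+)/gi;
  for (const match of html.matchAll(pattern)) {
    if (match[1]) tokens.push({ type: "endTag", name: match[1].toLowerCase() });
    else if (match[2]) tokens.push({ type: "startTag", name: match[2].toLowerCase() });
    else if (match[3].trim()) tokens.push({ type: "text", data: match[3].trim() });
  }
  return tokens;
}

const tokens = tokenize("<html><body><p>Hello</p></body></html>");
console.log(tokens.map((t) => t.type + ":" + (t.name || t.data)).join(" "));
// startTag:html startTag:body startTag:p text:Hello endTag:p endTag:body endTag:html
```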
Tokens are great, but they are also not our final result. After the tokenization is done, the tokens are then converted into nodes. You may think of nodes as distinct objects with specific properties. In fact, a better way to explain this is to see a node as a separate entity within the document object tree.
Nodes are great, but they still aren’t the final result.
Now, here’s the final bit. Upon creating these nodes, the nodes are then linked in a tree data structure known as the DOM. The DOM establishes the parent-child relationships, adjacent sibling relationships, etc. The relationship between every node is established in this DOM object.
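Here is one way the linking step could be sketched. The token list and the buildTree function are hypothetical, and the sketch assumes perfectly well-formed markup; a real browser follows a much more elaborate tree construction algorithm that also recovers from broken HTML.

```javascript
// Hypothetical token list, in the shape a tokenizer might produce.
const tokens = [
  { type: "startTag", name: "html" },
  { type: "startTag", name: "body" },
  { type: "startTag", name: "p" },
  { type: "text", data: "Hello" },
  { type: "endTag", name: "p" },
  { type: "endTag", name: "body" },
  { type: "endTag", name: "html" },
];

// Build a tree by keeping a stack of currently open elements.
function buildTree(tokens) {
  const root = { name: "#document", children: [] };
  const open = [root];
  for (const token of tokens) {
    const parent = open[open.length - 1];
    if (token.type === "startTag") {
      const node = { name: token.name, children: [] };
      parent.children.push(node); // link the child to its parent
      open.push(node);
    } else if (token.type === "endTag") {
      open.pop(); // this toy version assumes well-formed markup
    } else {
      parent.children.push({ name: "#text", data: token.data, children: [] });
    }
  }
  return root;
}

const dom = buildTree(tokens);
const p = dom.children[0].children[0].children[0];
console.log(p.name, p.children[0].data); // p Hello
```

The stack is what gives each new node its parent: whatever element is on top when a token arrives becomes the parent of the node created from it.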
We don’t open the CSS or JS file in the browser to view a webpage. We open the HTML file, most often named index.html. This is exactly why: the browser must transform the raw bytes of HTML into the DOM before anything else can happen.
Depending on how large the HTML file is, the DOM construction process may take some time. No matter how small the file is, it always takes some time.
2. The process of rendering CSS
A typical HTML file with some CSS will have the stylesheet linked as shown below:
<!DOCTYPE html>
<html lang="en">
<head>
<link rel="stylesheet" href="style.css">
<title>Document</title>
</head>
<body>
</body>
</html>
While the browser receives the raw bytes of data and kicks off the DOM construction process, it will also make a request to fetch the linked style.css stylesheet. As soon as the browser begins to parse the HTML and finds a link tag to a CSS file, it makes a request to fetch that file while parsing continues.
From raw bytes of CSS to CSSOM
A similar process occurs when the browser receives raw CSS bytes.
The raw bytes of data are converted into characters and then tokenized. Nodes are then formed, and finally a tree structure is built.
DOM stands for Document Object Model. Most people know what it is. In the same way, CSS has a tree structure called CSS Object Model (CSSOM).
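To give a feel for what turning CSS text into structured data might look like, here is a toy parser. The parseCss function and the rule shape are assumptions made up for this sketch; it only understands flat "selector { property: value; }" rules and produces a simple list, not the full object model a browser builds.

```javascript
// Toy CSS parser: turns "selector { prop: value; }" text into rule objects.
// It ignores comments, at-rules, and nested blocks that real parsers handle.
function parseCss(css) {
  const rules = [];
  for (const match of css.matchAll(/([^{}]+)\{([^}]*)\}/g)) {
    const declarations = {};
    for (const part of match[2].split(";")) {
      const index = part.indexOf(":");
      if (index === -1) continue; // skip empty or malformed declarations
      declarations[part.slice(0, index).trim()] = part.slice(index + 1).trim();
    }
    rules.push({ selector: match[1].trim(), declarations });
  }
  return rules;
}

const rules = parseCss("body { font-size: 16px; } p { color: red; display: none; }");
console.log(rules[1].selector, rules[1].declarations.color); // p red
```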
The browser cannot work with either HTML or CSS as raw bytes. They have to be converted to a form it recognizes, and that happens to be these tree structures.
CSS has something called the "cascade". The cascade is how the browser determines what styles are applied to an element. Because styles affecting an element may come from a parent element (i.e., via inheritance), or have been set on the element itself, the CSSOM tree structure becomes important.
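The inheritance part of this can be sketched with a small lookup that walks up the tree. The node objects and the resolveInherited function are hypothetical, and this covers only inherited properties such as color and font-size; the full cascade also weighs specificity, source order, and where a style comes from.

```javascript
// Hypothetical style nodes: each has a parent link and its own declarations.
const body = { name: "body", parent: null, styles: { color: "navy", "font-size": "16px" } };
const p = { name: "p", parent: body, styles: { "font-size": "20px" } };
const span = { name: "span", parent: p, styles: {} };

// Resolve an inherited property: use the element's own value if set,
// otherwise walk up the tree until an ancestor provides one.
function resolveInherited(node, property) {
  for (let current = node; current; current = current.parent) {
    if (property in current.styles) return current.styles[property];
  }
  return undefined; // a real engine would fall back to the property's initial value
}

console.log(resolveInherited(span, "color")); // navy
console.log(resolveInherited(span, "font-size")); // 20px
```

The span sets nothing itself, so its color comes from body and its font-size from the nearer p, which is why the tree shape matters.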
All well and good. The browser has the DOM and CSSOM objects. Can we have something rendered to the screen now?
The render tree
What we have right now are two independent tree structures that don’t seem to have a common goal.
The DOM and CSSOM are independent of each other: the DOM contains all the information about the relationships between the page’s HTML elements, while the CSSOM contains information on how the elements are styled.
OK, the browser now combines the DOM and CSSOM trees into something called a render tree.
The render tree contains information on all visible DOM content on the page and all the required CSSOM information for the different nodes. Note that if an element has been hidden by CSS (e.g., by using display: none), the node will not be represented in the render tree.
The hidden element will be present in the DOM but not the render tree. This is because the render tree combines information from both the DOM and the CSSOM, so it knows not to include a hidden element in the tree.
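A minimal sketch of that filtering could look like this. The dom object, with a style already attached to each node, and the buildRenderTree function are assumptions for illustration; real engines resolve styles and build render objects in far more detail.

```javascript
// Hypothetical DOM nodes with already-resolved styles attached.
const dom = {
  name: "body", style: { display: "block" }, children: [
    { name: "p", style: { display: "block" }, children: [] },
    { name: "div", style: { display: "none" }, children: [
      { name: "span", style: { display: "inline" }, children: [] },
    ] },
  ],
};

// Copy visible nodes into a render tree; a display: none node is skipped
// together with its whole subtree.
function buildRenderTree(node) {
  if (node.style.display === "none") return null;
  const children = node.children.map(buildRenderTree).filter(Boolean);
  return { name: node.name, style: node.style, children };
}

const renderTree = buildRenderTree(dom);
console.log(renderTree.children.map((c) => c.name)); // [ 'p' ]
```

The original dom object is left untouched, so the hidden div is still there; it is only missing from the render tree.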
With the render tree constructed, the browser moves on to the next step: layout!
Laying out the render tree
Well, first, the browser has to calculate the exact size and position of each object on the page. It’s like passing on the content and style information of all elements to be rendered on the page to a talented mathematician. This mathematician then figures out the exact position.
This layout step (which you’ll sometimes hear called the “reflow” step) takes into consideration the content and style received from the DOM and CSSOM and does all the necessary layout computing.
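To get an intuition for what the mathematician is doing, here is a deliberately tiny layout sketch. The boxes, their fixed heights, and the layoutBlocks function are all made up for this example; it only stacks block-level boxes vertically, which is a small slice of what real layout covers.

```javascript
// Hypothetical render-tree boxes: block-level, with a fixed height each.
const boxes = [
  { name: "header", height: 80 },
  { name: "main", height: 400 },
  { name: "footer", height: 60 },
];

// Toy block layout: every box spans the viewport width and is stacked
// below the previous one. Real layout also handles margins, inline
// content, floats, flexbox, grid, and much more.
function layoutBlocks(boxes, viewportWidth) {
  let y = 0;
  return boxes.map((box) => {
    const placed = { name: box.name, x: 0, y, width: viewportWidth, height: box.height };
    y += box.height;
    return placed;
  });
}

const layout = layoutBlocks(boxes, 1024);
console.log(layout.map((b) => b.name + "@" + b.y).join(", "));
// header@0, main@80, footer@480
```

Change the height of the header and every box below it moves, which is why a change to one element can force the browser to lay out others again.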
With the exact position of each element now computed, all that is left is to “paint” the elements to the screen. The browser paints each node, and the elements are finally rendered to the screen!
3. The process of rendering JS
You can add and remove elements from the DOM tree, and you can modify the CSSOM properties of an element, via JavaScript. This JS code is written inside a <script> tag.
Well, one of the most important things to remember is that whenever the browser encounters a script tag, the DOM construction is paused! The entire DOM construction process is halted until the script finishes executing.
This is because JavaScript can alter both the DOM and CSSOM. Because the browser isn’t sure what this particular JavaScript will do, it takes precautions by halting the entire DOM construction altogether.
A common practice is to put the script tag at the very end, just before the closing body tag, so that the UI is shown to the user first.
<!DOCTYPE html>
<html lang="en">
<head>
<title>Demo</title>
<link rel="stylesheet" href="style.css">
</head>
<body>
<p id="paragraph">How Browser Rendering Works</p>
<div><img src="https://xyz.jpg"></div>
<script>
// Look up the paragraph that was parsed above this script.
let paragraph = document.getElementById("paragraph");
console.log("paragraph is: ", paragraph);
</script>
</body>
</html>
In the scenario above, you'll get the expected output in the console: paragraph is: <p id="paragraph">How Browser Rendering Works</p>
But if you put the script tag in the head or at the start of the body, you will get paragraph is: null
Why? Pretty simple.
While the HTML parser was in the process of constructing the DOM, a script tag was found. At this time, the body tag and all its content had not been parsed. The DOM construction is halted until the script’s execution is complete.
By the time the script attempted to access a DOM node with an id of paragraph, it didn’t exist because the DOM had not finished parsing the document!
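The ordering problem can be imitated outside the browser with a toy model. Everything here is hypothetical: the parse function, the token objects, and the stand-in doc with its getElementById are simplifications meant only to show that a script sees just the nodes parsed before it.

```javascript
// Toy parser: walks a list of hypothetical tokens in document order and
// runs each script at the moment it is encountered.
function parse(tokens) {
  const parsedIds = new Set(); // stands in for the DOM built so far
  const lookups = [];
  const doc = { getElementById: (id) => (parsedIds.has(id) ? { id } : null) };
  for (const token of tokens) {
    if (token.type === "element") parsedIds.add(token.id);
    else if (token.type === "script") lookups.push(token.run(doc)); // parsing waits here
  }
  return lookups;
}

const script = { type: "script", run: (doc) => doc.getElementById("paragraph") };
const paragraph = { type: "element", id: "paragraph" };

console.log(parse([script, paragraph])); // [ null ]
console.log(parse([paragraph, script])); // [ { id: 'paragraph' } ]
```

With the script first, the lookup runs before the paragraph has been added, so it comes back as null; with the script last, the node is already there.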
So, what happens when the parser encounters a script tag but the CSSOM isn’t ready yet?
Well, the answer turns out to be simple: the JavaScript execution will be halted until the CSSOM is ready.
The async attribute
By default, every script is a parser blocker! The DOM construction will always be halted.
There’s a way to change this default behavior though.
If you add the async attribute to a script tag that loads an external file, the DOM construction will not be halted while the script downloads. DOM construction continues, and the script is executed as soon as it has finished downloading.
Here’s an example:
<!DOCTYPE html>
<html lang="en">
<head>
<title>Demo</title>
<link rel="stylesheet" href="style.css">
<script src="index.js" async></script>
</head>
<body>
<p id="paragraph">How Browser Rendering Works</p>
<div><img src="https://xyz.jpg"></div>
</body>
</html>