DEV Community

Mythbusting DOM: Was DOM Invented Alongside HTML?

Serhii Babich on July 03, 2024

There is a common belief that the DOM emerged simultaneously with HTML and has always been an integral part of web development, with developers hav...

Eckehard • Jul 3 '24

Even though Tim Berners-Lee is named as the "inventor" of HTML, he did not start from nothing. The idea of a Standard Generalized Markup Language (SGML) had been around for quite some years. There are other languages like XML that have their own parsers too. XML and HTML are very similar, although XML has no fixed set of tag names.

You can think of an XML parser as a fast database. After reading an XML file, many parsers keep the content in memory, giving you access to it. Does this sound familiar? At its core, each such XML parser is like the DOM.

So, the DOM is, at best, one among many.

But HTML has its quirks that can drive parser developers nuts. Usually, a well-formed XML document is built like a tree: each element has to be closed before its parent is closed. HTML does not seem to be so strict. You can easily do something like this:

<p>this is <i><b>paragraph 1</p>
<p>this is paragraph 2</p>
<p>this is </i>para</b>graph 3</p>

This is not well-formed: the tags overlap, but it is still displayed as expected.

Serhii Babich • Jul 4 '24

Speaking of SGML — I mentioned this in my previous article;

"There are other languages like XML" — and there is an upcoming article about that.

Talia • Jul 9 '24

The reason why HTML is effectively the DNA of the web is that however you make your webpage, it will still end up with HTML in the browser. That's the danger, to me, of moving away from HTML too much. It puts more layers between your code and what the user sees and uses.

Yes, most HTML elements don't have much functionality, but many do have semantic meanings that can be meaningful for assistive technologies such as screen readers as well as search engines. So you have to be careful about that too.

Eckehard • Jul 12 '24

This is not correct. Frameworks like React build the DOM directly with JS.

Talia • Jul 16 '24

I guess I don't understand what you mean. What's in the browser is still HTML.

Eckehard • Jul 22 '24

No, HTML is just a text format. It tells the browser how to build the DOM. But you can also use API calls to build the DOM from JavaScript. This is what React does.
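A minimal sketch of what "building the DOM with API calls" can look like. It only uses the standard `createElement`, `textContent`, and `append` DOM methods. The helper name `buildGreeting` and its `doc` parameter are assumptions made for illustration; this is not how React itself is implemented.

```javascript
// Hypothetical helper: builds the equivalent of <p>Hello, <b>name</b>!</p>
// without ever writing or parsing HTML text.
// `doc` is expected to be a Document; in a browser, pass `document`.
function buildGreeting(doc, name) {
  const paragraph = doc.createElement("p");
  const bold = doc.createElement("b");
  bold.textContent = name; // plain text, not parsed as markup
  paragraph.append("Hello, ", bold, "!"); // strings become text nodes
  return paragraph;
}

// Browser usage, guarded so the snippet also loads outside a browser.
if (typeof document !== "undefined" && document.body) {
  document.body.append(buildGreeting(document, "world"));
}
```

The function returns a reference to the new element, so the calling code can keep it in a variable instead of looking it up again later.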

Eckehard • Jul 4 '24

Many web developers think HTML is somehow the DNA of the web. But it was just a child of its time, created with a certain task in mind: how to display scientific documents on a remote machine. Many decisions were made with respect to the limitations of the web at that time. People used acoustic couplers and green-on-black monitors, so each byte counted.

Other presentation formats, like PostScript, had been around for more than 10 years and were much more powerful. But the web was simply not powerful enough to transfer these documents.

Funnily enough, the limitations are long gone, but we still use the format. It feels a bit like using stone tools at a time when 3D printing is already possible.

Charles F. Munat • Jul 4 '24

Wait ... you're still using HTML v1?

Eckehard • Jul 4 '24

No, I'm not using HTML at all. It's pretty simple to build the DOM directly, so why should I use HTML as a meta language?

Building the DOM programmatically has many advantages. You can deliver different pages on different devices using routines that know how to do the job. No need to use any fancy CSS tricks. And DOM elements created from JavaScript can be accessed much more easily than by using "getElementBy...". So, for me, using HTML is just a burden that does not pay off.

For plain text, Markdown does a good job. But why should a Markdown parser convert to HTML if it can build the DOM directly?

For me, there is absolutely no reason to use any HTML at all. But I can understand that people want to stick with what they are used to. It's just that I do not want to ride a horse when I can drive a Tesla.

Serhii Babich • Jul 5 '24

But you still need to know HTML to properly build the DOM programmatically.

Eckehard • Jul 5 '24 • Edited

Well, this is an interesting question!

Plain HTML does not provide much of what I need to build web applications. There is only a handful of tags that I use regularly. The interesting things, like formatting tags or pages, come from CSS and from the browser APIs, which provide direct access to all the nice features modern browsers have.

All the fancy stuff that was added over the last few years, like semantic HTML and all the properties that make CSS "responsive", can be created with some simple JavaScript. Today, there are 142 HTML tags (of which I use about 10) and about 520 CSS property names (if I believe common sources). There is a property for every possible situation you might want to deal with. But every day I find a situation that cannot be handled by this massive system. What if you want to build a web app that has a different appearance on Fridays? Is there a CSS property to handle this?

I favour an "algorithm over configuration" approach, which gives me all the freedom without learning new properties every day. So, yes, I need some HTML, but maybe not much more than what Tim Berners-Lee invented...
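The Friday question above can be sketched in a few lines of JavaScript. The function name `themeFor`, the theme names, and the `data-theme` attribute are assumptions chosen for illustration; only `Date.prototype.getDay()` and `dataset` are standard APIs.

```javascript
// Hypothetical helper: pick a theme name from the day of the week.
// Date.prototype.getDay() returns 0 (Sunday) through 6 (Saturday),
// so Friday is 5. The day is taken in the user's local time zone.
function themeFor(date) {
  return date.getDay() === 5 ? "friday" : "default";
}

// Browser usage, guarded so the snippet also loads outside a browser.
// The stylesheet could then target [data-theme="friday"] (an assumed convention).
if (typeof document !== "undefined" && document.documentElement) {
  document.documentElement.dataset.theme = themeFor(new Date());
}
```

The decision lives in code, while the actual look can still be described in a stylesheet keyed on the attribute, so both approaches can be combined.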

Frank Edwards • Jul 9 '24

So the MVC idea is not for you?

The idea of separating the jobs of presentation vs. behavior is fairly fundamental in my development projects. I would not (personally) be interested in a web site that was algorithmically generated, as it would make things too complex for me to hack myself. Typically, I take the route of using HTML to lay out things that users might want to change (templates for tables, sidebars, etc.) and then add programming to handle interactions.

But, one man's garbage is another man's treasure, as they say. :)

Henry Wertz • Jul 10 '24

I had no idea you could generate a page programmatically with just the DOM. I assumed you needed some HTML for it to work with. Interesting!

Eckehard • Jul 12 '24

See the DML project or VanJS, for example.

Andrew Bone • Jul 4 '24

HTML is very forgiving, that's true, but the browser does transform malformed HTML before it reaches the DOM.

So your example is actually rendered like this:

<p>this is <i><b>paragraph 1</b></i></p>
<i><b><p>this is paragraph 2</p></b></i>
<b></b>
<p><b><i>this is </i>para</b>graph 3</p>

I like that HTML is forgiving but I'd like to think most people don't write it so crazily 😅

Serhii Babich • Jul 4 '24

As far as I know, this transformation occurs when the browser actually builds the DOM tree. It just makes assumptions: for example, if the browser meets a new opening <p>, it simply assumes that a new paragraph starts there. And so on.