There is a common belief that the DOM emerged simultaneously with HTML and has always been an integral part of web development, with developers having tools for dynamic manipulation of HTML elements from the very beginning. However, this is far from the truth. In reality, nearly a decade passed between the emergence of HTML and the creation of the DOM! How did this come about?
It's undeniable that the web's development in the mid-90s progressed at an explosive rate. Just imagine — only four years passed from the creation of the first web page by Tim Berners-Lee to the launch of amazon.com. By 1996, the internet had become so widespread that the first promotional website for a movie, Space Jam, was launched.
However, web development itself was still quite primitive, with a very limited set of tools that couldn't keep up with the rapid industry growth. Consider this — the second numbered version of HTML appeared in 1995 (there wasn't officially a first version), JavaScript's first version was developed in the same year, and CSS1 was released in December 1996. Amidst all this, the DOM was still a distant prospect.
So what prompted the community to create a unified standard? In the mid-90s, the so-called First Browser War was in full swing, with two giants of the time, Netscape Navigator and Internet Explorer, battling it out. In the fight for market share, browser vendors came up with new tricks and features, exacerbating the biggest problem of the time: the lack of a unified approach to implementing standards. Yes, I'm looking at you, Internet Explorer, and your ActiveX.
As a result, each browser had its own tools for working with HTML, so even a simple script for animating snowflakes might not work in Netscape Navigator if you had only tested it in Internet Explorer, or vice versa. This could and did lead to unpredictable behaviour, bloated code, and logical errors.
In 1994, the World Wide Web Consortium (W3C) was established to standardize web technologies and make life easier for web developers. One of the key initiatives of this organization was the creation of the DOM, or Document Object Model, to standardize interactions with web documents.
The first version of the DOM specification, DOM Level 1, was published as a W3C Recommendation in 1998, marking a significant milestone in web development history. Finally, a standardized way of representing and interacting with HTML documents was introduced, allowing developers to hope their snowflakes would fall the same way in all relevant browsers. The first DOM became the foundation for modern web applications.
However, this didn't mean all web development problems were solved that year. Rather, they reached a new level. Now, besides incompatibility with competitors, most browsers became incompatible with the standard. Some tried to fix this, some ignored it, and some pretended that the most standard standards were only what they did, while other standards were not so standard. The fact that the famous jQuery emerged only in 2006 vividly indicates that the cross-browser compatibility issue not only didn't disappear but flourished eight years after the DOM standard appeared.
But that's a story for another time.
Top comments (30)
Even though Tim Berners-Lee is named as the "inventor" of HTML, he did not start with nothing. The idea of a Standard Generalized Markup Language (SGML) had been around for quite some years. There are other languages like XML that have their own parsers too. XML and HTML are very similar, although XML has no fixed set of tags.
You can think of an XML parser as a fast database. After reading an XML file, many parsers keep the content in memory and give you access to it. Does this sound familiar? At its core, each XML parser is like the DOM.
So, the DOM is, at best, one among many.
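To make the "parser keeps the content in memory" point concrete, here is a minimal sketch using Python's standard `xml.dom.minidom`, which exposes a DOM-style tree outside the browser. The sample XML and its element names are made up for illustration.

```python
from xml.dom.minidom import parseString

# Hypothetical sample document, invented for this illustration.
xml_text = "<library><book id='1'>Dune</book><book id='2'>Emma</book></library>"

# Parsing builds an in-memory tree that stays available for queries.
doc = parseString(xml_text)

# The same method names a browser DOM offers work here too.
books = doc.getElementsByTagName("book")
titles = [book.firstChild.data for book in books]
ids = [book.getAttribute("id") for book in books]

print(titles, ids)
```

Once parsed, the tree can be queried again and again without touching the original text, which is the "fast database" behaviour described above.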
But HTML has its quirks that can drive parser developers nuts. Usually, a well-formed XML document is built like a tree: each node has to be closed before its parent is closed. HTML does not seem to be so strict. You can easily do something like this:
This is not well formed, tags are overlapping, but it is displayed as expected.
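For contrast, a strict XML parser refuses overlapping tags outright. The sketch below uses Python's `xml.dom.minidom`; the markup strings are my own hypothetical examples, not the snippet from the comment above, and `is_well_formed` is a helper name invented here.

```python
from xml.dom.minidom import parseString
from xml.parsers.expat import ExpatError

# Hypothetical markup: <i> is opened inside <b> but closed outside it.
overlapping = "<p><b>bold <i>both</b> italic</i></p>"
# The same content with properly nested tags.
nested = "<p><b>bold <i>both</i></b><i> italic</i></p>"


def is_well_formed(markup: str) -> bool:
    """Return True if a strict XML parser accepts the markup."""
    try:
        parseString(markup)
        return True
    except ExpatError:
        # Raised for mismatched or overlapping tags, among other errors.
        return False


print(is_well_formed(overlapping), is_well_formed(nested))
```

An HTML parser in a browser would accept both strings and repair the first one, while the XML parser only accepts the second.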
Speaking of SGML: I mentioned this in my previous article.
"There are other languages like XML" — and there is an upcoming article about that.
The reason why HTML is effectively the DNA of the web is that however you make your webpage, it will still end up with HTML in the browser. That's the danger, to me, of moving away from HTML too much. It puts more layers between your code and what the user sees and uses.
Yes, most HTML elements don't have much functionality, but many do have semantic meanings that can be meaningful for assistive technologies such as screen readers as well as search engines. So you have to be careful about that too.
This is not correct. Frameworks like React build the DOM directly with JS.
I guess I don't understand what you mean. What's in the browser is still HTML.
No, HTML is just a text format. It tells the browser how to build the DOM. But you can also use API calls to build the DOM from JavaScript. This is what React does.
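As a rough illustration of building a document tree purely through API calls, here is a minimal sketch using Python's `xml.dom.minidom`, which implements the W3C DOM interfaces outside the browser. The element names and text are made up; in a browser the analogous calls would be `document.createElement`, `document.createTextNode`, and `appendChild`. This is not how React works internally, only the basic idea of creating nodes without parsing any markup.

```python
from xml.dom.minidom import getDOMImplementation

# Create an empty document whose root element is <html>.
impl = getDOMImplementation()
doc = impl.createDocument(None, "html", None)
root = doc.documentElement

# Build the tree node by node; no markup text is parsed anywhere.
body = doc.createElement("body")
root.appendChild(body)

paragraph = doc.createElement("p")
paragraph.setAttribute("class", "greeting")  # hypothetical class name
paragraph.appendChild(doc.createTextNode("Hello, DOM"))
body.appendChild(paragraph)

# Serializing is optional; the tree exists whether or not we print it.
markup = root.toxml()
print(markup)
```

The serialized string at the end is only a view of the tree; the nodes were created directly, which is the distinction being drawn between HTML as a text format and the DOM as the structure it describes.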
Many web developers think HTML is somehow the DNA of the web. But it was just a child of its time, created with a certain task in mind: how to display scientific documents on a remote machine. Many decisions were made with respect to the limitations of the web at the time. People used acoustic couplers and green-on-black monitors, so each byte counted.
Other presentation formats, like PostScript, had been around for more than 10 years and were much more powerful. But the web was simply not powerful enough to transfer these documents.
Funnily enough, the limitations are long gone, but we still use the format. It feels a bit like using stone tools at a time when 3D printing is already possible.
Wait ... you're still using HTML v1?
No, I'm not using HTML at all. It's pretty simple to build the DOM directly, so why should I use HTML as a meta language?
Building the DOM programmatically has many advantages. You can deliver different pages to different devices using routines that know how to do the job. No need to use any fancy CSS tricks. And DOM elements created from JavaScript can be accessed much more easily than with "getElementBy...". So, for me, using HTML is just a burden that does not pay off.
For plain text, Markdown does a good job. But why should a Markdown parser convert to HTML if it can build the DOM directly?
For me there is absolutely no reason to use any HTML at all. But I can understand that people want to stick with what they are used to. It's just that I do not want to ride a horse when I can drive a Tesla.
But you still need to know HTML to properly build DOM programmatically.
Well, this is an interesting question!
Plain HTML does not provide much of what I need to build web applications. There is only a handful of tags that I use regularly. The interesting things, like formatting tags or pages, come from CSS, and from the browser APIs, which provide direct access to all the nice features modern browsers have.
All the fancy stuff that was added over the last years, like semantic HTML and all the properties that make CSS "responsive", can be created with some simple JavaScript. Today, there are 142 HTML tags (of which I use about 10) and about 520 CSS property names (if I believe common sources). There is a property for every possible situation you might want to deal with. But every day I find a situation that cannot be handled by this massive system. What if you want to build a web app that has a different appearance on Fridays? Is there a CSS property to handle this?
I favour an "algorithm over configuration" approach, which gives me all the freedom without learning new properties every day. So, yes, I need some HTML, but maybe not much more than what Tim Berners-Lee invented...
So the MVC idea is not for you?
The idea of separating the jobs of presentation vs. behavior is fairly fundamental in my development projects. I would not (personally) be interested in a web site that was algorithmically generated, as it would make things too complex for me to hack myself. Typically, I take the route of using HTML to lay out things that users might want to change (templates for tables, sidebars, etc.) and then add programming to handle interactions.
But, one man's garbage is another man's treasure, as they say. :)
I had no idea you could generate a page programmatically with just the DOM. I assumed you needed some HTML for it to work with. Interesting!
See the DML-project or vanjs for example
HTML is very forgiving, that's true, but the browser transforms malformed HTML before it reaches the DOM.
So your example is actually rendered like this.
I like that HTML is forgiving but I'd like to think most people don't write it so crazily 😅
As far as I know, this transformation occurs when the browser actually builds the DOM tree. It just makes assumptions: if the browser meets a new opening <p>, it simply assumes that this is a new paragraph. And so on.
It would be interesting to know if there are differences in how browsers deal with malformed HTML...
Thank you for the clarification.
Is there any documentation on this kind of "transformation"? I suppose it is not that easy to find a rule that works under any condition?
DOM is not DOOM 🙂
Rrright!
Waiting for the next chapter of this exciting story 😀
I feel smarter after reading it 😅
That was the plan!
This is a very interesting article. Thanks!
Thank you!
DOM is not HTML! Nothing is HTML at all. Only HTML is HTML😁
DOM is not HTML)
Or is it? Let's try to figure out in my next article ;)
Super, I didn't know that DOM is not HTML
Aaaand I can't find where this article stated that the DOM is or is not HTML