Camilo Micheletto

HTML Semantics Under the Hood

Can you explain what semantics means in HTML?

I feel like my entire life people have summed it up as "Using HTML tags correctly", "helping with accessibility and SEO", but none of this fully answers the question, does it?

I asked on Twitter how people would explain what semantics is. Except for a few people with strong experience in accessibility, the answers were mostly similar to what I have heard throughout my career.

Tweet saying: I understand semantic HTML as using tags that bring a meaning beyond the generic div, for example. For whom? Apparently for robots. Because for humans I find it of little use.


Semantics and meaning

What is meaning for us? What is meaning for robots?

What we understand as meaning in semantics is built from three pieces of information, which I will simplify as Name, Role, and State.

 

Name

This is the data that names the element. Several sources can contribute a name, such as the element's content, an associated label, the title attribute, and ARIA attributes like aria-label. A predefined order of priority decides which of them becomes the name.

Screenshot of the Accessibility Object Model of a button in Chrome. Its name is "Cancel Order", and this name comes from the button's content. If the button had a label, aria-label, or aria-labelledby, each would, in that order, take higher priority in naming it.
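
The priority order described above can be sketched in a few lines of TypeScript. This is a simplified model, not the real algorithm: the actual accessible name computation has more steps and edge cases. The names ButtonNameSources, PRECEDENCE, and computeButtonName are my own, made up for illustration.

```typescript
// Simplified model of name precedence for a <button>.
// Assumption: only the sources mentioned in this article are ranked;
// the real accessible name computation handles many more cases.
interface ButtonNameSources {
  ariaLabelledbyText?: string; // text of the elements referenced by aria-labelledby
  ariaLabel?: string;          // value of aria-label
  labelText?: string;          // text of an associated <label>
  content?: string;            // the button's own text content
  title?: string;              // value of the title attribute
}

type NameSource = keyof ButtonNameSources;

// Highest priority first.
const PRECEDENCE: NameSource[] = [
  "ariaLabelledbyText",
  "ariaLabel",
  "labelText",
  "content",
  "title",
];

function computeButtonName(
  sources: ButtonNameSources
): { name: string; from: NameSource | null } {
  for (const key of PRECEDENCE) {
    const raw = sources[key];
    const value = raw === undefined ? "" : raw.trim();
    // The first non-empty source wins; lower-priority ones are ignored.
    if (value !== "") {
      return { name: value, from: key };
    }
  }
  // No source produced a name: the element is nameless.
  return { name: "", from: null };
}
```

For the button in the screenshot, computeButtonName({ content: "Cancel Order" }) takes the name from the content. Adding an ariaLabel to the same object would make the aria-label win instead.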

👤 For humans
Name and description are part of the content verbalized by screen readers. In the case of the button, it would say:

"<role> Button. <name> Cancel Order"

Button written "Cancel Order" in ClickBus website

 

🤖 For machines
Without a name, the element is treated as having no palpable content, so its reading can be skipped.

 

Role

The role describes the expected behavior of the element for the user, crawlers, and assistive technologies. Roles not only provide context, but also enable APIs that let all types of users interact with the defined functionality in an equivalent way.

The role="alert" attribute does more than add "context". It:

  • Creates a live region, which works like a mutation observer on the element's children and notifies the user agent when they change
  • When there is an error (e.g. form error), the message is dynamically injected into the element, emitting an event
  • The event is captured by assistive technologies and announced to the user

All of this without JavaScript! (only the part of dynamically injecting the message uses JS). This role enables a native API that ensures that the user has access to the error in a way beyond purely visual (red letters with this symbol ❌).
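
The only part that needs code is the injection of the message. Here is a minimal sketch of it in TypeScript, assuming the page already contains an element such as <p role="alert"></p>. The names TextRegion, reportError, and clearError are hypothetical. The announcement itself is not in this code: it is done by the browser and the assistive technology.

```typescript
// Minimal shape shared by a real HTMLElement and a plain test object.
// Assumption: the region passed in is the element carrying role="alert".
interface TextRegion {
  textContent: string | null;
}

// Injects the error message into the live region. Changing the region's
// content is what the browser reports to assistive technologies.
function reportError(region: TextRegion, message: string): void {
  region.textContent = message;
}

// Empties the region once the error has been fixed.
function clearError(region: TextRegion): void {
  region.textContent = "";
}

// In a browser this could be called as (illustrative only):
//   const region = document.querySelector('[role="alert"]');
//   if (region) reportError(region, "Please enter a valid e-mail address.");
```

Typing the parameter as a small interface instead of HTMLElement is a design choice for the sketch: a real element satisfies it, and so does a plain object in a test.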

👤 For humans
Conveys the semantic description of the element: what it is and what it is for.

🤖 For machines
Enables part of the API required to provide the expected experience.

 

State

State refers to the element's DOM API: its methods, getters, setters, and default values.

The disabled attribute not only changes the appearance of the element, but also changes several pieces of information in the element's object:

  • Interactive element becomes unnavigable
  • Non clickable
  • Does not emit events
  • Can be skipped by assistive technologies, since it no longer receives focus

And this state propagates recursively to the element's descendants, as when a disabled <fieldset> disables the form controls inside it.
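
To make the propagation concrete, here is a simplified TypeScript model of a control that inherits the disabled state from a <fieldset> ancestor. ControlNode and isEffectivelyDisabled are hypothetical names, and the model ignores the exceptions the HTML specification defines (for example, controls inside the fieldset's first <legend>).

```typescript
// A tiny stand-in for a DOM node: just a tag, a disabled flag, and a parent.
// Assumption: this is a model for illustration, not the DOM itself.
interface ControlNode {
  tag: string;
  disabled?: boolean;
  parent?: ControlNode;
}

// A control counts as disabled if it carries the attribute itself
// or if any <fieldset> above it does.
function isEffectivelyDisabled(node: ControlNode): boolean {
  if (node.disabled === true) {
    return true;
  }
  for (let ancestor = node.parent; ancestor !== undefined; ancestor = ancestor.parent) {
    if (ancestor.tag === "fieldset" && ancestor.disabled === true) {
      return true;
    }
  }
  return false;
}
```

In a real page there is no need to walk the tree by hand: the :disabled pseudo-class already reflects the inherited state.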

In the following button example, the same element gets different attributes, methods, and behaviors just by changing its context:

A button and a button inside a form. Both have the same class in the DOM, but inside a form the button acts as a submit button and gains form submission behavior.

A <button> with no type attribute defaults to type="submit". On its own that default has nothing to act on. Once the button is in the context of a form, activating it triggers form submission, including the creation of the FormData.
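
The implicit type can be modeled as a small function. This is a sketch under the assumption that only the type attribute and the presence of a form matter. The names resolveButtonType and willSubmitForm are mine, not Blink's.

```typescript
type ButtonType = "submit" | "reset" | "button";

// Resolves the effective type of a <button>.
// A missing or unknown value falls back to "submit".
function resolveButtonType(typeAttr?: string | null): ButtonType {
  const value = typeAttr === undefined || typeAttr === null ? "" : typeAttr.toLowerCase();
  if (value === "reset" || value === "button") {
    return value;
  }
  return "submit";
}

// Hypothetical description of a button and its context.
interface ButtonContext {
  typeAttr?: string | null;
  hasFormOwner: boolean; // true when the button is associated with a form
}

// The submit behavior only has an effect when there is a form to submit.
function willSubmitForm(button: ButtonContext): boolean {
  return resolveButtonType(button.typeAttr) === "submit" && button.hasFormOwner;
}
```

With this model, willSubmitForm({ hasFormOwner: false }) is false and willSubmitForm({ hasFormOwner: true }) is true, even though neither button declares a type.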

Code extracted from Chrome's rendering engine, Blink (link to the source code).

👤 For humans
Provides the expected behavior for various types of user interaction.

🤖 For machines
Provides various forms of interaction for different user agents and assistive technologies.

What about SEO and content?
Does semantics help with that too?

Tweet saying: Semantics means helping the machine interpret the content, directing context, information priority, etc. Semantic HTML would be if a robot could read your web page knowing where the content is, what the content is about, what is not important... etc.


Semantics and content

If we think of semantics as attaching meaning to something, then the order, priority, and relationship of the content dramatically change its meaning.

Creating this hierarchy is the role of another API: the outline.

Outlines divide content into sections, like the table of contents of a book or a college paper.

The numbering that defines the hierarchy of headings (h1 to h6) works like the numerals in a table of contents that mark headers, titles, and subtitles.

A summary of a paper in MLA standard with headers, titles, and subtitles.
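
The comparison with a numbered table of contents can be sketched in TypeScript. The code below turns a flat list of headings into a nested outline and then numbers it. It looks only at heading levels (h1 to h6), which is a simplification. Heading, OutlineNode, buildOutline, and numberOutline are hypothetical names.

```typescript
interface Heading {
  level: 1 | 2 | 3 | 4 | 5 | 6;
  text: string;
}

interface OutlineNode {
  level: number;
  text: string;
  children: OutlineNode[];
}

// Nests each heading under the closest previous heading of a lower level.
function buildOutline(headings: Heading[]): OutlineNode[] {
  const roots: OutlineNode[] = [];
  const stack: OutlineNode[] = []; // chain of currently open sections
  for (const heading of headings) {
    const node: OutlineNode = { level: heading.level, text: heading.text, children: [] };
    // Close every section that is at the same depth or deeper.
    while (stack.length > 0 && stack[stack.length - 1].level >= heading.level) {
      stack.pop();
    }
    if (stack.length > 0) {
      stack[stack.length - 1].children.push(node);
    } else {
      roots.push(node);
    }
    stack.push(node);
  }
  return roots;
}

// Produces entries such as "1.2 Title", like a table of contents.
function numberOutline(nodes: OutlineNode[], prefix: string = ""): string[] {
  const lines: string[] = [];
  nodes.forEach((node, index) => {
    const label = `${prefix}${index + 1}`;
    lines.push(`${label} ${node.text}`);
    for (const line of numberOutline(node.children, `${label}.`)) {
      lines.push(line);
    }
  });
  return lines;
}
```

For the headings h1 "Semantics", h2 "Name", h3 "For humans", h2 "Role", the numbered outline is "1 Semantics", "1.1 Name", "1.1.1 For humans", "1.2 Role".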

 

And it's not just headings that have this role, <section>, <aside>, <article>, and <nav> are sectioning content elements and create a type of outline. These elements can have <header> and <footer> elements whose content will be associated with their section.

Do you understand now how this is so relevant to accessibility? Accessibility is not a favor, nor something detached from HTML, CSS, and JS, but rather using the APIs that these technologies offer to provide an equivalent experience to all users.

In the words of Sandyara Peres, an accessibility expert, semantics are:

"In my classes/lectures, I say that: it's the identification of the purpose of elements, influencing their behavior, providing a better experience in terms of: Accessibility; Maintainability & Compatibility."

Semantics are not just "using the right tags", as tags alone do not cover all types of components and use cases that the web can offer.

Semantics are a collection of states, attributes, and methods that enable various ways to access and understand content.


To wrap up

No one has ever taught me semantics in this way; it always seemed merely moralistic, like "writing HTML correctly", "using the right tags".

Perhaps now we can see it as:

"Enabling the appropriate tools for content interpretation."

HTML as a markup language should not be seen solely as a vehicle for implementing design (CSS) and functionality (JS).

Design and functionality are just stars that orbit around the content. Valuing the content is valuing the users.
