This one's going to be a bit more personal and a bit less technical than my usual fare. It's based on a response I gave to a comment left on a previous article of mine, and I think it's worth expanding into its own whole thing.
I wrote an article a few weeks ago all about semantic HTML, encouraging web developers to choose a semantic element over a <div> whenever there's an appropriate one. The article absolutely exploded (at least, way more than anything I've ever written before), which was just so cool.
But after having a few conversations in comments and on Twitter, I realized that a lot of readers seemed to be very excited about and focused on one aspect of semantic tags: how much cleaner semantic code can be than traditional <div> slinging. And to be sure, this is an important aspect, but in my opinion it's not the most important thing.
In particular, one of the conversations I had in the discussion section (the one that prompted this article) was about whether it's a good idea to go beyond the provided semantic tags and use custom non-standard tags to mark up the document with even more readable HTML. In my opinion, this isn't a good idea, and I explained why over there, but this conversation really highlighted for me that if we're not careful, we can focus too much on how markup looks and forget why standards are defined in the first place.
After lots of reflection and discussion, I've found that there are basically three main reasons to write semantic HTML, three things that make it something that everyone who writes HTML for the web should understand and use:
1. Code ergonomics, a.k.a. it's better for developers - Semantic elements are much easier to read, both as HTML tags and as CSS selectors. Their purpose and meaning are clearly defined, which makes the structure of your document much more apparent. They also often have certain default styles and behaviors that make our lives as developers easier because we don't have to spend time adding them ourselves.
2. SEO, a.k.a. it's better for business - Search engines like Google pay attention to semantic HTML on your page, especially if you move past just semantic elements and use microdata attributes, also defined as part of HTML5, to call out the data on your page that search engines care about. Google has been paying attention to this stuff since at least 2010, and they use microdata to create those cards that come up when you search for a business.
3. Accessibility, a.k.a. it's better for users - Many semantic elements play a very important role in helping users navigate and use your site more easily. This applies double to users who interact with your site in a way other than the mouse + keyboard + monitor + colors + speakers method that is far too often the only perspective considered. Assistive technologies that help users often rely heavily on certain semantic elements to understand web documents, including things like headings (<h1> through <h6>) and region elements like the <nav> element, which calls out navigation within the document and/or the rest of the site.
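To make those three reasons a little more concrete, here's a minimal sketch of a page for a made-up bakery. The business name, phone number, address, and link targets are all invented for illustration. The itemscope, itemtype, and itemprop attributes are standard microdata, and LocalBusiness and PostalAddress are types from the schema.org vocabulary. Whether any particular search engine builds a card from markup like this is up to that search engine.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Example Bakery</title>
</head>
<body>
  <header>
    <!-- Headings give assistive tech an outline to navigate by -->
    <h1>Example Bakery</h1>
    <!-- <nav> marks this group of links as site navigation -->
    <nav>
      <ul>
        <li><a href="/menu">Menu</a></li>
        <li><a href="/contact">Contact</a></li>
      </ul>
    </nav>
  </header>
  <main>
    <h2>Visit us</h2>
    <!-- Microdata: labels the business details in a machine-readable way -->
    <address itemscope itemtype="https://schema.org/LocalBusiness">
      <p itemprop="name">Example Bakery</p>
      <p itemprop="telephone">555-0100</p>
      <p itemprop="address" itemscope itemtype="https://schema.org/PostalAddress">
        <span itemprop="streetAddress">1 Example Street</span>,
        <span itemprop="addressLocality">Springfield</span>
      </p>
    </address>
  </main>
</body>
</html>
```

Nothing here needs a class name to explain what it is: the element names carry the structure for developers, the microdata carries the business details for search engines, and the headings and <nav> carry the outline for assistive technologies.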
Based on the discussion section on my article and some of the conversations I had surrounding it, I think I made a mistake: I focused a bit too much in my article on reason 1, how much nicer the code is for developers to write, read, and maintain. Don't get me wrong, semantic HTML code definitely is way nicer and more maintainable in almost every case, and that's great for developers. But in my opinion, reason 1 is actually the least important of the three. And the most important, as you've undoubtedly guessed from the title of this article, is reason 3, accessibility.
So here's today's thesis statement:
To me, the biggest reason is that we need the computer to understand what we're doing. When we talk about the Semantic Web™, we don't just mean that the tags convey semantics to us humans; they hold specific meanings for browsers, which can in turn communicate their meanings out to assistive technologies. But the browser can't do this with non-standard tags, because those tags don't have any defined meanings or behaviors for the browser to communicate.
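Here's a small sketch of what that difference looks like in markup. The tag names <sidebar> and <sidebar-title> are made up for this example and aren't part of any standard.

```html
<!-- Non-standard tags: readable to us, but the browser has no
     defined meaning or behavior to pass along for either of them -->
<sidebar>
  <sidebar-title>Related posts</sidebar-title>
  <a href="/posts/semantic-html">Semantic HTML</a>
</sidebar>

<!-- Standard elements: <aside> is defined as tangentially related
     content, and <h2> is defined as a second-level heading, so the
     browser can communicate both to assistive technologies -->
<aside>
  <h2>Related posts</h2>
  <a href="/posts/semantic-html">Semantic HTML</a>
</aside>
```

Both versions can be styled to look identical. Only the second one tells the browser anything about what the content is.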
Custom elements, properly defined with the
To reiterate, the Semantic Web isn't just a matter of preference or style or convenience. It has a huge direct impact on the lives of many, many users, those who rely on assistive technology. And assistive technologies in turn rely on the semantics they can parse from the HTML to help those users.
If you've never tried to use the web with a screen reader before, please do. I think every web developer needs to do this periodically in order to better understand how many of their users interact with the web, and how honestly horrible a lot of the web can be for users who rely on assistive tech. On a site built without any semantics, basically all <div>s and <span>s and even non-standard elements, the best a screen reader can do is read the text top to bottom, with no way to let the user easily navigate the page. But mark it up with semantic elements and the screen reader can give the user an outline of the page, highlight the important landmarks, and give them handy jump points to move around the page a hundred times more easily.
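As a rough illustration, here are two versions of the same made-up blog page. The class names, text, and link target are invented for the example, and exactly what gets announced varies between screen readers and browsers, so treat the comments as a general description rather than a transcript.

```html
<!-- Version 1: no semantics. Everything is a generic container, so
     there are no headings or landmarks to jump between. The "link" is
     a <span> with a click handler, so it isn't exposed as a link and
     can't be reached with the keyboard by default. -->
<div class="header">
  <div class="title">My Blog</div>
  <div class="nav">
    <span class="link" onclick="location.href='/about'">About</span>
  </div>
</div>
<div class="content">
  <div class="post-title">Hello, world</div>
  <div class="post-body">First post.</div>
</div>

<!-- Version 2: the same content with semantic elements. Assuming this
     sits directly inside <body>, <header>, <nav>, and <main> are
     exposed as landmarks, <h1> and <h2> form a heading outline, and
     <a href> is a real, focusable link. -->
<header>
  <h1>My Blog</h1>
  <nav>
    <a href="/about">About</a>
  </nav>
</header>
<main>
  <article>
    <h2>Hello, world</h2>
    <p>First post.</p>
  </article>
</main>
```

With the second version, a screen reader user can typically list the headings, skip straight to the main content, or jump to the navigation, none of which is possible with the first.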
In my experience, there's a tragic lack of attention paid to semantics in web development training, and this knowledge gap actively hurts the users that need it. That's a big part of why I write articles about semantic web technologies and techniques. I want to help spread this information as broadly as possible and make sure every web dev possible finds out about it. Semantic HTML, microdata, ARIA, all these Semantic Web tools are crucial for building a web that's made for everyone. Accessibility should become as foundational to web development training as the difference between a <div> and a <span>.
This stuff directly impacts the lives of many people, much more directly than we often think or talk about. And that's the big point: because this stuff isn't talked about enough, because it isn't taught in HTML 101 as a foundational part of the platform, we forget about it, and we forget about the users that need it. And not only are they forgotten, but when they are remembered, far too often they are intentionally ignored. I've heard and read too many stories from developers who know and care about accessibility: they join a team, ask about the accessibility shortcomings of the application, and recommend improvements, only to be shut down because it would take too much development time and effort. I've experienced this myself.
And in all honesty, they aren't wrong to say that there's a cost to training an established team on accessibility principles and techniques, and to updating a large existing codebase to be more accessible. But that's just another huge reason that accessibility should be taught from the start! There's no cost to building an accessible application from the start, but there is an initial cost to retrofitting it later. But the fact that it costs something should by no means stop anyone from doing it. If you need business motivations: (1) you'll have access to more users, which means more potential customers, (2) your code will be more maintainable, and (3) you'll have better SEO. But also, it's just the right thing to do, and I don't think we should always frame it in profit terms.
Ultimately, what I'm trying to do by writing about web semantics and accessibility is to play some small part in improving the experiences of one of the most underserved, forgotten, and ignored groups of users on the web.
- Accessibility in HTML5 - an incredible introduction to semantic HTML, the ARIA standard, and general web accessibility concepts and best practices; strong recommendation for anyone writing HTML, it's a great refresher even for experienced devs
- How Expensive is Web Accessibility? - a really interesting and practical discussion of how to add accessibility to an existing codebase in an effective (and cost-effective) way
- html5accessibility.com - implementation status of accessibility features for each element included in HTML5
- Accessibility APIs: A Key To Web Accessibility - a Smashing Mag article discussing the basics of how the browser exposes semantics to assistive tech, and how we as developers can take this further with browser APIs
- ARIA Landmark Roles - Landmarks are the most basic, foundational, and easy-to-understand piece of how the browser interprets HTML semantics. Semantic elements introduced in HTML5 often have these roles by default (see "Accessibility in HTML5" link above), so we don't usually need to manually assign roles if we use them
- ARIA Landmarks Example - A set of examples from the W3C demonstrating how to effectively use ARIA Landmarks