Maggie

Why is < meta charset="utf-8" > important?

I'm currently participating in the #100DaysOfCode challenge and documenting my journey on Twitter. So far, I've been reviewing the holy trifecta of web development: HTML, CSS, and JavaScript. On Day 4, I shared that one of the things I reviewed was the importance of including <meta charset="utf-8"> in an HTML file.

I got a response asking to explain why. As I was typing my answer, I found that I had a lot to say to fit into one tweet, and it would be easier to write up a blog post.

What is <meta charset="utf-8">?

Let's break down the line <meta charset="utf-8"> to derive its meaning:

  • <meta> is an HTML tag that holds metadata about a web page: information for browsers, search engines, and other tools that isn't displayed on the page itself.
  • charset is an HTML attribute that declares the character encoding of the document, so the browser knows how to decode it.
  • utf-8 is a specific character encoding.

In other words, <meta charset="utf-8"> tells the browser to use the UTF-8 character encoding when decoding the raw bytes of the page into human-readable text (and when encoding text back into bytes, for example on form submission).

Why 'utf-8'?

Today, more than 90% of all websites use UTF-8. Before UTF-8 became the standard, ASCII (and various regional extensions of it) was widely used. Unfortunately, ASCII only defines 128 characters: the unaccented English letters, digits, basic punctuation, and control codes. If your text was in a language whose alphabet goes beyond those characters, it couldn't be represented in ASCII and wouldn't be displayed properly on your screen.
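The ASCII limitation is easy to see for yourself. The following is a small Python sketch (Python is just a convenient tool here) that tries to encode an Arabic word as ASCII and then as UTF-8.

```python
# ASCII defines only 128 characters (code points 0-127).
print("Hello".encode("ascii"))  # b'Hello'

arabic = "مرحبا"  # the first word of the Arabic greeting used in this post

try:
    arabic.encode("ascii")
except UnicodeEncodeError as err:
    # Python refuses, because these letters have no ASCII code.
    print("ASCII cannot represent this text:", err.reason)

# UTF-8 handles the same word, using two bytes per Arabic letter.
print(len(arabic), "characters ->", len(arabic.encode("utf-8")), "bytes")
# 5 characters -> 10 bytes
```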

For example, say I wanted to display some Arabic text that says "Hello World!" on a screen using the following snippet of code with the charset set equal to ascii:

<!DOCTYPE html>
<html>
<head>
  <meta charset="ascii"> <!-- char encoding is set equal to ASCII -->
</head>
<body>
  <h1>!مرحبا بالعالم</h1>
</body>
</html>

Now, if you go to your browser, you'll see that the text is displayed as gibberish 🥴:
"Hello World!" in Arabic using ASCII charset
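A likely explanation for the gibberish, sketched in Python: the editor saves the file as UTF-8 (an assumption about the setup), while browsers following the WHATWG Encoding Standard treat the label "ascii" as windows-1252. Each two-byte Arabic letter then gets decoded as two unrelated characters. Python's cp1252 codec stands in for the browser's decoder here, so the exact output may differ slightly from what a real browser shows.

```python
heading = "!مرحبا بالعالم"

# Assumption: the editor saved the file as UTF-8.
raw_bytes = heading.encode("utf-8")

# Python's "cp1252" codec stands in for the browser's windows-1252 decoder;
# errors="replace" substitutes any byte that cp1252 leaves undefined.
wrong = raw_bytes.decode("cp1252", errors="replace")
right = raw_bytes.decode("utf-8")

print(wrong)  # mojibake: two unrelated characters per Arabic letter
print(right)  # the original heading
```

Nothing is wrong with the bytes themselves. The same bytes produce readable text or gibberish depending only on which encoding the reader assumes.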

However, if we change the charset to utf-8, the code is as follows:

<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8"> <!-- char encoding is set equal to UTF-8 -->
</head>
<body>
  <h1>!مرحبا بالعالم</h1>
</body>
</html>

The text is now displayed properly 🥳:

"Hello World!" in Arabic using UTF-8 charset

Thus, UTF-8 was created to address ASCII's shortcomings and can encode every character in Unicode, which covers almost every written language in the world. Because of this and its backward compatibility with ASCII, almost all browsers support UTF-8.
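The backward compatibility with ASCII can be checked directly. Here is a short Python sketch (the sample characters are my own picks for illustration) showing that plain English text produces the same bytes in both encodings, while other characters simply use more bytes in UTF-8.

```python
# For pure ASCII text, the ASCII and UTF-8 byte sequences are identical.
english = "Hello World!"
assert english.encode("ascii") == english.encode("utf-8")

# Characters outside ASCII take 2 to 4 bytes each in UTF-8.
samples = ["A", "م", "한", "🥳"]
sizes = {ch: len(ch.encode("utf-8")) for ch in samples}

for ch, size in sizes.items():
    print(f"U+{ord(ch):04X} -> {size} byte(s)")
```

This is why an old ASCII-only page is also a valid UTF-8 page: declaring utf-8 on it changes nothing about how it is read.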

What if I forget to include <meta charset="utf-8"> in my HTML file?

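Don't worry too much, but don't count on a guaranteed rescue either. 🦸

The HTML standard requires HTML5 documents to be encoded as UTF-8. However, <!DOCTYPE html> at the top of your file only declares that it's an HTML5 document; it doesn't tell the browser which encoding the bytes are in. If no encoding is declared anywhere (not in a <meta> tag, not in the HTTP Content-Type header, and not with a byte order mark), the browser has to guess, and its fallback depends on the browser and the user's locale (often windows-1252 rather than UTF-8).

In practice, many pages still display correctly, because the server sends a charset in the Content-Type header or the browser's guess happens to be right. But because that's not guaranteed, it's better to just include a character encoding specification using the <meta> tag in your HTML file.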
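If you want to check whether a file actually carries the declaration, something like the following Python sketch could work. The helper name declared_charset and the regular expression are hypothetical and much simpler than what browsers really do; the 1024-byte limit reflects the HTML standard's requirement that the declaration appear within the first 1024 bytes of the document.

```python
import re
from typing import Optional

# Simplified pattern: matches <meta charset="..."> with or without quotes.
# Real browsers handle more cases (for example http-equiv="Content-Type").
META_CHARSET = re.compile(
    rb'<meta\s+charset\s*=\s*["\']?([A-Za-z0-9_\-]+)', re.IGNORECASE
)

def declared_charset(html_bytes: bytes) -> Optional[str]:
    """Return the charset named in a <meta charset> tag, or None if absent.

    Hypothetical helper: only the first 1024 bytes are searched.
    """
    match = META_CHARSET.search(html_bytes[:1024])
    return match.group(1).decode("ascii").lower() if match else None

with_meta = b'<!DOCTYPE html><html><head><meta charset="UTF-8"></head></html>'
without_meta = b'<!DOCTYPE html><html><head><title>Hi</title></head></html>'

print(declared_charset(with_meta))     # utf-8
print(declared_charset(without_meta))  # None
```

The match ignores case, so "UTF-8" and "utf-8" are treated the same, which is also how browsers treat encoding labels.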

There you have it. 🎉 Feel free to leave any comments or thoughts below. If you want to follow my #100DaysOfCode journey, follow me on Twitter at @maggiecodes_. Happy coding!

Top comments (9)

Ashley Sheridan

One thing I've always been told was important about this was to make it the very first tag in the <head> section to prevent browsers needing to stop and reparse the html if they guessed the encoding wrongly.

Maggie

Good point!

stlee987 • Edited

The meta charset element is only about the characters that you can enter in the HTML file. If you have <meta charset="ascii">, it means you should only enter ASCII characters in your HTML file. You can still display Arabic text or any other language using HTML entities, although this is cumbersome. For example, with <meta charset="ascii">, you can use this instead for your Arabic text.

<h1>!&#x645;&#x631;&#x62D;&#x628;&#x627; &#x628;&#x627;&#x644;&#x639;&#x627;&#x644;&#x645;</h1>

While I don't recommend this since it's not readable, just note that some older editors might not support UTF-8 files.

CypressLiu

Very helpful, thanks

Favourite Jome

Nicely explained, thanks for taking your time. Understood now 🙂

Koohi the Coder

While searching for information about "," I found your account and appreciated the explanations, which were far clearer than those I found on Wikipedia.

mr_sajjad_dev

Does it matter whether UTF-8 is written in uppercase or lowercase?

kadin

So, if I'm understanding correctly... this is more of a compatibility thing?

Maggie

Yep. Not sure if this answers your question but utf-8 solves one of the limitations of ASCII of not being able to encode non-English characters (like those in Arabic and Korean for example). This way languages that don't use English characters can still be understood by computers and also display the text properly to users.