Edit: Still don't know why. But I did find out it only happens if you've got a weird or missing
user-agent string in your request headers.
I've been doing some research on declaring character encodings.
Specifically, do you really need the
<meta charset="UTF-8"> tag?
You must declare a character encoding, but by default most servers include this in the
HTTP headers, and that's actually better than using a
<meta> tag: the earlier it's declared, the sooner the page can render.
A micro-optimisation really.
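To make the header side concrete, here's a minimal sketch of pulling the charset out of a Content-Type header value using only the Python standard library. The function name charset_from_content_type is my own, not part of any library.

```python
from email.message import Message
from typing import Optional


def charset_from_content_type(value: str) -> Optional[str]:
    """Return the lowercased charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = value
    # get_content_charset() returns None when no charset parameter is present.
    return msg.get_content_charset()


print(charset_from_content_type("text/html; charset=UTF-8"))  # utf-8
print(charset_from_content_type("text/html"))                 # None
```

For a live response, the headers object returned by urllib.request.urlopen has the same get_content_charset() method, so the parsing step shouldn't need to be repeated by hand.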
On top of that, for HTML5,
utf-8 is the only valid character encoding. So
<!doctype html> is implicitly declaring the character encoding too.
<meta charset="UTF-8"> is considered sacred. So before I started telling people it's a useless
22 bytes, I thought I'd see what Google does.
In the Google homepage's
<head> they have:
<meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
But then in the
HTTP headers it's:
Content-Type: text/html; charset=ISO-8859-1
What's going on here?
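The mismatch is easy to check mechanically. This is a rough sketch, assuming you already have the Content-Type header value and the page source as strings; charset_param, MetaCharsetFinder and declared_charsets are names I made up for illustration. It handles both the short <meta charset> form and the longer http-equiv form shown above.

```python
from email.message import Message
from html.parser import HTMLParser
from typing import Optional, Tuple


def charset_param(content_type: str) -> Optional[str]:
    """Lowercased charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


class MetaCharsetFinder(HTMLParser):
    """Records the first charset declared by a <meta> tag."""

    def __init__(self) -> None:
        super().__init__()
        self.charset: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.charset is not None:
            return
        attributes = dict(attrs)
        if attributes.get("charset"):
            # Short form: <meta charset="UTF-8">
            self.charset = attributes["charset"].lower()
        elif (attributes.get("http-equiv") or "").lower() == "content-type":
            # Long form: <meta http-equiv="Content-Type" content="...">
            self.charset = charset_param(attributes.get("content") or "")


def declared_charsets(header_value: str, html: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (charset from the HTTP header, charset from the first <meta>)."""
    finder = MetaCharsetFinder()
    finder.feed(html)
    return charset_param(header_value), finder.charset


# The two declarations quoted in this post.
header = "text/html; charset=ISO-8859-1"
page = '<head><meta content="text/html; charset=UTF-8" http-equiv="Content-Type"></head>'
from_header, from_meta = declared_charsets(header, page)
print(from_header, from_meta, from_header == from_meta)
# iso-8859-1 utf-8 False
```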
Here are my guesses:
- Maybe it's a backwards compatibility thing. Perhaps browsers that don't understand the
<meta> tag also don't understand
- Maybe it's a performance optimization. Perhaps it's faster to parse the very first part of the document in
ISO-8859-1 then switch to
utf-8 for the rest.
What do you think? What does Google know that we don't (besides literally everything)?
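If you want to poke at the user-agent effect mentioned in the edit at the top, here's a minimal sketch using Python's urllib. The helper names build_request and header_charset are hypothetical, and the example user-agent string is just a plausible browser-like value. Note that when no user-agent is given, urllib sends its own default (Python-urllib/x.y), so that case is "weird" rather than truly missing.

```python
import urllib.request
from typing import Optional


def build_request(url: str, user_agent: Optional[str] = None) -> urllib.request.Request:
    """Build a GET request, optionally with an explicit User-Agent header."""
    headers = {} if user_agent is None else {"User-Agent": user_agent}
    return urllib.request.Request(url, headers=headers)


def header_charset(request: urllib.request.Request) -> Optional[str]:
    """Send the request and return the charset from the response's Content-Type header."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.headers.get_content_charset()
```

Calling header_charset(build_request("https://www.google.com/")) and then the same thing with a browser-like string such as "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0" lets you compare the two answers. What comes back may well differ by region and over time, so treat any result as a single observation.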