DEV Community

Discussion on: Should I "close" <img > and other HTML tags?

Mr.13

Switch to HTML5. Period.

Alex Arriaga Author • Edited on

Hi @Mr.13, thanks for reading and commenting!

Switching to HTML5 would allow us to use tags such as <img> without closing them. The issue I have personally faced is with parsers that are not related to web browsers: tools such as iText parse XML trees really fast, but they start to throw exceptions when the tree is not well-formed.
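To illustrate the kind of failure I mean, here is a minimal sketch in Python. It uses the standard library's `xml.etree.ElementTree` as a stand-in for any strict XML parser (it is not iText, and the `is_well_formed` helper and sample markup are made up for this example).

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        # Raised for, among other things, a tag that is never closed.
        return False


# Valid HTML5, but <img> is never closed, so it is not well-formed XML.
html5_style = '<p><img src="a.png" alt="a"></p>'

# Same content with the void element "closed": valid in both worlds.
xml_style = '<p><img src="a.png" alt="a" /></p>'

print(is_well_formed(html5_style))
print(is_well_formed(xml_style))
```

The first call reports a failure because the parser reaches `</p>` while `<img>` is still open; the second one parses fine.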

We could fix the HTML and convert it into a well-structured tree with tools such as jsoup; however, that would increase the processing time. This is not necessarily an issue when processing small sites, but if we need to process more than half a million pages in a short time, each millisecond of processing counts.
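As a rough sketch of what such a repair pass involves, the following Python code re-serializes HTML so that void elements come out closed. It is only an illustration built on the standard library's `html.parser`, not jsoup's API, and the `VoidCloser` class and `close_void_elements` function are hypothetical names. It assumes the only problem is unclosed void elements; it does not repair omitted optional tags or mis-nested markup, which a real cleaner would also handle.

```python
from html.parser import HTMLParser
from xml.sax.saxutils import escape, quoteattr
import xml.etree.ElementTree as ET

# Void elements as listed in the HTML standard.
VOID_ELEMENTS = {
    "area", "base", "br", "col", "embed", "hr", "img",
    "input", "link", "meta", "source", "track", "wbr",
}


class VoidCloser(HTMLParser):
    """Re-emit the parsed HTML, writing void elements as <tag ... />."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        # XML needs a value for every attribute, so a bare attribute
        # such as `disabled` is written as disabled="disabled".
        rendered = "".join(
            f" {name}={quoteattr(value if value is not None else name)}"
            for name, value in attrs
        )
        closer = " />" if tag in VOID_ELEMENTS else ">"
        self.out.append(f"<{tag}{rendered}{closer}")

    def handle_endtag(self, tag):
        # Void elements were already closed when they were opened.
        if tag not in VOID_ELEMENTS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Character references were decoded by the parser; escape again.
        self.out.append(escape(data))


def close_void_elements(html: str) -> str:
    parser = VoidCloser()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


fixed = close_void_elements('<p>Logo: <img src="a.png" alt="a"><br></p>')
ET.fromstring(fixed)  # the strict XML parser now accepts the result
```

Every page goes through one extra parse and one extra serialization before the XML tool even sees it, and that is the overhead I would like to avoid at scale.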

Having said that, I agree with you: in most scenarios, just switching to HTML5 will be fine.

Regards.

Mr.13

I agree with you on this, but I'm still confused: why would you parse that many pages?
Even for search engine indexing, we only need the metadata.

Suggestion:
Try a parser written in C, C++, or Rust.

Thomas Broyer • Edited on

I was about to point you to hixie.ch/advocacy/xhtml as a rebuttal to your proposal to "close" void tags, but it looks like it actually (kind of) agrees with you on that point.

Just make sure the people you work with don't mistake HTML5 for XML and then produce <script src="…" /> or CDATA sections, and at the same time also stick to your rule of being XML-compatible (which also means always quoting attributes, never omitting optional tags, etc.)
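The risk comes from the fact that an HTML parser ignores the trailing slash on a non-void element such as <script>, so the element stays open. A minimal sketch of a check for that mistake, in Python, could look like the code below. The `find_risky_self_closing` name, the regular expression, and the `app.js` file name are all made up for illustration, and a regex is only a heuristic here (an unquoted attribute value ending in a slash would be a false positive).

```python
import re

# Void elements as listed in the HTML standard; "/>" is harmless on these.
VOID_ELEMENTS = {
    "area", "base", "br", "col", "embed", "hr", "img",
    "input", "link", "meta", "source", "track", "wbr",
}

# Heuristic: a start tag whose last characters are "/>".
SELF_CLOSING = re.compile(r"<([A-Za-z][A-Za-z0-9-]*)\b[^<>]*/>")


def find_risky_self_closing(html: str) -> list[str]:
    """Return names of non-void elements written with XML-style '/>'."""
    return [
        match.group(1).lower()
        for match in SELF_CLOSING.finditer(html)
        if match.group(1).lower() not in VOID_ELEMENTS
    ]


sample = '<script src="app.js" /><img src="a.png" /><br>'
print(find_risky_self_closing(sample))  # ['script']
```

Only the <script> is reported: the slash on <img> is allowed in HTML and simply ignored, while the one on <script> silently changes how the rest of the page is parsed.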

At work, I ask coworkers not to use />: we're doing HTML, not XML, and the clearer you make that, the better IMO.