HTML is the most common markup language for web development. HTML is not a superset of XML, but the two are close relatives: both descend from SGML, and XHTML is a reformulation of HTML as XML. What is cool about this kinship is that web browsers, alongside their ability to render HTML, actually come with XML parsers, and have XML parsing capabilities under the hood.
Why Think About XML At All
HTML is the ubiquitous markup language of web developers. The audience of this blog, software engineers, likely only needs HTML. Yet my media company deals with many authors of the non-technical variety, and I have got to say... authors think about their content wayyy differently than HTML gives credit for.
The beauty of XML is its generic structure, which allows for custom parsing and handling. The flexibility of tag-based markup has been beautifully exemplified in HTML, but the use case of allowing custom definitions is better handled by XML.
XML is a data-carrying language. HTML is a sibling language that comes with standardized graphical user interface rendering. To see what I mean by this, open an XML file in a browser: https://alexason.com/uploads/library.xml
As you will see, modern browsers render the file complete with element tags. But also take note that the browser recognizes the datatype, and applies special formatting. In this way, XML is more like JSON.
Parsing XML
While browsers do not render XML as a styled page the way they render HTML, it's possible to parse XML in the browser using the DOMParser API.
See a gist of this in action:
const xmlString = `
<story>
<styles>
<titleStyle>
<color>#4A90E2</color>
</titleStyle>
<paragraphStyle>
<color>#333333</color>
</paragraphStyle>
</styles>
<title>Elena and the Embrace of Holiness</title>
<paragraph>In the heart of the village, where the sun kissed the earth...</paragraph>
<!-- More paragraphs here -->
</story>`;
const parser = new DOMParser();
const xmlDocument = parser.parseFromString(xmlString, "text/xml");
// Malformed XML does not throw; the returned document contains a <parsererror> element instead
const parserError = xmlDocument.getElementsByTagName("parsererror");
if (parserError.length > 0) {
  // Handle error
  console.error("Error parsing XML:", parserError[0].textContent);
} else {
  // Successfully parsed the XML
  // xmlDocument is a DOM Document that can be queried like an HTML page
  console.log("Parsed XML Document:", xmlDocument);
  const title = xmlDocument.getElementsByTagName("title")[0].textContent;
  // The first <color> in document order is the one inside <titleStyle>
  const titleColor = xmlDocument.getElementsByTagName("color")[0].textContent;
}
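Since XML is a data-carrying format much like JSON, a parsed document can also be flattened into a plain JavaScript object. The following is only a minimal sketch: `elementToObject` is a hypothetical helper (not part of the gist above), and it assumes that attributes and mixed text-and-element content can be ignored.

```javascript
// Hypothetical helper: recursively converts an element into a plain object
// keyed by child tag names. Attributes and mixed content are ignored.
function elementToObject(element) {
  const children = Array.from(element.children);
  if (children.length === 0) {
    // Leaf element: keep only its trimmed text
    return element.textContent.trim();
  }
  const result = {};
  for (const child of children) {
    const value = elementToObject(child);
    if (Object.prototype.hasOwnProperty.call(result, child.tagName)) {
      // Repeated tag (e.g. several <paragraph> elements): collect into an array
      const existing = result[child.tagName];
      result[child.tagName] = Array.isArray(existing)
        ? [...existing, value]
        : [existing, value];
    } else {
      result[child.tagName] = value;
    }
  }
  return result;
}

// Usage in a browser, where DOMParser is available
if (typeof DOMParser !== "undefined") {
  const sampleDocument = new DOMParser().parseFromString(
    "<story><title>Elena</title><paragraph>One</paragraph><paragraph>Two</paragraph></story>",
    "text/xml"
  );
  console.log(JSON.stringify(elementToObject(sampleDocument.documentElement), null, 2));
}
```

The helper only relies on `children`, `tagName`, and `textContent`, so it should work on any element of the parsed `xmlDocument`, for example `xmlDocument.documentElement`.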
Real Use Case
The example shown demonstrates what is possible with XML, yet the use case of rendering and styling content is better handled by HTML. While the format resembles HTML, using XML as a stand-in for HTML is not the best use of XML.
An HTML developer I know, Israel, writes XML like this. He uses the data format to recreate HTML, then uses JavaScript to turn it into HTML. While this is possible given the flexibility of XML, if the only use case is the browser, I'll tell you what I tell Israel: "Just write HTML!"
Join Israel and the HTML Devs at Salvation.
Where to use XML
XML is a great format for intermediate representation. As mentioned, the immediate use case of my company is translating many different authors' (book authors, manuscript writers) representations of their work into a standardized format. The task is to turn Word documents, PDFs, plaintext, and spoken words into one common data format.
XML can do that, and it is used in exactly this way in software programs such as Calibre and Manuskript.
This has been a look at XML. It is a widely recognized format, compatible with many readers and conversion tools. Given its ease of parsing, W3C recommendation, and ubiquity, XML is a safe language for indefinite data storage.
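To make the intermediate-representation idea concrete, here is a rough sketch of the plaintext case. It is not how my company or the tools above do it: `escapeXml` and `plaintextToStoryXml` are hypothetical helpers, the `<story>`, `<title>`, and `<paragraph>` element names are borrowed from the parsing example earlier, and the sketch assumes the first block of text is the title and that blank lines separate paragraphs.

```javascript
// Escape the five characters that have special meaning in XML
function escapeXml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Hypothetical converter. Assumptions: the first non-empty block is the
// title, and every later block separated by a blank line is one paragraph.
function plaintextToStoryXml(plaintext) {
  const blocks = plaintext
    .split(/\n\s*\n/)
    .map((block) => block.replace(/\s+/g, " ").trim())
    .filter((block) => block.length > 0);
  const [title = "", ...paragraphs] = blocks;
  const lines = [
    "<story>",
    `  <title>${escapeXml(title)}</title>`,
    ...paragraphs.map((p) => `  <paragraph>${escapeXml(p)}</paragraph>`),
    "</story>",
  ];
  return lines.join("\n");
}

const manuscript = `Elena & the Village

In the heart of the village,
the sun kissed the earth.

She asked, "Is 2 < 3?"`;

console.log(plaintextToStoryXml(manuscript));
```

Escaping matters here because an author's prose can contain `&` or `<`, which would otherwise make the output fail to parse. Word documents, PDFs, and speech would each need their own extraction step before reaching a function like this one.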
If you're interested in tools for data science and storage, be sure to Follow this Dev.to. Add a reaction 💖 for more content like this.