XML, JSON, & CSV.
All 3 are incredibly common “plain-text” file formats for transferring data between computers , particularly over the internet (e.g. over an “API“). But what’s the difference, and when should you use each one?
(Programmers, Salesforce business analysts, and database integration architects: you’ll want to know this!)
That’s a painful oversimplification, but I think it’s a good start if you’re completely new to the concepts.
Let’s cover a few key principles about the XML & JSON file formats.
XML & JSON are largely interchangeable with each other.
XML & JSON are able to hold more complicated data than CSV files are designed to hold.
XML & JSON files are capable of storing any “table-shaped” data that a CSV file is capable of storing (although typically not as concisely as a CSV file).
Conversely, CSV files are a stricter, simpler file format.
CSV files cannot easily store all XML/JSON-suited data.
Q: Why not?
- CSV files are lousy at storing data where each “item” of data might have different “traits” than any other “item” in the dataset.
- CSV files are lousy at “nesting” items inside of each other.
CSV files don’t transmit well over the internet in certain cases (e.g. “HTTP requests”) because they have line breaks as part of the data format specification.
Line breaks don’t always transmit well from computer to computer.
- As mentioned, neither XML nor JSON are particularly concise for storing table-shaped data. (JSON is a little “less bad.”)
If you need to see something to believe it, take a look at the output of Salesforce’s use of JSON instead of CSV when transmitting table data over the internet as follows:
- Log into the Salesforce Developer Workbench’s REST Explorer with a Salesforce dev org that has data in the “Account” table.
- Leave the HTTP method as “GET”
- Fill in the box with:
- Click “Execute.”
- Click “Show Raw Response.”
The data in the blue box on the right-hand side of the screen is formatted in the “JSON” syntax!
It’s easy to learn to write CSV files – see their Wikipedia article.
Therefore, this series is going to focus on teaching you to read and write XML and JSON files at a conceptual level.
- Ways that subtle differences between the XML & JSON standards influence programmers’ preferences.
- (Although they often choose a standard because “that’s what my coworker likes” – which is not actually a bad reason!)
- To recognize when data formatted as XML or JSON is just serving as an “internet-friendly” wrapper around “table-shaped,” CSV-like data.
- In these cases, tools exist to help you transform the data into a spreadsheet (e.g. the Python programming language’s “Pandas” module)
- The structural details of both XML and JSON formats, so that editing them won’t be scary.
I recommend reading the whole series, because I start with XML, then take a somewhat “compare-and-contrast” approach while explaining JSON to avoid repeating myself.
But before we begin, please be sure to remember this as you read:
- If your data fits naturally in an Excel spreadsheet , it’s probably “ table-shaped ” and CSV is probably its most natural format.
- Nevertheless, you can use XML or JSON (JSON will be more concise), if you need to avoid line breaks.
- If your data fits naturally in a “shopping list” or “ nested bullet points ” format, it’s probably “list of list”-shaped and XML/JSON is probably its most natural format.
We’ll have a lot of examples in this series.
I recommend that you edit them and play with seeing them in a “pretty” format!
- To view my XML examples graphically, paste them here and click “Tree View”.
- To view my JSON examples graphically, paste them here and click “Tree View”.
Warning: only put sample data into the “beautifier” links above. Never put your company’s confidential data into a stranger’s web site.
- Tables vs. Lists (this post)
- XML 1
- XML 2
- XML 3