The first time I came across YAML was around a year ago, when I used it to write OpenAPI definitions to document a RESTful API using Swagger API Documentation and, to be honest, I really hated it.
Being a JSON "fan", the YAML syntax felt weird and unnatural to me, so for a while, I didn't pay any attention to it.
This changed a few months ago, when I started to get into CI/CD, since both Azure and GitLab pipelines require a YAML file to set up. So I finally decided to properly learn YAML, and after doing some reading I found the ideas behind it fascinating.
In this article I'll cover the basics of YAML, including its main goals, basic syntax and some of its more complex features.
Introduction
YAML is a data-serialization language often used for configuration files, such as OpenAPI specifications or CI/CD pipelines.
Fun fact! 🤓
According to the YAML 1.0 specification document (2001-05-26), the acronym "YAML" stands for "Yet Another Markup Language", but it was later changed to the recursive acronym "YAML Ain't Markup Language" in the 2002-04-07 specification.
As stated in the latest spec, YAML is designed to be friendly to people working with data and achieves "unique cleanness" by minimizing the use of structural characters, allowing the data to appear in a natural and meaningful way.
The latest spec also states that YAML 1.2 treats JSON as an official subset, meaning that most JSON documents are also valid YAML and can be read by a YAML parser.
YAML makes data structures easy to inspect by using indentation-based scoping (similar to Python).
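To make the JSON relationship concrete, here is a small sketch that mixes indentation-based block style with JSON-like flow style in one document. The keys and values are made up for illustration.

```yaml
# Block style: structure comes from indentation.
article:
  title: Introduction to YAML
  # Flow style: JSON-like brackets and braces are also valid YAML.
  tags: ["yaml", "json"]
  author: {"name": "Jane", "active": true}
```

Both styles describe the same kind of data, so you can paste a JSON fragment into a YAML file when that is more convenient.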
Another fun fact! 🤓
DEV.to articles use YAML to define custom variables like title, description, tags, etc.
Basic Syntax
YAML documents are basically a collection of key-value pairs where the value can be as simple as a string or as complex as a tree.
Here are a few notes about YAML syntax:
- Indentation is used to denote structure. Tabs are not allowed for indentation, and the amount of whitespace doesn't matter as long as child nodes are indented more than their parent and sibling nodes share the same indentation.
- UTF-8, UTF-16 and UTF-32 encodings are allowed.
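The indentation rule above can be sketched like this. The keys are hypothetical; the point is only that each group of siblings lines up, whatever width is used.

```yaml
# Two spaces per level is a common convention, not a requirement.
server:
  host: localhost
  port: 8080
# A deeper indent works too, as long as siblings line up with each other.
database:
      name: app
      user: admin
```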
Strings
# Strings don't require quotes:
title: Introduction to YAML
# But you can still use them:
title-w-quotes: 'Introduction to YAML'
# Multiline strings start with |
execute: |
  npm ci
  npm build
  npm test
The above code will translate to JSON as:
{
  "title": "Introduction to YAML",
  "title-w-quotes": "Introduction to YAML",
  "execute": "npm ci\nnpm build\nnpm test\n"
}
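Notice the trailing "\n" in the multiline value. As a small sketch (the key names are made up), adding a hyphen after the pipe strips that final newline:

```yaml
# "|" keeps one trailing newline: "npm ci\nnpm test\n"
keep-one: |
  npm ci
  npm test
# "|-" strips the trailing newline: "npm ci\nnpm test"
strip: |-
  npm ci
  npm test
```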
Numbers
# Integers:
age: 29
# Float:
price: 15.99
# Scientific notation:
population: 2.89e+6
The above code will translate to JSON as:
{
  "age": 29,
  "price": 15.99,
  "population": 2890000
}
Boolean
# Boolean values can be written in different ways (each line below is an alternative, not a repeated key):
published: false
published: False
published: FALSE
All of the above will translate to JSON as:
{
  "published": false
}
Null values
# Null can be represented by simply not setting a value (every line in this block is an alternative, not a repeated key):
null-value:
# Or more explicitly:
null-value: null
null-value: NULL
null-value: Null
All of the above will translate to JSON as:
{
  "null-value": null
}
Dates & timestamps
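ISO-formatted dates and timestamps can be used, like so (timestamps are a YAML 1.1 type, so some YAML 1.2 parsers return them as plain strings):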
date: 2002-12-14
canonical: 2001-12-15T02:59:43.1Z
iso8601: 2001-12-14t21:59:43.10-05:00
spaced: 2001-12-14 21:59:43.10 -5
Sequences
Sequences allow us to define lists in YAML:
# A list of number names using hyphens:
numbers:
  - one
  - two
  - three
# The inline version:
numbers: [ one, two, three ]
Both of the above sequences will parse to JSON as:
{
  "numbers": [
    "one",
    "two",
    "three"
  ]
}
Nested values
We can use all of the above types to create an object with nested values, like so:
# Nineteen eighty four novel data.
nineteen-eighty-four:
  author: George Orwell
  published-at: 1949-06-08
  page-count: 328
  description: |
    A Novel, often published as 1984, is a dystopian novel by English novelist George Orwell.
    It was published in June 1949 by Secker & Warburg as Orwell's ninth and final book.
Which will translate to JSON as:
{
  "nineteen-eighty-four": {
    "author": "George Orwell",
    "published-at": "1949-06-08T00:00:00.000Z",
    "page-count": 328,
    "description": "A Novel, often published as 1984, is a dystopian novel by English novelist George Orwell.\nIt was published in June 1949 by Secker & Warburg as Orwell's ninth and final book.\n"
  }
}
List of objects
Combining sequences and nested values, we can create a list of objects.
# Let's list books:
- nineteen-eighty-four:
    author: George Orwell
    published-at: 1949-06-08
    page-count: 328
    description: |
      A Novel, often published as 1984, is a dystopian novel by English novelist George Orwell.
- the-hobbit:
    author: J. R. R. Tolkien
    published-at: 1937-09-21
    page-count: 310
    description: |
      The Hobbit, or There and Back Again is a children's fantasy novel by English author J. R. R. Tolkien.
Distinctive Features
The following are some more complex features that caught my attention and that also differentiate YAML from JSON.
Comments
As you've probably already noticed in my prior examples, YAML allows comments starting with #.
# This is a really useful comment.
Reusability with Node Anchors
Node anchors mark a node for future reference, which allows us to reuse the node. To mark a node we use the & character, and to reference it we use *.
In the following example we'll define a list of books and reuse the author data, so we only have to define it once:
# The author data:
author: &gOrwell
  name: George
  last-name: Orwell
# Some books:
books:
  - 1984:
      author: *gOrwell
  - animal-farm:
      author: *gOrwell
The above code will look like this once parsed to JSON:
{
  "author": {
    "name": "George",
    "last-name": "Orwell"
  },
  "books": [
    {
      "1984": {
        "author": {
          "name": "George",
          "last-name": "Orwell"
        }
      }
    },
    {
      "animal-farm": {
        "author": {
          "name": "George",
          "last-name": "Orwell"
        }
      }
    }
  ]
}
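Anchors are often combined with the merge key <<, which copies an anchored mapping into another one. Note that the merge key is a YAML 1.1 extension rather than part of the YAML 1.2 core spec, so support depends on the parser. A minimal sketch, with hypothetical keys:

```yaml
# Shared values, marked with an anchor.
defaults: &defaults
  language: English
  format: paperback

books:
  - title: Animal Farm
    <<: *defaults        # copies language and format
  - title: Nineteen Eighty-Four
    <<: *defaults
    format: hardcover    # explicit keys override merged ones
```

With a parser that supports merge keys, the second book ends up with language "English" and format "hardcover".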
Explicit data types with tags
As we've seen in previous examples, YAML autodetects the type of our values, but it's possible to specify which type we want.
We specify the type by including it before the value, preceded by !!.
Here are some examples:
# The following value should be an int, no matter what:
should-be-int: !!int 3.2
# Parse any value to string:
should-be-string: !!str 30.25
# I need the next value to be boolean:
should-be-boolean: !!bool yes
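This will translate to JSON as follows, although some parsers reject a value that doesn't match its tag (such as 3.2 for !!int) instead of converting it: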
{
  "should-be-int": 3,
  "should-be-string": "30.25",
  "should-be-boolean": true
}
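The !!str tag is probably the most useful one in everyday configuration files, because it protects values that merely look like numbers. A small sketch with made-up keys:

```yaml
# Without a tag, these would be read as a float and an integer.
version: !!str 1.10      # stays "1.10" rather than becoming 1.1
zip-code: !!str 90210    # stays "90210"
# Quoting has the same effect and is more common in practice.
quoted-version: "1.10"
```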
Conclusion
Reading and writing about YAML, and experimenting with it, was super interesting.
What I like: I especially loved reading about the goals of YAML in relation to code cleanness and readability, and how it achieves them. I also feel better about properly learning the syntax at last 😅.
What I don't like: I don't like that I need a parser (which means installing a new dependency) to use YAML with the main technologies I work with (Node.js and .NET Core).
However, I will now consider YAML, especially if I need something that JSON can't cover, like reusability, explicit types or comments. I'm sure that working with pipelines will be easier now too.
Also, I'd strongly recommend reading the introduction of the YAML 1.2 specification (3rd edition) to learn more about YAML's goals, origins and relationship with other languages.
What are you using YAML for? 💬
Are you using YAML? For what? What are your thoughts about it?
Top comments (38)
Thanks for all the tips and tricks. I like using YAML for configurations.
PS: YAML has some default casting one should be aware of:
Thank you! According to the YAML 1.2 specification document 'yes' and 'no' are no longer interpreted as boolean.
You can use !!bool to parse them, though.
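As a rough sketch of the casting issue discussed here (the keys are made up), unquoted yes and no are ambiguous across parser versions, while quoted values are not:

```yaml
# YAML 1.1 parsers may read these as booleans;
# YAML 1.2 core-schema parsers read them as strings.
answer: yes
country: no
# Quoting removes the ambiguity: both are always strings.
quoted-answer: "yes"
quoted-country: "no"
```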
Thanks for this! I just got started with GitHub Actions a couple of days ago, and was making a lot of assumptions on what the YAML was representing -- the translations to JSON you've done here are really helpful :)
Thanks Darren, I'm glad I could help! 🙂
I used a YAML file to configure Cluster group in the Pipeline during my internship this past summer. It was a bit challenging since this was my first time using it, but definitely easy to work with. Thanks for sharing this great article.
Nice! Was it your first time working with it? And did its readability make it easier to pick up?
I use YAML for my default config files with tools such as ESLint and Stylelint. I find the syntax a lot more intuitive than JSON and I am less likely to make mistakes with it.
Thank you for making this tutorial, Paula.
Nice! Do you mind sharing a bit more about how you use it? Like which language and do you use a parser?
I'm really curious because I love the syntax, but I don't like the idea of including extra dependencies just for that.
I currently use yaml only when a node js package offers built-in support for it. I stick with json, otherwise. I don't know much about parsers as I've only used babel before and it doesn't support yaml, as far as I know.
I use YAML so often on Jekyll, and I didn't know I could write multiline strings just by adding |. Amazing. Thanks.
Something I forgot to include about strings is that you can also write multiline strings that you don't want to be interpreted as multiline. For example:
And this is how it'll look in JSON:
When using the > character instead of |, each new line will be interpreted as a space.
I'm glad! Thanks for reading :D
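A minimal sketch of the difference between the two block styles mentioned in this thread, using made-up keys and text:

```yaml
# ">" folds single line breaks into spaces.
folded: >
  This text is written
  on several lines.
# "|" keeps the line breaks.
literal: |
  This text is written
  on several lines.
```

Here folded becomes "This text is written on several lines.\n", while literal becomes "This text is written\non several lines.\n".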
Hi, Paula. Great article!
I have one question: in JSON we can easily create a list of objects without “naming” them, like this:
[{"a": "b"}, {"x": "y"}]
How do we do that in YAML? For example, a list of authors (I don’t wanna have [{"authors": {authoObj1}}, {"authors": {authoObj2}}]).
Great question!
You can achieve that by entering the hyphen first and then the properties in a new line, like so:
Which will translate to JSON as:
Also here's a nice online tool I've been using to try the YAML syntax and see its JSON counterpart.
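For the unnamed list of objects asked about above, a minimal sketch could look like this. The author data is only illustrative.

```yaml
# Each hyphen starts a new, unnamed object in the list.
- name: George
  last-name: Orwell
- name: J. R. R.
  last-name: Tolkien
```

This corresponds to a JSON array of two objects, each with a "name" and a "last-name" property.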
I used YAML for my uni assignment where we had to build a ci/cd pipeline using Travis CI for a spring boot application. Travis uses a YAML file for configuration and I found it very easy and intuitive to work with.
Nice! Every CI pipeline config I've seen so far uses YAML. I believe we'll be seeing more of it in the near future.
Thanks for this helpful article. I'm using it in K8s to set up the values files and sort of understood the basics of YAML, but this was a legit lightbulb moment. I don't know why I didn't consider that it can be translated to JSON! Thanks again
Thanks for all the tips and tricks!!
I discovered Yaml accidentally, I am learning star and I met him, and also with the link to your article and this beautiful community, this is for me a day of pleasant discoveries.
I ran into YAML when working with AWS CloudFormation. It beat me up like a drum 😒
Believe me, I know the feeling!😂
One of the things that made me change my mind after reading about it for a bit was that the whole concept behind YAML made me remember how I felt after going from XML to JSON, which was basically "wow, so much less code, I can actually read this!".