YAML is a data serialization language that allows you to store complex data in a compact and readable format. It matters in DevOps and virtualization because configuration files and automation workflows are so often written in it.
While often overlooked by developers, it's a powerful and simple tool that can greatly improve your job prospects with just a couple of hours of learning.
Today, we'll help you learn YAML fast with a hands-on tutorial and explore how you can use it in your next data-driven solution.
What is YAML?
YAML is a data serialization language for storing information in a human-readable form. It originally stood for "Yet Another Markup Language" but has since been changed to "YAML Ain't Markup Language" to distinguish itself as different from a true markup language.
It is similar to XML and JSON but uses a more minimalist syntax while maintaining similar capabilities. YAML is commonly used to write configuration files for Infrastructure as Code (IaC) tools or to manage containers in the DevOps development pipeline.
More recently, YAML has been used to create automation protocols that can execute a series of commands listed in a YAML file. This means your systems can be more independent and responsive without additional developer attention.
As more and more companies embrace DevOps and virtualization, YAML is quickly becoming a must-have skill for modern developer positions. YAML is also easy to incorporate into existing stacks: Python supports it through the PyYAML library, and tools like Docker and Ansible read YAML files directly.
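To make the automation use case above more concrete, here is a minimal sketch of an Ansible playbook. The `webservers` host group and the choice of nginx are assumptions made up for this example. The point is only that the whole file is ordinary YAML: a list of tasks, each described by key-value pairs.

```yaml
# Hypothetical playbook: install and start nginx on a group of hosts.
# "webservers" is an assumed inventory group, not something from this article.
- name: Install and start nginx
  hosts: webservers
  become: true
  tasks:
    - name: Install the nginx package
      ansible.builtin.apt:
        name: nginx
        state: present

    - name: Make sure nginx is running
      ansible.builtin.service:
        name: nginx
        state: started
```

Ansible runs the tasks in the order they are listed, which is why the file reads like a checklist.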
YAML vs JSON vs XML
YAML (.yml):
- Human-readable code
- Minimalist syntax
- Solely designed for data
- Similar inline style to JSON (YAML 1.2 is a superset of JSON)
- Allows comments
- Strings without quotation marks
- Considered the "cleaner" JSON
- Advanced features (extensible data types, relational anchors, and mapping types preserving key order)
Use Case: YAML is best for data-heavy apps that use DevOps pipelines or VMs. It's also helpful for when other developers on your team will work with this data often and therefore need it to be more readable.
JSON
- Harder to read
- Explicit, strict syntax requirements
- Similar inline style to YAML (some YAML parsers can read JSON files)
- No comments
- Strings require double quotes
Use Case: JSON is favored in web development because it works well as a serialization format for transmitting data over HTTP connections.
XML
- Harder to read
- More verbose
- Acts as a markup language, while YAML is for data formatting
- Contains more features than YAML, like tag attributes
- More rigidly defined document schema
Use Case: XML is best for complex projects that require fine control over validation, schema, and namespaces. XML is harder to read at a glance and requires more bandwidth and storage capacity, but it offers the most control of the three.
Salient Features of YAML
Here are some of the best features YAML has to offer.
Multi-document support
You can have multiple YAML documents in a single YAML file to make file organization or data parsing easier.
The separation between each document is marked by three dashes (---).
---
player: playerOne
action: attack (miss)
---
player: playerTwo
action: attack (hit)
Built-in commenting
YAML allows you to add comments to files using the hash symbol (#), similar to Python comments.
key: # Here is a single-line comment
  - value line 5
  # Here is a
  # multi-line comment
  - value line 13
Readable syntax
YAML files use an indentation system similar to Python to show the structure of your data. You're required to use spaces rather than tabs for indentation to avoid confusion.
It also cuts much of the "noise" formatting found in JSON and XML files such as quotation marks, brackets, and braces.
Together, these formatting specifications increase the readability of YAML files beyond XML and JSON.
YAML:

Imaro:
  author: Charles R. Saunders
  language: English
  publication-year: 1981
  pages: 224

JSON:

{
  "Imaro": {
    "author": "Charles R. Saunders",
    "language": "English",
    "publication-year": 1981,
    "pages": 224
  }
}
Notice that the same information is conveyed; however, the removal of double quotes, commas, and braces throughout the YAML file makes it much easier to read at a glance.
Implicit and Explicit typing
YAML offers versatility in typing by auto-detecting data types while also supporting explicit typing options. To tag a value as a certain type, put !!typeName (for example, !!int or !!str) before the value.
# The quoted value is read as an int:
is-an-int: !!int "14"
# Turn any value to a string:
is-a-str: !!str 67.43
# The next value should be a boolean:
is-a-bool: !!bool true
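Implicit typing can surprise you when a value merely looks like a number or a boolean. The sketch below uses made-up keys to show how quoting or tagging keeps a value as a string. Treat it as an illustration: the exact set of auto-detected types depends on the parser and the YAML version it implements.

```yaml
# Keys here are illustrative only.
version: 1.10           # auto-detected as a float, so it reads as 1.1
version-str: "1.10"     # quoted, so it stays the string "1.10"
zip-code: !!str 02134   # tagged as a string, so the leading zero is kept
enabled: true           # auto-detected as a boolean
enabled-str: "true"     # quoted, so it stays a string
```

A good habit is to quote anything that must stay text, such as version numbers, IDs, and postal codes.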
No executable commands
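As a data-representation format, YAML itself does not contain executable commands, which makes it generally safe to exchange YAML files with external parties. The caveat is the parser: some libraries support language-specific tags that can construct arbitrary objects, so load untrusted files with a safe loader (for example, PyYAML's safe_load).
To make anything happen with the data, YAML must be read by a program written in another language, like Perl or Java.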
YAML Syntax
YAML has a few basic concepts that make up the majority of data.
Key-value pairs
In general, most things in a YAML file are a form of key-value pair where the key represents the pair's name and the value represents the data linked to that name. Key-value pairs are the basis for all other YAML constructions.
<key>: <value>
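As a concrete instance of the pattern above, here is a small sketch that reuses the book details from earlier in this article. The key name `title` is an assumption chosen for the example.

```yaml
# Each line is one key-value pair: a key, a colon, a space, then the value.
title: Imaro
author: Charles R. Saunders
pages: 224
```

The space after the colon is required; `pages:224` would be read as a single string rather than a key-value pair.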
Scalars and mapping
Scalars represent a single stored value. Scalars are assigned to key names using mapping. You define a mapping with a name, colon, and space, then a value for it to hold.
YAML supports common types like integer and floating-point numeric values, as well as non-numeric types Boolean and String.
Numeric values can be written in different notations, like hexadecimal, octal, or exponent. There are also special values for mathematical concepts like infinity, -infinity, and Not a Number (NaN).
integer: 25
hex: 0x12d4 # evaluates to 4820
octal: 023332 # evaluates to 9946 in YAML 1.1 parsers; YAML 1.2 writes octal as 0o23332
float: 25.0
exponent: 12.3015e+05 # evaluates to 1230150.0
boolean: Yes # true in YAML 1.1 parsers; YAML 1.2 expects true or false
string: "25"
infinity: .inf # evaluates to infinity
neginf: -.Inf # evaluates to negative infinity
not: .NAN # Not a Number
String
Strings are a collection of characters that represent a sentence or phrase. For multi-line strings, use | to keep each line break as written or > to fold the lines into a single paragraph.
Strings in YAML do not need to be in double-quotes.
str: Hello World
literal_data: |
  These
  Newlines
  Are broken up
folded_data: >
  This text is
  wrapped and is a
  single paragraph
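Although quotes are optional, some strings do need them. The sketch below, with keys invented for illustration, shows a few common cases where a plain (unquoted) string would be misread or rejected.

```yaml
# Illustrative keys; quote when plain text would be misread.
plain: Hello World
with-colon: "Note: a colon followed by a space would otherwise start a mapping"
looks-like-number: "25"
starts-with-symbol: "@handle"
escapes: "Tab:\tthen text"   # double quotes process escapes such as \t
no-escapes: 'C:\temp\new'    # single quotes keep backslashes literally
```

When in doubt, quoting a string is harmless, while leaving out a needed quote can silently change the value's type.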
Sequence
Sequences are data structures similar to a list or array that hold multiple values under the same key. They're defined using a block or inline flow style.
Block style uses line breaks, indentation, and a leading dash for each item to structure the document. It's easier to read but less compact than flow style.
---
# Shopping List Sequence in Block Style
shopping:
  - milk
  - eggs
  - juice
Flow style allows you to write sequences inline using square brackets, similar to an array declaration in a programming language like Python or JavaScript.
Flow style is more compact but harder to read at a glance.
---
# Shopping List Sequence in Flow Style
shopping: [milk, eggs, juice]
Dictionaries
Dictionaries are collections of key-value pairs all nested under the same subgroup. They're helpful to divide data into logical categories for later use.
Dictionaries are defined like mappings: you enter the dictionary name and a colon, then one or more indented key-value pairs on the following lines.
# An employee record
Employees:
  - dan:
      name: Dan D. Veloper
      job: Developer
      team: DevOps
  - dora:
      name: Dora D. Veloper
      job: Project Manager
      team: Web Subscriptions
Dictionaries can contain more complex structures as well, such as sequences. Nesting sequences is a good trick to represent complex relational data.
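Here is one way that nesting might look, as a sketch. The team memberships and tool lists are made up for illustration and only reuse the names from the employee record.

```yaml
# A dictionary whose values include sequences (memberships and tools are made up).
Teams:
  DevOps:
    lead: Dan D. Veloper
    members:
      - Dan D. Veloper
      - Dora D. Veloper
    tools: [Docker, Ansible]
  Web Subscriptions:
    lead: Dora D. Veloper
    members:
      - Dora D. Veloper
```

Block and flow styles can be mixed freely, as the `members` and `tools` keys show.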
Advanced concepts to learn next
Congratulations on taking your first step toward learning YAML. While often overlooked, YAML is a simple and effective tool to pick up for your DevOps toolkit.
Some next advanced topics to look at are:
- Anchors
- Templates
- YAML with external tools (Docker, Ansible, etc.)
- Advanced sequence/mapping types
- Advanced data types (timestamp, null, etc.)
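As a small preview of two items on this list, the sketch below shows anchors and a couple of the extra data types. All keys and values are hypothetical. Note that the `<<` merge key comes from YAML 1.1 and is supported by many parsers (PyYAML among them) but not all, so check your tool before relying on it.

```yaml
# Hypothetical service settings; &defaults defines an anchor, *defaults reuses it.
defaults: &defaults
  image: example/app:1.0
  restart: always
  retries: 3

staging:
  <<: *defaults   # merge key: copy every pair from the anchored mapping
  retries: 1      # then override a single value

production:
  <<: *defaults

# Aliases also work for single values.
admin-email: &admin admin@example.com
contact: *admin

release-date: 2001-12-14   # many parsers read this as a date or timestamp
nothing-here: null         # null can also be written as ~
```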
To help you pick up YAML fast, Educative has created the course Introduction to YAML. This mini-course covers all YAML syntax in-depth from simple mappings to advanced anchoring techniques.
After less than an hour, you'll have cracked all the essential YAML skills and earned a YAML certification for your DevOps resume.
Happy learning!
Top comments (3)
I don't agree that YAML is easier to read than JSON or XML, especially when the file becomes larger or when you use non-basic YAML features.
Of the three, YAML is the only one where slight errors produce correct files, but wrong results. What I comprehend might not be the same as what the computer reads.
I'm not a beginner or a stupid one, but I did not get YAML syntax - until now, this is a good introduction.
Anyway, whenever I need a non-trivial data format, I choose XML. It's verbose, not sexy, not too comfortable to write - but if anyone takes a look at it, he or she will instantly understand the XML file.
For trivial data, I prefer
category.sub.name = value
XML is ‘harder to read’ .. really. XML is a generalised syntax and easily as expressive as YAML to create a DSL, or use one of the many industry defined schema.
XML is verbose ... that rather depends on whether you use element normal form or the more compact attribute syntax. Regardless, verbosity can be viewed as an orthogonal concern to readability, so I’m not really sure whether you are stating it as an advantage or disadvantage (how you view that depends on your use case imho)