Introduction
YAML is a human-friendly data format that appears everywhere in modern DevOps work on Linux. You will see it in CI pipelines, container orchestration, configuration management, and many cloud services. In this chapter you will learn the essentials you need to read and write simple YAML files correctly, without diving into tool-specific features that belong in later chapters.
What YAML Is and Where You See It
YAML stands for “YAML Ain’t Markup Language.” In practice it is a structured text format used to describe data such as settings, lists of tasks, or definitions of resources. It is similar in purpose to JSON, but focuses on being easy for humans to read and edit.
On Linux in a DevOps context you commonly see YAML in files such as GitHub Actions workflows, GitLab CI configurations, Ansible playbooks, Kubernetes manifests, and many cloud infrastructure tools. Each tool has its own structure, but they all rely on the same basic YAML building blocks.
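Basic Structure: Indentation and Key-Value Pairs
YAML uses indentation to show structure. The block style used in most configuration files has no brackets like [] or {}; nested data is created by indenting lines with spaces. YAML also has a compact flow style that does use [] and {}, but you will meet it less often.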
A simple mapping of keys to values looks like this:

    name: my-app
    version: "1.0"
    debug: true

Each line is a key followed by a colon and then a value. Indentation is used when one key contains another mapping.
For example:

    server:
      host: "127.0.0.1"
      port: 8080

Here server is a key. Its value is another mapping, containing host and port. The indentation tells YAML that host and port belong inside server.
Always use spaces for indentation in YAML, never tab characters. A common convention is 2 spaces per level, but 4 spaces also works as long as you are consistent.
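As a small illustration of that convention, here is the same server mapping written with 4 spaces per level. It describes exactly the same data as the 2-space version; what matters is that every level uses the same step.

```yaml
# Same structure as before, indented with 4 spaces per level.
server:
    host: "127.0.0.1"
    port: 8080
```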
Scalars: Basic Value Types
YAML values that are not collections are called scalars. They include strings, numbers, booleans, and special null values.
Unquoted text with no special characters is usually treated as a string. For example:

    environment: production

Numbers are written without quotes:

    retries: 3
    timeout_seconds: 30

Booleans are written as:

    debug: true
    enabled: false

Null is a special “no value” marker. It can be written as null, Null, NULL, or ~:

    description: null

Strings can also be quoted. Quotes are useful if the text contains spaces, colons, or characters that might be misinterpreted:

    job_name: "build:frontend"
    note: "This value includes spaces."

If a value looks like a number or boolean but you want it treated as plain text, quote it:

    version: "01"
    literal_true: "true"

Without quotes 01 could be interpreted as a number and true as a boolean.
Sequences: Lists of Items
YAML sequences are ordered lists. They are represented with hyphens at the same indentation level.
For example, a list of packages:

    packages:
      - git
      - curl
      - python3

The key packages maps to a sequence. Each list item starts with - followed by a space. Indentation of the list items must be consistent.
List items can be more complex and can contain mappings themselves:

    users:
      - name: alice
        role: developer
      - name: bob
        role: admin

Here users is a sequence of two mappings. Each item starts with - and the following key value pairs line up beneath that item.
Mappings: Nested Structures
Mappings are the YAML term for key-value collections. You have already seen them in simple examples. YAML allows nested mappings to build tree-like structures.
For example:

    app:
      name: demo
      logging:
        level: info
        file: "/var/log/demo.log"

app is a mapping that contains another mapping logging. The hierarchy is expressed only by indentation.
Keys are generally strings. You write them without quotes unless they contain special characters, spaces, or start with characters that might confuse the parser. If in doubt, quote the key:

    "strange key?": "value"

YAML mappings do not have an inherent order. Some tools preserve the input order, but the YAML specification treats mappings as unordered.
Combining Lists and Mappings
Real configurations often combine sequences and mappings. Understanding how to read these combinations is the core practical skill with YAML.
Here is a small example:

    services:
      web:
        image: "myapp:latest"
        ports:
          - "80:80"
      db:
        image: "postgres:16"
        environment:
          POSTGRES_USER: "app"
          POSTGRES_PASSWORD: "secret"

In this snippet:
services is a mapping with keys web and db.
web and db are each mappings that define properties of a service.
ports is a sequence, even though it currently has one item.
environment is a mapping of environment variable names to their string values.
When you read YAML, first identify whether a line belongs to a mapping or a sequence, based on whether it starts with a key or a hyphen, then follow the indentation to see what belongs together.
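To practice that reading strategy, here is a second sketch that nests a sequence inside the items of another sequence. The names jobs, name, and steps are made up for illustration and do not follow any particular tool's schema.

```yaml
jobs:                # key whose value is a sequence
  - name: build      # first item: a mapping that starts at the hyphen
    steps:           # key inside that item, holding another sequence
      - compile
      - package
  - name: test       # second item, at the same indentation as the first
    steps:
      - unit
```

Reading top to bottom: jobs holds two items, each item is a mapping with the keys name and steps, and each steps value is a list of plain strings.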
Strings and Multi-line Text
YAML supports several ways to write strings that span multiple lines. Tool-specific usage belongs to later chapters. Here you only need to be aware that there are block scalar styles.
A folded block, written with >, merges newlines into spaces, except for empty lines:

    message: >
      This text will be folded
      into a single line
      when parsed.

A literal block, written with |, preserves newlines:

    script: |
      echo "Hello"
      echo "World"

Both forms are indented under the key. The block ends at the first line that is indented less than its content.
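To make the difference concrete, the two block scalars above can be written as ordinary double-quoted strings with explicit \n escapes. This is only a sketch of the parsed values, assuming the default behavior where a block scalar keeps a single trailing newline.

```yaml
# Same value as the folded ">" example: inner line breaks became spaces,
# and one final newline is kept.
message: "This text will be folded into a single line when parsed.\n"

# Same value as the literal "|" example: every line break is kept.
script: "echo \"Hello\"\necho \"World\"\n"
```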
Comments
YAML uses # to start a comment. Everything from # to the end of the line is ignored by the parser, except inside quoted strings.
For example:

    debug: false # set to true for verbose logging
    # This entire line is a comment

Comments are very useful in configuration files to explain the purpose of settings. In DevOps workflows and infrastructure files comments help future maintainers understand why a value was chosen.
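The rule about quoted strings is easy to overlook, so here is a short sketch of where # does and does not start a comment. The keys and values are invented for illustration.

```yaml
channel: "#general"               # quoted, so the first # is part of the value
url: http://example.com/page#top  # no space before the inner #, so it stays in the value
color: blue # a # that follows a space starts a comment
```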
Pitfalls and Common Errors
Many YAML errors come from simple formatting mistakes. Understanding these will save you time when tools report “invalid YAML.”
Inconsistent indentation is a frequent problem. All lines that are siblings in the structure must use the same indentation. If you start a block with 2 spaces, all lines at the same level must also have 2 spaces. Mixing 2 and 4 spaces at the same level causes errors.
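A minimal sketch of this mistake and its fix follows. The broken version is commented out so that the snippet as a whole stays valid YAML.

```yaml
# Broken: "port" is indented deeper than its sibling "host",
# so a parser reports an error on that line.
# server:
#   host: "127.0.0.1"
#     port: 8080

# Fixed: siblings share the same indentation.
server:
  host: "127.0.0.1"
  port: 8080
```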
Using tabs instead of spaces is another common issue. Since tabs and spaces look similar in editors, you might not notice. Configure your editor on Linux to display whitespace or to convert tabs to spaces.
Missing spaces after colons or hyphens can also break parsing. Write key: value and - item, with at least one space after : and -; a single space is the usual convention. Without the space, key:value is read as one plain string rather than as a key and a value.
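The following sketch shows the spacing rule in practice. The incorrect forms are commented out; they would not necessarily fail to parse, but they would produce plain strings instead of the structure you intended.

```yaml
# Missing space after the colon: the whole line is one string, not a key and a value.
# port:8080

# Missing space after the hyphen: "-git" is a string, not a list item.
# -git

# Correct forms:
port: 8080
packages:
  - git
  - curl
```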
Accidental special values can cause confusion. For example, an unquoted on or yes is read as a boolean by parsers that follow YAML 1.1. To be safe, quote values that might be ambiguous:
If a value could be confused with a boolean, number, or special token, write it as a quoted string. For example: on, off, yes, no, 01, 00123, and null.
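As an illustration, each value below is quoted so that it stays a string. The key names are hypothetical.

```yaml
country: "no"        # unquoted, a YAML 1.1 parser may read no as false
switch: "on"         # unquoted, a YAML 1.1 parser may read on as true
build_id: "00123"    # quoting keeps the leading zeros
placeholder: "null"  # the text "null", not an empty value
```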
Validating YAML on Linux
On a Linux system it is useful to validate YAML files before committing them or applying them with tools. Many distributions have command line linters available through the package manager or Python tools that can be installed with pip.
A typical pattern is to use a command that reads from a file and reports syntax errors. For example, you might run a tool like this:

    yamllint config.yaml

If the file is valid, the tool exits silently or prints a success message. If not, it shows the line and column of the problem, which helps you find missing spaces, bad indentation, or invalid characters.
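If you use yamllint, its settings are themselves written in YAML. The sketch below assumes a configuration file named .yamllint in the project directory and picks illustrative values; check the yamllint documentation for the rules that suit your project.

```yaml
# Hypothetical .yamllint file: start from the default rule set
# and adjust two rules.
extends: default
rules:
  indentation:
    spaces: 2      # expect 2 spaces per level
  line-length:
    max: 120       # allow longer lines than the default
```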
Even if you do not install a separate linter, many editors on Linux provide YAML syntax highlighting and basic validation that help catch mistakes early.
YAML Compared to JSON in Practice
Both YAML and JSON represent the same kinds of data structures: mappings, sequences, and scalars. YAML is more flexible for humans, JSON is more rigid and often better suited for machines.
Conceptually, a YAML mapping like

    config:
      retries: 3
      debug: false

is equivalent to the JSON:

    {
      "config": {
        "retries": 3,
        "debug": false
      }
    }

The main differences you see are that YAML uses indentation and new lines instead of braces and commas, and it supports comments, which JSON does not.
Understanding that both represent the same tree-like data helps when you need to convert between them or when you read documentation that shows JSON examples for a system that actually uses YAML on disk.
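The two notations are closer than they first appear. YAML also has an inline flow style that uses braces and brackets, and parsers that follow YAML 1.2 accept JSON syntax directly. The sketch below shows the config example both ways, as two separate documents divided by ---.

```yaml
# Flow style: the same data as the block-style config example, written inline.
config: {retries: 3, debug: false}
---
# JSON syntax, read by a YAML 1.2 parser as the same structure.
{"config": {"retries": 3, "debug": false}}
```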
Summary
In DevOps work on Linux, reading and writing YAML is a basic and constant task. The core concepts are indentation-based structure, mappings of keys to values, sequences of items, and simple scalar values like strings, numbers, and booleans.
Careful use of spaces, consistent indentation, and quoting where needed make your YAML files reliable and easier to maintain. With these basics you can confidently approach the YAML-based configuration files you will encounter in continuous integration, configuration management, and cloud tooling in later chapters.