1. How to Load YAML in Python? | Article Overview and Target Audience
To Those Who Want to Work with YAML in Python
When developing applications or tools in Python, you’ll increasingly encounter situations where you want to use the YAML format for configuration files or external data management. In particular, YAML—more readable than JSON and allowing simple syntax—is a highly popular data format among engineers and data scientists. For example, you’ll need to read YAML in the following scenarios:
Externalizing configuration files for web apps or scripts
Parsing Docker Compose and Kubernetes configuration files with Python
Managing machine learning framework parameters with YAML
On the other hand, many beginners say things like “I don’t know how to read a YAML file” or “I get errors and can’t read it properly.”
What You’ll Learn in This Article
In this article, we’ll explain how to safely and reliably load YAML files in Python in a way that’s easy for beginners to understand. Specifically, we’ll cover the following points:
Basic structure and characteristics of YAML files
How to load in Python (using safe_load())
Common errors and how to address them
Reading multiple documents and practical examples with configuration files
We’ll also discuss security considerations and the often‑overlooked differences between load() and safe_load(). Finally, we’ll include a FAQ section, so your questions should be answered.
Target Audience
This article is intended for the following readers:
Beginners to intermediate users who want to try YAML with Python
Developers who need to work with configuration files
Those who are unsure about how to use PyYAML
People who want to learn more about safe_load() and error handling
If you’re looking to streamline your Python development in the future, mastering how to work with YAML is a big step forward. In the following sections, we’ll walk through an overview of YAML and how to handle it in Python step by step. Let’s start with “What is YAML?”
2. What is YAML? | Simple Comparison of Differences and Features with JSON
What is YAML?
YAML (pronounced “YAM-ul”, rhyming with “camel”) is a recursive acronym for “YAML Ain’t Markup Language”, designed primarily to make structured data easy for humans to read and write. It works well with programming languages such as Python and Ruby, and is widely used in configuration files and data exchange scenarios. YAML expresses hierarchical structure through indentation, which keeps files simple and intuitive to write.
Differences with JSON
YAML is used for similar purposes as JSON, but there are several clear differences between them. Below we compare some representative items.
| Comparison Item | YAML | JSON |
| --- | --- | --- |
| Readability | High (human‑friendly) | Medium (machine‑friendly) |
| Comment support | Possible (using #) | Not possible |
| File size | Tends to be smaller (fewer symbols) | Slightly larger |
| Data structure representation | Higher flexibility (complex structures are OK) | Arrays and objects are central |
| Extensibility | High (custom structures can be defined) | Limited |
| Support status | Some limitations | Widely supported |
Benefits of YAML
Using YAML has the following benefits:
Intuitive syntax: similar to Python indentation, making the structure easy to grasp
Comments can be written: convenient for adding notes to configuration files
Not redundant: no need for braces or double quotes like in JSON
Human‑friendly: readable and editable even by non‑engineers
Typical Use Cases for YAML
YAML is often used in the following tools and systems:
Docker Compose (docker-compose.yml)
Kubernetes configuration files (definitions of Pods and Services)
CI/CD tools (GitHub Actions, GitLab CI, etc.)
Machine learning libraries (such as PyTorch Lightning and Hydra)
Web app and script configuration files
In other words, being able to work with YAML is a powerful skill in modern development environments.
3. Preparing to Work with YAML in Python | Installing PyYAML
What is PyYAML?
To read or write YAML files in Python, it is common to use the external library “PyYAML”. PyYAML is a simple yet powerful library based on the YAML 1.1 specification, and because it is not included in Python’s standard library, it needs to be installed separately. Using PyYAML allows you to treat YAML files as Python dictionaries (dict) or lists (list). This makes reading and writing configuration files and manipulating structured data intuitive.
How to Install PyYAML
Installing PyYAML is very straightforward. You can install it from the command line (or terminal) using pip as shown below.
pip install pyyaml
Note: If the pip command is not on your PATH, python -m pip install pyyaml works as well.
Using a Virtual Environment is Recommended
If you want to keep your development environment isolated, it is recommended to install within a virtual environment (venv or conda). This makes managing library versions easier when handling multiple projects.
# Create a virtual environment
python -m venv venv
# Activate the virtual environment
# On Windows
venv\Scripts\activate
# On macOS/Linux
source venv/bin/activate
# Install PyYAML
pip install pyyaml
How to Verify the Installation
After installation, you can verify that the library is correctly imported by writing the following in Python’s interactive mode (REPL) or a script.
import yaml
print(yaml.__version__)
If no error occurs, PyYAML has been installed successfully. Checking the version can also help with future troubleshooting.
4. Basic: How to Read YAML Files in Python (Using safe_load)
Most basic loading method: safe_load()
When reading YAML files with PyYAML, the most commonly used function is safe_load(). This function is designed to safely load YAML and can retrieve the loaded data as Python dictionaries (dict) or lists (list). First, let’s look at a basic YAML file and its loading code.
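A minimal sketch of such a file and the code that loads it. The file name config.yaml and its keys (app, features) are made up for illustration, and the script writes the sample file into a temporary directory only so that it can run on its own.

```python
import tempfile
from pathlib import Path

import yaml

# Hypothetical sample content; in a real project this would already exist on disk.
sample = """\
app:
  name: SampleApp
  debug: true
features:
  - login
  - search
"""

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "config.yaml"
    path.write_text(sample, encoding="utf-8")

    # Open with an explicit encoding and let safe_load() build plain Python objects.
    with open(path, "r", encoding="utf-8") as f:
        data = yaml.safe_load(f)

print(type(data))            # mappings become dict
print(data["app"]["name"])   # nested keys are reached by normal indexing
print(data["features"])      # sequences become list
```

Mappings arrive as dict, sequences as list, and scalars such as true as the matching Python type (here, a bool).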
Thus, YAML files can be treated directly as native Python data structures, allowing smooth use in subsequent processing.
open() function: specifying encoding is important
When dealing with YAML files that contain Japanese (or other non-ASCII characters), be sure to specify encoding='utf-8' in the open() function. Omitting this can cause garbled text on Windows and similar environments.
One point: using the with statement
When reading files, using the with statement, e.g., with open(...) as f:, helps prevent file‑handle leaks and ensures safe processing. This is the recommended best‑practice style in Python.
5. Differences between safe_load and load | Cautions for loading YAML in Python
safe_load() and load(): What’s the difference?
PyYAML provides several functions for loading YAML files, but the most confusing are the differences between safe_load() and load(). At first glance, both functions read YAML into Python data structures, but there are significant differences in security and functionality. Using the wrong one can expose you to the risk of executing malicious code from external YAML files, so it’s important to understand and use them correctly.
safe_load() Features (Safe Loading)
import yaml
with open('config.yaml', 'r', encoding='utf-8') as f:
    data = yaml.safe_load(f)
Basic function to use
High security (does not load arbitrary Python objects)
Limited to basic data types (dicts, lists, strings, numbers, etc.)
Raises an error when attempting to load unknown types or objects
safe_load() is, as its name suggests, a function for “safe loading”, and when dealing with configuration files or external data, it’s the best choice in most cases.
load() Features (Flexible but Risky)
import yaml
with open('config.yaml', 'r', encoding='utf-8') as f:
    data = yaml.load(f, Loader=yaml.FullLoader)
Can interpret YAML more flexibly
Can reconstruct Python objects (e.g., functions, class instances, etc.)
Because of security risks, specifying a Loader is required
In older versions of PyYAML, you could call load() on its own, but an explicit Loader is now required. PyYAML 5.1 through 5.4 emit the warning below when it is omitted, and PyYAML 6.0 and later raise a TypeError because Loader became a required argument:
yaml.YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated
This is because load() historically defaulted to a loader that could construct arbitrary Python objects. In other words, loading a malicious YAML file could unintentionally execute Python code.
Types of Loaders (Differences in Security and Flexibility)
| Loader Name | Features | Recommendation |
| --- | --- | --- |
| SafeLoader | Loads only basic types (safe) | ◎ (recommended) |
| FullLoader | Allows more Python types (security caution) | ○ (use with care) |
| UnsafeLoader | Allows loading arbitrary objects (dangerous) | × (avoid) |
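The difference between the loaders can be observed with a harmless Python-specific tag. The sketch below uses !!python/tuple purely as a stand-in; the key name point is an assumption for illustration, and real attacks rely on tags that construct or call arbitrary objects instead.

```python
import yaml

# A harmless Python-specific tag, used here only to show the difference.
text = "point: !!python/tuple [1, 2]"

# SafeLoader (used by safe_load) knows nothing about python/* tags and refuses them.
try:
    yaml.safe_load(text)
    rejected = False
except yaml.constructor.ConstructorError as e:
    rejected = True
    print(f"safe_load refused the tag: {e.problem}")

# FullLoader accepts this tag and builds a real Python tuple.
full = yaml.load(text, Loader=yaml.FullLoader)
print(full)
```

For configuration files, the rejection is usually the desired outcome: a plain list would express the same data without any Python-specific tag.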
Conclusion: Usually use safe_load()
If you’re only reading YAML files as configuration files or external data, safe_load() is sufficient—and you should use it. Reserve load() for special cases (e.g., when you need to deserialize custom Python objects).
6. Common Errors and Their Solutions | Pitfalls of Loading YAML in Python
Why You Might Stumble When Loading YAML
YAML is a very simple and readable format, but it is also strict about small details of syntax. Especially when loading it with Python, there are several points that beginners often overlook. In this section, we will walk through common errors when loading YAML files with Python, their causes, and how to resolve them.
1. Syntax Errors Caused by Indentation Mistakes
In YAML, indentation is a crucial element that indicates structure. Mismatched spaces or tabs will cause errors during loading.
Example: YAML with Indentation Error
user:
  name: Alice
   age: 30
Example Error Message:
yaml.scanner.ScannerError: mapping values are not allowed here
Solution:
Use a consistent indentation of 2–4 spaces (do not use tabs).
Indent child elements deeper than their parent elements.
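To find the offending line quickly, the exception itself can be inspected. The following is a minimal sketch, assuming the broken input is available as a string; PyYAML's syntax errors derive from yaml.MarkedYAMLError, whose problem_mark holds a zero-based line and column.

```python
import yaml

# The third line is indented one space deeper than its sibling "name".
bad_yaml = "user:\n  name: Alice\n   age: 30\n"

try:
    yaml.safe_load(bad_yaml)
except yaml.MarkedYAMLError as e:
    mark = e.problem_mark                         # zero-based position of the problem
    location = (mark.line + 1, mark.column + 1)   # convert to 1-based for humans
    message = e.problem
    print(f"{message} (line {location[0]}, column {location[1]})")
```

The reported position points at the place where the parser gave up, which is usually on or just after the badly indented line.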
2. Garbled Text Caused by File Encoding
If a YAML file contains non-ASCII characters such as Japanese, failing to specify the correct encoding will result in garbled text or decode errors.
with open('config.yaml', 'r', encoding='utf-8') as f:
    data = yaml.safe_load(f)
Explicitly specifying encoding='utf-8' resolves the issue.
This is especially easy to forget on Windows environments.
3. Incorrect File Path / Missing File
It’s a simple mistake, but if the specified YAML file does not exist, a FileNotFoundError occurs.
Error Example:
FileNotFoundError: [Errno 2] No such file or directory: 'config.yaml'
Solution:
Verify that the path is specified as an absolute path or a correct relative path.
Check for typos in the filename or extension.
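Check which directory the script is run from: relative paths are resolved against the current working directory, which is not necessarily the directory containing the script.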
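One way to make a missing file easier to diagnose is to print the absolute path Python actually tried. The helper name load_yaml below is hypothetical, and returning None for a missing file is just one possible policy; raising an error may suit other projects better.

```python
from pathlib import Path

import yaml


def load_yaml(path):
    """Return the parsed YAML, or None if the file does not exist (hypothetical helper)."""
    path = Path(path)
    if not path.is_file():
        # resolve() shows the absolute location, which reveals working-directory mix-ups.
        print(f"YAML file not found: {path.resolve()}")
        return None
    with path.open("r", encoding="utf-8") as f:
        return yaml.safe_load(f)


# Relative paths are interpreted from here, not from the script's folder.
print(f"Current working directory: {Path.cwd()}")
result = load_yaml("no_such_file_for_this_example.yaml")
```

Inside a script, building the path from Path(__file__).resolve().parent is a common way to locate a file that sits next to the script regardless of where it is launched from.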
4. safe_load Returns None
This can happen even when the YAML syntax is correct, if the file is empty or contains only comments.
Example:
# This file contains no settings
Result:
data = yaml.safe_load(f)
print(data) # Output: None
Solution:
Make sure the file contains valid YAML data.
Returning None for a file with only comments or whitespace is normal behavior.
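Because None is a legitimate result, code that expects a dictionary may want to normalize it before indexing. A minimal sketch follows; the helper name load_settings and the choice to fall back to an empty dict are assumptions made for illustration.

```python
import yaml


def load_settings(text):
    """Parse YAML text and always return a dict (hypothetical helper)."""
    data = yaml.safe_load(text)
    if data is None:
        # Empty file or comments only: treat as "no settings".
        return {}
    if not isinstance(data, dict):
        raise ValueError(f"Expected a mapping at the top level, got {type(data).__name__}")
    return data


print(load_settings("# This file contains no settings\n"))
print(load_settings("mode: production\n"))
```

This avoids a later TypeError such as “'NoneType' object is not subscriptable” when the rest of the program indexes into the result.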
5. Parse Errors in Overly Complex YAML Structures
In large, deeply nested YAML files, syntax mistakes or incorrect use of anchors/aliases can also cause errors.
Solution:
Validate the syntax incrementally (verify small sections at a time).
Catching yaml.YAMLError exceptions and inspecting the detailed messages is recommended.
try:
    with open('config.yaml', 'r', encoding='utf-8') as f:
        data = yaml.safe_load(f)
except yaml.YAMLError as e:
    print(f"YAML loading error: {e}")
Summary: When Errors Occur, Calmly Check Syntax and Environment
YAML is a very easy-to-use format as long as you pay a little attention to its syntax. If you encounter errors, check the following points:
Whether the indentation is correct (no tabs used).
Whether the file encoding is UTF-8.
Whether the file exists.
Whether the YAML content is correctly written.
7. Advanced: How to Load YAML Files with Multiple Documents (safe_load_all)
YAML can contain multiple documents in a single file
One of YAML’s major features is that multiple data blocks (documents) can be defined within a single file. This allows you to split settings and configurations while managing them together in one file. Documents are explicitly separated by --- (three hyphens).
Example of a multi-document YAML file (multi.yaml)
# Server configuration 1
---
server:
  host: localhost
  port: 8080
# Server configuration 2
---
server:
  host: example.com
  port: 443
A YAML file written like this contains two independent data blocks, each of which needs to be read separately.
How to use yaml.safe_load_all()
In PyYAML, the safe_load_all() function is provided to read multiple documents. It returns all YAML documents in a file as an iterator (iterable object).
import yaml
with open('multi.yaml', 'r', encoding='utf-8') as f:
    documents = yaml.safe_load_all(f)
    # Iterate while the file is still open: documents are parsed lazily.
    for doc in documents:
        print(doc)
In this way, each document can be obtained as a dictionary, and you can flexibly use them by processing in a loop.
Note the difference from safe_load()
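The regular safe_load() expects exactly one document. If the file contains more than one, it raises yaml.composer.ComposerError (“expected a single document in the stream”) instead of returning the first document, so it is not suitable for files that contain multiple documents.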
| Function name | Supported YAML format | Return type |
| --- | --- | --- |
| safe_load() | Single document only | Data (dictionary, list, etc.) |
| safe_load_all() | Supports multiple documents | Iterator (for loop processing) |
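The two functions can be compared on the same input. This is a small sketch with the multi-document YAML embedded as a string so that it runs without a file; the variable names are chosen for illustration.

```python
import yaml

multi = """\
---
server:
  host: localhost
  port: 8080
---
server:
  host: example.com
  port: 443
"""

# safe_load() accepts only a single document and raises on a second one.
try:
    yaml.safe_load(multi)
    single_ok = True
except yaml.composer.ComposerError as e:
    single_ok = False
    print(f"safe_load failed: {e.problem}")

# safe_load_all() yields each document in turn.
hosts = [doc["server"]["host"] for doc in yaml.safe_load_all(multi)]
print(hosts)
```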
You can also convert the loaded documents into a list
In some cases, you may want to retrieve all documents as a single list. In that case, you can simply use list() as shown below.
with open('multi.yaml', 'r', encoding='utf-8') as f:
    documents = list(yaml.safe_load_all(f))
This allows you to process them in bulk as a list or access them by index.
Note: Not all YAML files contain multiple documents
Using safe_load_all() on a YAML file that has no --- separators is fine; the result simply contains a single document. Keep in mind that safe_load_all() still returns an iterator in that case, so you need to loop over it or wrap it in list() to get at the data, whereas safe_load() returns the data directly.
8. Supplement: Key Points for Using YAML as a Configuration File
Why is YAML suitable for configuration files?
YAML combines a syntax that is easy for humans to read and write with flexible expression of nested structures, which is why many projects adopt it as a “configuration file.” Especially for applications and tools developed in Python, using YAML makes the configuration more transparent and improves maintainability. YAML is especially useful for the following purposes:
Environment configuration for web applications (production / development / testing)
Hyperparameter settings for machine learning
Switching script behavior
Management of external dependencies such as API keys (note: be careful with handling confidential information)
Example of a practical YAML configuration file
app:
  name: MyApp
  mode: production
logging:
  level: INFO
  file: logs/app.log
database:
  host: localhost
  port: 5432
  user: admin
  password: secret
By organizing components into sections like this, the configuration file becomes easy to understand at a glance for anyone. Unlike JSON, the ability to freely add comments is also extremely important in practice.
Hierarchical configuration using nested structures
Because YAML uses indentation to represent hierarchy, even complex configurations can be expressed intuitively. For example, you can organize settings per environment as shown below:
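A minimal sketch of such a per-environment layout, together with Python code that picks one section. The section names (development, production), the keys, the helper settings_for, and the APP_ENV variable are all assumptions chosen for illustration, and the YAML is embedded as a string so that the example runs on its own.

```python
import os

import yaml

settings_yaml = """\
development:
  debug: true
  database:
    host: localhost
    port: 5432
production:
  debug: false
  database:
    host: db.example.com
    port: 5432
"""

all_settings = yaml.safe_load(settings_yaml)


def settings_for(env_name):
    """Return the settings section for one environment (hypothetical helper)."""
    if env_name not in all_settings:
        raise KeyError(f"Unknown environment: {env_name!r}")
    return all_settings[env_name]


# APP_ENV is an assumed variable name; fall back to development when it is unset.
env_name = os.environ.get("APP_ENV", "development")
if env_name not in all_settings:
    env_name = "development"
settings = settings_for(env_name)
print(env_name, settings["database"]["host"])
```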
If the Python side reads this configuration, it can automatically select the appropriate settings based on environment variables and the like.
You can clarify intent with comments
In YAML you can write comments using #. This provides the major advantage of being able to record directly in the configuration file information such as “why this setting is chosen” or “when it should be changed.”
# Application mode setting (one of dev, test, production)
mode: production
Because JSON does not support comments, such annotations cannot be added.
Code example for loading YAML at runtime (advanced)
Below is an example of a Python script loading a YAML configuration and applying it to an app:
import yaml
with open('settings.yaml', 'r', encoding='utf-8') as f:
    config = yaml.safe_load(f)
app_name = config['app']['name']
mode = config['app']['mode']
db_host = config['database']['host']
In this way, by treating a YAML file as a Python dictionary you can manage configuration values without hard‑coding them directly in the code.
Caution: Handle confidential information carefully
While YAML is user‑friendly, it is also stored in plain text. Therefore, when storing information such as API keys or passwords in a YAML file, the following measures are necessary:
Prevent committing to the repository with .gitignore
Combine with files like .env and keep only references in YAML
Apply encryption and access restrictions
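The “keep only references in YAML” idea can be sketched as follows: the YAML file stores the name of an environment variable, and the secret itself is read from the environment at runtime. The key password_env, the variable name DB_PASSWORD, and the helper resolve_password are hypothetical choices for this example, not a PyYAML convention.

```python
import os

import yaml

config_yaml = """\
database:
  host: localhost
  user: admin
  password_env: DB_PASSWORD   # name of the environment variable, not the secret
"""


def resolve_password(config, environ=os.environ):
    """Look up the password named in the config (hypothetical helper)."""
    var_name = config["database"]["password_env"]
    try:
        return environ[var_name]
    except KeyError:
        raise RuntimeError(f"Environment variable {var_name} is not set") from None


config = yaml.safe_load(config_yaml)
print(config["database"]["password_env"])
```

With this layout the YAML file can be committed safely, while the actual value lives in the deployment environment or in an untracked .env file.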
9. Frequently Asked Questions (FAQ)
In this section, we address common questions and beginner pitfalls when loading YAML files in Python in a Q&A format. The content is also useful when applying it to real projects, so be sure to check it out.
Q1. Between YAML and JSON, which is easier to handle in Python?
A. Both are easy to handle. Python supports JSON through the standard library (the json module), while YAML needs a third‑party library such as PyYAML. For configuration files, many people find YAML more convenient because it allows comments and its structure is easier to read. However, in scenarios where processing speed or data‑exchange compatibility is critical, JSON may be preferred.
Q2. Should you avoid using yaml.load()?
A. As a rule, using safe_load() is the safe choice. load() is highly flexible, but depending on the Loader it can reconstruct Python objects, which poses a security risk. Loading a malicious YAML file could execute unintended code, so it is generally recommended to use safe_load(). If you must use load(), always pass a Loader explicitly: Loader=yaml.SafeLoader for untrusted input, or Loader=yaml.FullLoader only for files you trust.
Q3. Why does the loaded YAML content become empty (None)?
A. This is normal behavior that occurs when the YAML file is empty or contains only comments.
# This file has not been configured yet
When you load a file like the above, the return value of safe_load() is None. This is not an error; it represents valid “empty data” in YAML. Make sure the file contents are correctly written.
Q4. Can you reuse values in a YAML file (like variables)?
A. YAML provides a mechanism called “anchors” and “aliases.”
By using & to define an anchor and * to reference it as an alias, you can reuse the same configuration in multiple places. PyYAML resolves anchors and aliases while loading, including with safe_load(), so no special loader settings are needed.
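A short sketch of anchors and aliases loaded with safe_load(). The key names are invented for illustration. The example also uses the << merge key, a YAML 1.1 convention that PyYAML supports, to copy a whole mapping and override one value; that part goes beyond plain aliases.

```python
import yaml

text = """\
admin_email: &email admin@example.com
contact: *email

defaults: &defaults
  timeout: 30
  retries: 3

service_a:
  <<: *defaults
  host: a.example.com

service_b:
  <<: *defaults
  host: b.example.com
  timeout: 60
"""

data = yaml.safe_load(text)
print(data["contact"])     # same value as admin_email
print(data["service_a"])   # defaults merged with the local host key
print(data["service_b"])   # local timeout overrides the merged one
```

After loading, the result is ordinary dictionaries and strings; the anchors themselves do not appear in the data.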
Q5. Japanese characters become garbled on Windows. How can I fix this?
A. Specify the encoding when reading the file to resolve the issue.
with open('config.yaml', 'r', encoding='utf-8') as f:
    data = yaml.safe_load(f)
Windows’ default character set (cp932) may not correctly read YAML files written in UTF‑8. Always specify encoding='utf-8'.
Q6. How should I read a YAML file that is split into multiple configuration blocks?
A. Using safe_load_all() allows you to load multiple documents.
---
app: App1
port: 3000
---
app: App2
port: 4000
Files like this can be processed one document at a time with yaml.safe_load_all():
with open('multi.yaml', 'r', encoding='utf-8') as f:
    for doc in yaml.safe_load_all(f):
        print(doc)
10. Summary | Master YAML Loading in Python
Reviewing YAML Loading from Basics to Advanced
In this article, we have provided a step‑by‑step guide for those who want to read YAML files in Python, covering everything from basic usage and common error handling to handling multiple documents and using YAML as configuration files. To recap, here are the key points of the article:
Installing PyYAML: To work with YAML in Python, you first need pip install pyyaml.
Basic loading: Using yaml.safe_load() lets you safely and easily load YAML into dictionaries or lists.
Error handling: Indentation and encoding mistakes are common issues. Syntax checking and encoding='utf-8' are important.
Difference from load(): If safety is a priority, you should use safe_load(). load() requires specifying an appropriate Loader.
Handling multiple documents: With safe_load_all(), you can flexibly process multiple configuration blocks within a single file.
Practicality as a configuration file: YAML’s readability and flexibility make it ideal for configuration management in Python projects.
Next steps: Leverage YAML even more
Now that you’ve mastered reading YAML, moving on to the following steps will help you apply it more effectively in real‑world work:
Writing to YAML files: Automatic generation of configuration files using yaml.dump()
Bidirectional conversion with JSON: Handy when integrating with web APIs or external services
Combining with .env files or environment variables: Building a more secure configuration management approach
Automating configuration loading: A mechanism that dynamically loads the appropriate configuration file based on the environment when the app starts
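As a starting point for the first two items, here is a minimal sketch of writing YAML and converting to JSON. It uses yaml.safe_dump(), the safe counterpart of yaml.dump(); the sample dictionary is made up, and the sort_keys argument assumes PyYAML 5.1 or later.

```python
import json

import yaml

config = {"app": {"name": "MyApp", "mode": "production"}, "ports": [8080, 8443]}

# Python -> YAML text; sort_keys=False keeps the insertion order of the keys.
yaml_text = yaml.safe_dump(config, sort_keys=False, default_flow_style=False)
print(yaml_text)

# YAML text -> Python -> JSON text
round_trip = yaml.safe_load(yaml_text)
json_text = json.dumps(round_trip, indent=2)
print(json_text)
```

Note that comments in a YAML file are not preserved when it is loaded and dumped again this way, since they are discarded during parsing.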
Mastering these will make Python development more flexible and reusable. We’ll continue to clearly present Python techniques and tool usage that are useful in the field. Please bookmark this page and refer back to it often. That concludes our guide on reading YAML files with Python.
Acquire knowledge you can apply in real work, and feel free to incorporate YAML into your own projects.