How can I parse a YAML file in Python

Question

How can I parse a YAML file in Python?

Pat Myron · Accepted Answer · 2025-08-15 02:49:17Z

1504

The easiest method that does not rely on C headers is PyYAML (documentation), which can be installed via pip install pyyaml:

import yaml

with open("example.yaml") as stream:
    try:
        print(yaml.safe_load(stream))
    except yaml.YAMLError as exc:
        print(exc)

yaml.load() also exists, but yaml.safe_load() should always be preferred because it cannot execute arbitrary code from the YAML input. Use yaml.load() only if you explicitly need arbitrary object serialization/deserialization.
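To make the difference concrete, here is a minimal sketch, assuming PyYAML is installed. The document string is a made-up example of a Python-specific tag that an unsafe loader could turn into a function call; the payload is a harmless echo, and safe_load is expected to refuse it rather than run it.

```python
import yaml

# Hypothetical untrusted input: a Python-specific tag that an unsafe
# loader could resolve into a call to os.system.
untrusted = "!!python/object/apply:os.system ['echo pwned']"

try:
    yaml.safe_load(untrusted)
    rejected = False
except yaml.YAMLError as exc:
    # safe_load only constructs standard YAML types, so the tag is refused
    # and nothing is executed.
    rejected = True
    print(type(exc).__name__)
```

Catching yaml.YAMLError here covers both syntax errors and rejected tags, since the more specific PyYAML errors derive from it.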

The PyYAML project supports the YAML specification up through version 1.1. If YAML 1.2 support is needed, see ruamel.yaml as noted in this answer.
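One place where the 1.1 behaviour shows up in practice is boolean parsing. The sketch below assumes PyYAML and uses invented keys; unquoted yes/no/on/off are resolved as booleans under YAML 1.1 rules, so values meant as strings need quotes.

```python
import yaml

text = """
enabled: yes
debug: off
country: no
quoted: "no"
"""

settings = yaml.safe_load(text)

# Unquoted yes/off/no become booleans; the quoted value stays a string.
print(settings)
```

If a value such as a country code must stay a string, quoting it in the file is the simplest fix.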

You could also use oyaml, a drop-in replacement for PyYAML that keeps your YAML file ordered the same way you had it. View snyk of oyaml here

edited Aug 15 at 2:49

Pat Myron


answered Nov 21, 2009 at 0:23

Jonathan Holloway


14 Comments

ternaryOperator Over a year ago

I would add that unless you wish to serialize/deserialize arbitrary objects, it is better to use yaml.safe_load as it cannot execute arbitrary code from the YAML file.

MayTheSchwartzBeWithYou Over a year ago

Yaml yaml = new Yaml(); Object obj = yaml.load("a: 1\nb: 2\nc:\n - aaa\n - bbb");

SaurabhM Over a year ago

I like the article by moose: martin-thoma.com/configuration-files-in-python

Romain Over a year ago

You may need to install the PyYAML package first pip install pyyaml, see this post for more options stackoverflow.com/questions/14261614/…

naught101 Over a year ago

What's the point of capturing the exception in this example? It's going to print anyway, and it just makes the example more convoluted..


Martin Thoma · Accepted Answer · 2022-05-26 17:36:33Z

Read & Write YAML files with Python 2+3 (and unicode)

# -*- coding: utf-8 -*-
import yaml
import io

# Define data
data = {
    'a list': [
        1, 
        42, 
        3.141, 
        1337, 
        'help', 
        u'€'
    ],
    'a string': 'bla',
    'another dict': {
        'foo': 'bar',
        'key': 'value',
        'the answer': 42
    }
}

# Write YAML file
with io.open('data.yaml', 'w', encoding='utf8') as outfile:
    yaml.dump(data, outfile, default_flow_style=False, allow_unicode=True)

# Read YAML file (same encoding as the write, so non-ASCII text such as '€' round-trips)
with io.open("data.yaml", 'r', encoding='utf8') as stream:
    data_loaded = yaml.safe_load(stream)

print(data == data_loaded)

Created YAML file

a list:
- 1
- 42
- 3.141
- 1337
- help
- €
a string: bla
another dict:
  foo: bar
  key: value
  the answer: 42

Common file endings

.yml and .yaml

Alternatives

CSV: Super simple format (read & write)
JSON: Nice for writing human-readable data; VERY commonly used (read & write)
YAML: YAML is a superset of JSON, but easier to read (read & write, comparison of JSON and YAML)
pickle: A Python serialization format (read & write) ⚠️ Using pickle with files from 3rd parties poses an uncontrollable arbitrary code execution risk.
MessagePack (Python package): More compact representation (read & write)
HDF5 (Python package): Nice for matrices (read & write)
XML: exists too *sigh* (read & write)
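As a small illustration of the "superset of JSON" point in the list above, the sketch below feeds the same JSON text to both the json module and PyYAML. The sample document is invented, and this only shows that typical JSON parses the same way; it is not a claim that every JSON edge case does under PyYAML's YAML 1.1 rules.

```python
import json
import yaml

json_text = '{"name": "demo", "values": [1, 2.5, true, null]}'

from_json = json.loads(json_text)
from_yaml = yaml.safe_load(json_text)

# Both parsers produce the same Python structure for this input.
print(from_json == from_yaml)
```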

For your application, the following might be important:

Support by other programming languages
Reading / writing performance
Compactness (file size)

See also: Comparison of data serialization formats

In case you are rather looking for a way to make configuration files, you might want to read my short article Configuration files in Python

Thanks for suggestion. My file has utf-8 encoding. I had to change your code line to io.open(doc_name, 'r', encoding='utf8') to read the special character. YAML version 0.1.7
You can use the built-in open(doc_name, ..., encoding='utf8') for read and write, without importing io.
You use import yaml, but that isn't a built-in module, and you don't specify which package it is. Running import yaml on a fresh Python3 install results in ModuleNotFoundError: No module named 'yaml'
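The encoding point from the comments above can be sketched as follows, assuming Python 3 and PyYAML (pip install pyyaml). The sample data and the temporary file location are made up for the example; the key detail is passing the same encoding to open() for both the write and the read.

```python
import os
import tempfile
import yaml

data = {"currency": "€", "city": "Zürich"}

# Write to a throwaway directory so the example does not touch real files.
path = os.path.join(tempfile.mkdtemp(), "data.yaml")

with open(path, "w", encoding="utf-8") as outfile:
    # allow_unicode keeps '€' readable in the file instead of escaping it.
    yaml.safe_dump(data, outfile, allow_unicode=True)

with open(path, encoding="utf-8") as stream:
    loaded = yaml.safe_load(stream)

print(loaded == data)
```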

Benjamin Loison · Accepted Answer · 2024-08-22 23:24:38Z

If you have YAML that conforms to the YAML 1.2 specification (released 2009) then you should use ruamel.yaml (disclaimer: I am the author of that package). It is essentially a superset of PyYAML, which supports most of YAML 1.1 (from 2005).

If you want to be able to preserve your comments when round-tripping, you certainly should use ruamel.yaml.

Upgrading @Jon's example is easy:

import ruamel.yaml as yaml

with open("example.yaml") as stream:
    try:
        print(yaml.safe_load(stream))
    except yaml.YAMLError as exc:
        print(exc)

Use safe_load() unless you really have full control over the input, need it (seldom the case) and know what you are doing.

If you are using pathlib Path for manipulating files, you are better off using the new API ruamel.yaml provides:

from ruamel.yaml import YAML
from pathlib import Path

path = Path('example.yaml')
yaml = YAML(typ='safe')
data = yaml.load(path)

Hello @Anthon. I was using ruamel.yaml but got an issue with documents that are not ASCII compliant (UnicodeDecodeError: 'ascii' codec can't decode byte 0xe7 in position 926: ordinal not in range(128)). I've tried to set yaml.encoding to utf-8, but that didn't work, as the load method in YAML still uses ascii_decode. Is this a bug?

Pal · Accepted Answer · 2020-04-18 21:16:25Z

66

First install pyyaml using pip3.

Then import yaml module and load the file into a dictionary called 'my_dict':

import yaml
with open('filename.yaml') as f:
    my_dict = yaml.safe_load(f)

That's all you need. Now the entire yaml file is in 'my_dict' dictionary.
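One caveat worth sketching: safe_load returns whatever the top-level node of the file is, which is not necessarily a dictionary. The helper name load_mapping below is hypothetical, introduced only for this example; it assumes PyYAML and simply checks the type before handing the result back.

```python
import yaml

def load_mapping(text):
    """Parse YAML text and insist that the top-level node is a mapping."""
    loaded = yaml.safe_load(text)
    if not isinstance(loaded, dict):
        raise TypeError(
            f"expected a mapping at the top level, got {type(loaded).__name__}"
        )
    return loaded

# A file containing only "- hello world" parses to a list, not a dict.
print(type(yaml.safe_load("- hello world")).__name__)
print(load_mapping("greeting: hello world"))
```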

edited Apr 18, 2020 at 21:16

answered Oct 30, 2017 at 18:27

Pal


3 Comments

Anthon Over a year ago

If your file contains the line "- hello world" it is inappropriate to call the variable my_dict, as it is going to contain a list. If that file contains specific tags (starting with !!python) it can also be unsafe (as in complete harddisc wiped clean) to use yaml.load(). As that is clearly documented you should have repeated that warning here (in almost all cases yaml.safe_load() can be used).

cowlinator Over a year ago

You use import yaml, but that isn't a built-in module, and you don't specify which package it is. Running import yaml on a fresh Python3 install results in ModuleNotFoundError: No module named 'yaml'

Hans Ginzel Over a year ago

See Munch, stackoverflow.com/questions/52570869/… import yaml; from munch import munchify; f = munchify(yaml.load(…)); print(fo.d.try)

rinkush sharda · Accepted Answer · 2020-10-28 18:21:30Z

18

To access any element of a nested mapping in a YAML file like this:

global:
  registry:
    url: dtr-:5000/
    repoPath:
  dbConnectionString: jdbc:oracle:thin:@x.x.x.x:1521:abcd

You can use the following Python script:

import yaml

with open("/some/path/to/yaml.file", 'r') as f:
    valuesYaml = yaml.load(f, Loader=yaml.FullLoader)

print(valuesYaml['global']['dbConnectionString'])
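A variant of the same lookup, sketched with safe_load instead of FullLoader and with the YAML inlined as a string so it runs without a file. Two details are worth noting: a key with no value (repoPath) comes back as None, and dict.get can supply a default for keys that are absent. The timeout key and its default of 30 are invented for the illustration.

```python
import yaml

text = """
global:
  registry:
    url: dtr-:5000/
    repoPath:
  dbConnectionString: jdbc:oracle:thin:@x.x.x.x:1521:abcd
"""

values = yaml.safe_load(text)
registry = values["global"]["registry"]

print(registry["url"])
# An empty value parses as None rather than an empty string.
print(registry["repoPath"] is None)
# "timeout" is not in the file, so the default is returned.
print(values["global"].get("timeout", 30))
```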

answered Oct 28, 2020 at 18:21

rinkush sharda


Benjamin Loison · Accepted Answer · 2024-08-22 23:23:39Z

13

Example:

defaults.yaml

url: https://www.google.com

environment.py

from ruamel import yaml

data = yaml.safe_load(open('defaults.yaml'))
data['url']

edited Aug 22, 2024 at 23:23

Benjamin Loison


answered May 20, 2018 at 7:41

Prashanth Sams


3 Comments

droid192 Over a year ago

Is it safe to not close the stream?

J Kluseczka Over a year ago

I thought it is, but is it? related: stackoverflow.com/questions/49512990/…

lucidyan Over a year ago

@qrtLs It is definitely not safe. You should explicitly close the descriptor every time, for several reasons: stackoverflow.com/a/25070939/3338479
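Following up on the comments about closing the stream, a with block takes care of it. The sketch below uses PyYAML rather than ruamel.yaml and writes defaults.yaml into a temporary directory first (an assumption made only so the example is self-contained).

```python
import os
import tempfile
import yaml

# Create the sample file in a throwaway directory.
path = os.path.join(tempfile.mkdtemp(), "defaults.yaml")
with open(path, "w", encoding="utf-8") as f:
    f.write("url: https://www.google.com\n")

# The with block closes the file even if parsing raises.
with open(path, encoding="utf-8") as stream:
    data = yaml.safe_load(stream)

print(stream.closed)
print(data["url"])
```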

Oleksandr Zelentsov · Accepted Answer · 2018-03-20 16:53:26Z

3

I use ruamel.yaml. ~~Details & debate here~~.

from ruamel import yaml

with open(filename, 'r') as fp:
    read_data = yaml.load(fp)

ruamel.yaml is compatible with older PyYAML usage (apart from a few easily solved problems). As stated in the link I provided, use

from ruamel import yaml

instead of

import yaml

and it will fix most of your problems.

EDIT: PyYAML is not dead as it turns out, it's just maintained in a different place.

edited Mar 20, 2018 at 16:53

answered Jan 22, 2018 at 13:54

Oleksandr Zelentsov


4 Comments

abalter Over a year ago

@Oleksander: PyYaml has commits in the last 7 months, and the most recent closed issue was 12 days ago. Can you please define "long dead?"

Oleksandr Zelentsov Over a year ago

@abalter I apologize, seems that I got the info from their official site or the post right here stackoverflow.com/a/36760452/5510526

abalter Over a year ago

@OleksandrZelentsov I can see the confusion. There was a loooong period when it was dead. github.com/yaml/pyyaml/graphs/contributors. However, their site IS up and shows releases posted AFTER the SO post referring to PyYaml's demise. So it is fair to say that at this point it is still alive, although its direction relative to ruamel is clearly uncertain. ALSO, there was a lengthy discussion here with recent posts. I added a comment, and now mine is the only one. I guess I don't understand how closed issues work. github.com/yaml/pyyaml/issues/145

anon Over a year ago

@abalter FWIW, when that answer was posted, there had been a total of 9 commits in the past... just under 7 years. One of those was an automated "fix" of bad grammar. Two involved releasing a barely-changed new version. The rest were relatively tiny tweaks, mostly made five years before the answer. All but the automated fix were done by one person. I wouldn't judge that answer harshly for calling PyYAML "long dead".

gil.fernandes · Accepted Answer · 2023-12-19 11:58:40Z

I would suggest using the pyyaml library together with the built-in pathlib library.

You will need to build a pathlib.Path object first:

from pathlib import Path
import yaml

path: Path = Path("/tmp/file.yaml")
# Make sure the path exists
assert path.exists()
# Read file and parse with pyyaml
dictionnaire = yaml.safe_load(path.read_text())

This works fine with modern versions of Python 3.

You do not need to use the open file method this way. The code is slightly more concise, and you can easily check whether the file exists before you actually try to parse the YAML file.
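As an alternative to the assert (which is skipped when Python runs with -O), the missing-file case can be handled by catching FileNotFoundError. This is only a sketch: load_yaml is a hypothetical helper name, the file is created in a temporary directory for the demo, and PyYAML is assumed to be installed.

```python
from pathlib import Path
import tempfile
import yaml

def load_yaml(path):
    """Return parsed YAML from path, or None when the file is missing."""
    try:
        text = Path(path).read_text(encoding="utf-8")
    except FileNotFoundError:
        return None
    return yaml.safe_load(text)

# Demo file in a throwaway directory.
tmp_dir = Path(tempfile.mkdtemp())
config_path = tmp_dir / "file.yaml"
config_path.write_text("name: demo\nports: [80, 443]\n", encoding="utf-8")

print(load_yaml(config_path))
print(load_yaml(tmp_dir / "missing.yaml"))
```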

score 0 · Accepted Answer · 2022-05-22 11:03:05Z

I made my own script for this. Feel free to use it, as long as you keep the attribution. The script can parse yaml from a file (function load), parse yaml from a string (function loads) and convert a dictionary into yaml (function dumps). It respects all variable types.

# © didlly AGPL-3.0 License - github.com/didlly

def is_float(string: str) -> bool:
    try:
        float(string)
        return True
    except ValueError:
        return False


def is_integer(string: str) -> bool:
    try:
        int(string)
        return True
    except ValueError:
        return False


def load(path: str) -> dict:
    with open(path, "r") as yaml:
        levels = []
        data = {}
        indentation_str = ""

        for line in yaml.readlines():
            if line.replace(line.lstrip(), "") != "" and indentation_str == "":
                indentation_str = line.replace(line.lstrip(), "").rstrip("\n")
            if line.strip() == "":
                continue
            elif line.rstrip()[-1] == ":":
                key = line.strip()[:-1]
                quoteless = (
                    is_float(key)
                    or is_integer(key)
                    or key == "True"
                    or key == "False"
                    or ("[" in key and "]" in key)
                )

                if len(line.replace(line.strip(), "")) // 2 < len(levels):
                    if quoteless:
                        levels[len(line.replace(line.strip(), "")) // 2] = f"[{key}]"
                    else:
                        levels[len(line.replace(line.strip(), "")) // 2] = f"['{key}']"
                else:
                    if quoteless:
                        levels.append(f"[{line.strip()[:-1]}]")
                    else:
                        levels.append(f"['{line.strip()[:-1]}']")
                if quoteless:
                    exec(
                        f"data{''.join(str(i) for i in levels[:line.replace(line.lstrip(), '').count(indentation_str) if indentation_str != '' else 0])}[{key}]"
                        + " = {}"
                    )
                else:
                    exec(
                        f"data{''.join(str(i) for i in levels[:line.replace(line.lstrip(), '').count(indentation_str) if indentation_str != '' else 0])}['{key}']"
                        + " = {}"
                    )

                continue

            key = line.split(":")[0].strip()
            value = ":".join(line.split(":")[1:]).strip()

            if (
                is_float(value)
                or is_integer(value)
                or value == "True"
                or value == "False"
                or ("[" in value and "]" in value)
            ):
                if (
                    is_float(key)
                    or is_integer(key)
                    or key == "True"
                    or key == "False"
                    or ("[" in key and "]" in key)
                ):
                    exec(
                        f"data{'' if line == line.strip() else ''.join(str(i) for i in levels[:line.replace(line.lstrip(), '').count(indentation_str) if indentation_str != '' else 0])}[{key}] = {value}"
                    )
                else:
                    exec(
                        f"data{'' if line == line.strip() else ''.join(str(i) for i in levels[:line.replace(line.lstrip(), '').count(indentation_str) if indentation_str != '' else 0])}['{key}'] = {value}"
                    )
            else:
                if (
                    is_float(key)
                    or is_integer(key)
                    or key == "True"
                    or key == "False"
                    or ("[" in key and "]" in key)
                ):
                    exec(
                        f"data{'' if line == line.strip() else ''.join(str(i) for i in levels[:line.replace(line.lstrip(), '').count(indentation_str) if indentation_str != '' else 0])}[{key}] = '{value}'"
                    )
                else:
                    exec(
                        f"data{'' if line == line.strip() else ''.join(str(i) for i in levels[:line.replace(line.lstrip(), '').count(indentation_str) if indentation_str != '' else 0])}['{key}'] = '{value}'"
                    )
    return data


def loads(yaml: str) -> dict:
    levels = []
    data = {}
    indentation_str = ""

    for line in yaml.split("\n"):
        if line.replace(line.lstrip(), "") != "" and indentation_str == "":
            indentation_str = line.replace(line.lstrip(), "")
        if line.strip() == "":
            continue
        elif line.rstrip()[-1] == ":":
            key = line.strip()[:-1]
            quoteless = (
                is_float(key)
                or is_integer(key)
                or key == "True"
                or key == "False"
                or ("[" in key and "]" in key)
            )

            if len(line.replace(line.strip(), "")) // 2 < len(levels):
                if quoteless:
                    levels[len(line.replace(line.strip(), "")) // 2] = f"[{key}]"
                else:
                    levels[len(line.replace(line.strip(), "")) // 2] = f"['{key}']"
            else:
                if quoteless:
                    levels.append(f"[{line.strip()[:-1]}]")
                else:
                    levels.append(f"['{line.strip()[:-1]}']")
            if quoteless:
                exec(
                    f"data{''.join(str(i) for i in levels[:line.replace(line.lstrip(), '').count(indentation_str) if indentation_str != '' else 0])}[{key}]"
                    + " = {}"
                )
            else:
                exec(
                    f"data{''.join(str(i) for i in levels[:line.replace(line.lstrip(), '').count(indentation_str) if indentation_str != '' else 0])}['{key}']"
                    + " = {}"
                )

            continue

        key = line.split(":")[0].strip()
        value = ":".join(line.split(":")[1:]).strip()

        if (
            is_float(value)
            or is_integer(value)
            or value == "True"
            or value == "False"
            or ("[" in value and "]" in value)
        ):
            if (
                is_float(key)
                or is_integer(key)
                or key == "True"
                or key == "False"
                or ("[" in key and "]" in key)
            ):
                exec(
                    f"data{'' if line == line.strip() else ''.join(str(i) for i in levels[:line.replace(line.lstrip(), '').count(indentation_str) if indentation_str != '' else 0])}[{key}] = {value}"
                )
            else:
                exec(
                    f"data{'' if line == line.strip() else ''.join(str(i) for i in levels[:line.replace(line.lstrip(), '').count(indentation_str) if indentation_str != '' else 0])}['{key}'] = {value}"
                )
        else:
            if (
                is_float(key)
                or is_integer(key)
                or key == "True"
                or key == "False"
                or ("[" in key and "]" in key)
            ):
                exec(
                    f"data{'' if line == line.strip() else ''.join(str(i) for i in levels[:line.replace(line.lstrip(), '').count(indentation_str) if indentation_str != '' else 0])}[{key}] = '{value}'"
                )
            else:
                exec(
                    f"data{'' if line == line.strip() else ''.join(str(i) for i in levels[:line.replace(line.lstrip(), '').count(indentation_str) if indentation_str != '' else 0])}['{key}'] = '{value}'"
                )

    return data


def dumps(yaml: dict, indent="") -> str:
    """A procedure which converts the dictionary passed to the procedure into its yaml equivalent.

    Args:
        yaml (dict): The dictionary to be converted.

    Returns:
        data (str): The dictionary in yaml form.
    """

    data = ""

    for key in yaml.keys():
        if type(yaml[key]) == dict:
            data += f"\n{indent}{key}:\n"
            data += dumps(yaml[key], f"{indent}  ")
        else:
            data += f"{indent}{key}: {yaml[key]}\n"

    return data


print(load("config.yml"))

Example

`config.yml`

level 0 value: 0

level 1:
  level 1 value: 1
  level 2:
    level 2 value: 2

level 1 2:
  level 1 2 value: 1 2
  level 2 2:
    level 2 2 value: 2 2

Output

{'level 0 value': 0, 'level 1': {'level 1 value': 1, 'level 2': {'level 2 value': 2}}, 'level 1 2': {'level 1 2 value': '1 2', 'level 2 2': {'level 2 2 value': 2 2}}}

It's so cool! But it does not work with lists like one:\n - two\n - three
@user16779014 the github seems to not work anymore, so I can't report a bug for parsing lists. see gist.github.com/Lenormju/7b2e887693e7c1e001ba666a96bfcd25
Why reinvent the wheel? Why use a wheel that someone reinvented?
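For comparison, the list case mentioned in the comments above is handled by PyYAML out of the box. A minimal sketch, assuming PyYAML is installed:

```python
import yaml

text = "one:\n - two\n - three\n"

parsed = yaml.safe_load(text)
print(parsed)

# Dumping and re-loading gives back the same structure.
round_tripped = yaml.safe_load(yaml.safe_dump(parsed))
print(round_tripped == parsed)
```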

zkurtz · Accepted Answer · 2025-01-30 22:40:13Z

0

IO is simpler with dummio, a package I created. Just pip install dummio. Then

import dummio

# read
data = dummio.yaml.load(filepath)

# write
dummio.yaml.save(data, filepath=filepath)

Note that this works even if filepath is a cloud path (s3, gcs, azure). The package supports many other data types and file formats, not only dict/yaml.

answered Jan 30 at 22:40

zkurtz

