I have a YAML file like the following:

- workload:
    name: cloud1
    param:
      p1: v1
      p2: v2

- workload:
    name: cloud2
    param:
      p1: v1
      p2: v2

I can parse the file using the following Python script:

#!/usr/bin/env python

import yaml

try:
    # safe_load avoids constructing arbitrary Python objects from the file
    with open('workload.yaml') as f:
        for key, value in yaml.safe_load(f)['workload'].items():
            print(key, value)
except yaml.YAMLError as out:
    print(out)

Output:

name cloud1
param {'p1': 'v1'}

But what I'm looking for is something like:

workload1 = cloud1
workload1_param_p1 = v1
workload1_param_p2 = v2

workload2 = cloud2
workload2_param_p1 = v1
workload2_param_p2 = v2
  • Isn't that Yaml a tad incorrect? Shouldn't workload be a list? Commented Nov 2, 2016 at 13:17
  • Something like workloads: - workload: cloud1 param: p1: v1 p2: v2 - workload: cloud2 param: p1: v1 p2: v2 Commented Nov 2, 2016 at 13:19
  • you are right, tnx. I'll correct it. Commented Nov 2, 2016 at 13:21
  • I think this question is mislabeled, as it has nothing to do with YAML. You have a dictionary and want that data in some other form. Whether the dictionary came from parsing a YAML file or from anything else entirely isn't relevant. Commented Nov 2, 2016 at 15:31
  • How about simply using the pprint.pprint function? Commented Nov 6, 2016 at 10:18
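
For reference, the pprint suggestion in the last comment might look like the sketch below. The data literal is an assumption standing in for whatever the YAML loader returns for the corrected file. Note that this shows the nested structure as-is rather than the flat "name = value" lines the question asks for.

```python
from pprint import pformat, pprint

# Assumed stand-in for the parsed YAML: a list of single-key mappings.
data = [
    {"workload": {"name": "cloud1", "param": {"p1": "v1", "p2": "v2"}}},
    {"workload": {"name": "cloud2", "param": {"p1": "v1", "p2": "v2"}}},
]

pprint(data)          # writes a readable rendering of the nested structure
text = pformat(data)  # the same rendering, returned as a string
```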

1 Answer

Your output doesn't match your input: the top level of your YAML file is a sequence, which loads as a Python list.
It is also not entirely clear where the workload and especially the 1 in workload1 come from. In the following I have assumed that they come from the key of the mapping that makes up each sequence element and from the position of that element in the sequence (starting at 1, hence the idx+1), respectively. The name is popped from a copy of the values so that the rest can be dumped recursively:

from __future__ import print_function  # print(..., file=out) also works on Python 2

import sys
import ruamel.yaml

yaml_str = """\
- workload:
    name: cloud1
    param:
      p1: v1
      p2: v2

- workload:
    name: cloud2
    param:
      p1: v1
      p2: v2
"""

data = ruamel.yaml.round_trip_load(yaml_str)

def dump(prefix, d, out):
    if isinstance(d, dict):
        for k in d:
            dump(prefix[:] + [k], d[k], out)
    else:
        print('_'.join(prefix), '=', d, file=out)

for idx, workload in enumerate(data):
    for workload_key in workload:
        values = workload[workload_key].copy()
        # alternatively extract from values['name']
        workload_name = '{}{}'.format(workload_key, idx+1)
        print(workload_name, '=', values.pop('name'))
        dump([workload_name], values, sys.stdout)
    print()

gives:

workload1 = cloud1
workload1_param_p1 = v1
workload1_param_p2 = v2

workload2 = cloud2
workload2_param_p1 = v1
workload2_param_p2 = v2

This was done using ruamel.yaml, a YAML 1.2 parser, of which I am the author. If you only have YAML 1.1 input (as supported by PyYAML) you should still use ruamel.yaml, as its round_trip_load guarantees that your workload_param_p1 is printed before workload_param_p2 (with PyYAML that is not guaranteed).
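
Since the flattening step itself does not depend on the YAML library, it can also be sketched with the standard library only, assuming the file has already been loaded into plain lists and dicts. The names flatten and workload_lines are hypothetical helpers introduced for this illustration, and the data literal stands in for the loaded file. The key order shown relies on dicts keeping insertion order (Python 3.7+).

```python
def flatten(prefix, node):
    """Yield (name, value) pairs, joining nested dict keys with '_'."""
    if isinstance(node, dict):
        for key, value in node.items():
            for pair in flatten(prefix + [str(key)], value):
                yield pair
    else:
        yield "_".join(prefix), node


def workload_lines(data):
    """Build 'name = value' lines for a list of single-key mappings."""
    lines = []
    for idx, item in enumerate(data, start=1):
        for key, values in item.items():
            label = "{}{}".format(key, idx)  # e.g. workload1
            rest = dict(values)              # shallow copy, so pop() leaves data intact
            lines.append("{} = {}".format(label, rest.pop("name")))
            lines.extend("{} = {}".format(n, v) for n, v in flatten([label], rest))
    return lines


# Assumed stand-in for the loaded YAML document.
data = [
    {"workload": {"name": "cloud1", "param": {"p1": "v1", "p2": "v2"}}},
    {"workload": {"name": "cloud2", "param": {"p1": "v1", "p2": "v2"}}},
]

for line in workload_lines(data):
    print(line)
```

Returning the lines instead of printing them directly keeps the helper easy to reuse, for example to write them to a file.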

1 Comment

For the last part: that's a dictionary, and the whole point is that you normally don't care what order the keys are in. For what it's worth, it is entirely possible to have PyYAML produce OrderedDicts; it takes about 3 lines of code. Note: I have nothing against your parser, it's a great piece of software; it's just that this passage is both unfair and unrelated to the question.
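
One possible way to do what this comment describes is sketched below. It is not necessarily the three lines the commenter had in mind, and the names OrderedLoader and construct_ordered_mapping are hypothetical. The sketch assumes PyYAML is installed and registers a constructor for the default mapping tag so that every mapping loads as an OrderedDict.

```python
from collections import OrderedDict

import yaml


class OrderedLoader(yaml.SafeLoader):
    """SafeLoader subclass (hypothetical name) that builds OrderedDicts."""


def construct_ordered_mapping(loader, node):
    # Resolve any '<<' merge keys, then keep the pairs in document order.
    loader.flatten_mapping(node)
    return OrderedDict(loader.construct_pairs(node))


OrderedLoader.add_constructor(
    yaml.resolver.BaseResolver.DEFAULT_MAPPING_TAG,
    construct_ordered_mapping,
)

yaml_str = """\
- workload:
    name: cloud1
    param:
      p1: v1
      p2: v2
"""

data = yaml.load(yaml_str, Loader=OrderedLoader)
for key, value in data[0]["workload"]["param"].items():
    print(key, value)
```

On Python 3.7 and later, plain dicts also keep insertion order, so the explicit OrderedDict matters mostly on older interpreters.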
