I tried to use the following Python code to parse a sample file (sample.txt), but the result is not what I expected.

sample:

# Summary Report #######################

System time | 2020-02-27 15:35:32 UTC (local TZ: UTC +0000)
# Instances ##################################################
  Port  Data Directory             Nice OOM Socket
  ===== ========================== ==== === ======
                                   0    0
# Configuration File #########################################
              Config File | /etc/srv.cnf
[mysqld]
server_id            = 1
port                                = 3016
tmpdir                              = /tmp
performance_schema_instrument       = '%=on'
innodb_monitor_enable               = 'module_adaptive_hash'
innodb_monitor_enable               = 'module_buffer'

[client]
port                                = 3016

# management library ##################################
jemalloc is not enabled in mysql config for process with id 2425
# The End ####################################################

code.py

import json
import re

all_lines = open('sample.txt', 'r').readlines()

final_dict = {}
regex = r"^([a-zA-Z]+)(.)+="

config = 0 # not yet found config
for line in all_lines:
    if '[mysqld]' in line:
        final_dict['mysqld'] = {}
        config = 1
        continue
    if '[client]' in line:
        final_dict['client'] = {}
        config = 2
        continue

    if config == 1 and re.search(regex, line):
        try:
            clean_line = line.strip() # get rid of empty space
            k = clean_line.split('=')[0].rstrip() # get the key
            v = clean_line.split('=')[1].lstrip()
            final_dict['mysqld'][k] = v
        except Exception as e:
            print(clean_line, e)

    if config == 2 and re.search(regex, line):
        try:
            clean_line = line.strip() # get rid of empty space
            k = clean_line.split('=')[0].rstrip() # get the key
            v = clean_line.split('=')[1].lstrip()
            final_dict['client'][k] = v
        except Exception as e:
            print(clean_line, e)

print(final_dict)
print(json.dumps(final_dict, indent=4))

with open('my.json', 'w') as f:
    json.dump(final_dict, f, sort_keys=True)

The unexpected result:

{ "client": { "port": "3016" }, "mysqld": { "performance_schema_instrument": "'%", "server_id": "1", "innodb_monitor_enable": "'module_buffer'", "port": "3016", "tmpdir": "/tmp" } }

The expected result:

{
    "client": {
        "port": "3016"
    }, 
    "mysqld": {
        "performance_schema_instrument": "'%=on'", 
        "server_id": "1", 
        "innodb_monitor_enable": "'module_buffer','module_adaptive_hash'", 
        "port": "3016", 
        "tmpdir": "/tmp"
    }
}

Is it possible to achieve the above result?

  • What did you miss? The only difference I'm spotting is in performance_schema_instrument . Is that the problem? Commented May 11, 2020 at 16:12
  • It looks like you're just looking to indent the JSON file in a more human-readable format. You almost have it: you already pass indent=4 to json.dumps when printing, so pass it to json.dump as well when writing the file. Commented May 11, 2020 at 16:16
  • That looks like a configuration file. Have you looked at the configparser library? It parses configuration files for you. Commented May 11, 2020 at 16:17
  • After parsing, the value of performance_schema_instrument should be "'%=on'", not "'%". Thanks. Commented May 12, 2020 at 2:30
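The truncated value follows from how the question's code splits the line: `split('=')` cuts at every `=`, and index `[1]` keeps only the piece between the first and second one. A minimal sketch of the difference, using `str.partition` as one possible fix (the variable names here are illustrative, not from the original code):

```python
line = "performance_schema_instrument       = '%=on'"

# Approach from the question: split on every '=' and keep only the second piece.
truncated = line.split('=')[1].lstrip()

# Split on the first '=' only, so any later '=' stays inside the value.
key, _, value = line.partition('=')
key, value = key.strip(), value.strip()

print(truncated)  # '%
print(value)      # '%=on'
```

`line.split('=', 1)` would behave the same way as `partition` here; either keeps everything after the first `=` together.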

1 Answer 1

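The configparser module in Python's standard library is meant for reading configuration files like this one. Extract the configuration block from the report first, then let configparser parse it: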

import configparser
import json
import re

# Capture everything from the first section header up to the next report heading.
regex_string = r'# Configuration File #.*?\n(\[.*?)# management library #'
with open('temp') as f:
    configuration_string = re.findall(regex_string, f.read(), re.DOTALL)[0]

# strict=False tolerates duplicate options; the last value wins.
c = configparser.RawConfigParser(strict=False)
c.read_string(configuration_string)

settings = {k: dict(v) for k, v in c.items() if k != 'DEFAULT'}
with open('temp.json', 'w') as f:
    json.dump(settings, f, sort_keys=True, indent=4)
6 Comments

Is the configparser module supported only in Python 3, and not in Python 2.7.5? Also, I would like to parse multiple values into the key innodb_monitor_enable. Is that possible?
I tested this on 3.7.
Is it possible to produce the following output? "innodb_monitor_enable": ['module_adaptive_hash', 'module_buffer']
That really isn't the intention of configparser. Duplicate options are normally frowned upon, because logically they make no sense. The proper way to override or layer settings is to use multiple config files, such as a user config file, a global config file, and a system config file. Configparser will read all of these together following the rules you set, so you can override what your users set, or let your users override what you set, and so on.
Obviously, you can put multiple values in a single option inside your configuration file, like innodb_monitor_enable = 'module_adaptive_hash','module_buffer', provided you are allowed to edit the configuration file.
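If the file cannot be edited and the repeated values are still wanted as a list, one option is to skip configparser and collect the lines by hand. The following is only a sketch under assumptions: the helper name `parse_sections` is hypothetical, section headers are assumed to look like `[name]`, and any line starting with `#` is assumed to be a report heading that ends the current section.

```python
import json
import re


def parse_sections(text):
    """Parse '[section]' and 'key = value' lines, keeping repeated keys as lists."""
    result = {}
    section = None  # dict of the section currently being filled, if any
    for raw in text.splitlines():
        line = raw.strip()
        header = re.fullmatch(r"\[(\w+)\]", line)
        if header:
            section = result.setdefault(header.group(1), {})
            continue
        if line.startswith("#"):
            section = None  # assumed: a report heading ends the current section
            continue
        if section is None or "=" not in line:
            continue
        # Split on the first '=' only so values such as '%=on' stay intact.
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip()
        if key not in section:
            section[key] = value
        elif isinstance(section[key], list):
            section[key].append(value)
        else:
            # Second occurrence of a key: promote the value to a list.
            section[key] = [section[key], value]
    return result


sample = """\
# Configuration File #########################################
              Config File | /etc/srv.cnf
[mysqld]
server_id            = 1
performance_schema_instrument       = '%=on'
innodb_monitor_enable               = 'module_adaptive_hash'
innodb_monitor_enable               = 'module_buffer'

[client]
port                                = 3016

# management library ##################################
jemalloc is not enabled in mysql config for process with id 2425
"""

settings = parse_sections(sample)
print(json.dumps(settings, indent=4, sort_keys=True))
```

The values keep the single quotes exactly as they appear in the file; stripping them with `value.strip("'")` would be a further choice that depends on how the JSON is consumed.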