I have a CSV file from which I create a list:

import csv

with open('old_id_new_id.csv', newline='') as csvfile:
    reader = csv.DictReader(csvfile, delimiter=',')
    result = [[row['oldid'],row['newid']] for row in reader]
    print(result)

This result list contains several elements like this:

result = [['e000001_kuttenberger_religionsfrieden_tschech', 'pa000001-0020'], 
          ['e000001_kuttenberger_religionsfrieden_dt', 'pa000001-0021']]

I have an XML file of the following structure:

<struct label="Kuttenberger Religionsfrieden (1485)" order="2">
    <view file="e000001_kuttenberger_religionsfrieden_einleitung" label="Einleitung"/>
    <view file="e000001_kuttenberger_religionsfrieden_tschech" label="Quellentext"/>
    <view file="e000001_kuttenberger_religionsfrieden_dt" label="Deutsche Übersetzung"/>
</struct>

How do I open this file and replace the string result[0][0] with result[0][1]?

Simply put, the following doesn't work:

    with open('struct.xml', 'rb') as file:
        for line in file:
            if str(result[0][0]) in line:
                line.replace(str(result[0][0]), str(result[0][1]))

Any hints?

1 Answer

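You could build a dictionary that maps each search term to its replacement, and a regex alternation of all the search terms. Then apply re.sub to each line with that alternation, and in a callback look up each match in the dictionary to find its replacement. Note that the file has to be opened in text mode: a str pattern cannot be applied to the bytes lines that 'rb' yields. Also, neither str.replace nor re.sub changes a string in place, so the returned value has to be kept.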

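import re

result = (['e000001_kuttenberger_religionsfrieden_tschech', 'pa000001-0020'], ['e000001_kuttenberger_religionsfrieden_dt', 'pa000001-0021'])
terms = dict(result)
# Escape each term so it is matched literally, then join the terms into one alternation.
regex = r'\b(?:' + '|'.join(re.escape(old) for old in terms) + r')\b'

# Text mode with an explicit encoding, so each line is a str like the pattern.
with open('struct.xml', encoding='utf-8') as file:
    for line in file:
        # re.sub returns a new string; write or collect `line` afterwards as needed.
        line = re.sub(regex, lambda m: terms[m.group()], line)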
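
The loop above computes each replaced line but does not save it anywhere. A minimal sketch of the full round trip, reading the file, replacing the IDs and writing the result out, might look like the following. The helper names build_replacer and rewrite_file are made up for illustration, and UTF-8 is an assumption about how struct.xml is encoded.

```python
import re


def build_replacer(pairs):
    """Return a function that swaps every old ID in a string for its new ID."""
    terms = dict(pairs)
    # re.escape makes each ID match literally; \b keeps a short ID from
    # matching inside a longer identifier.
    pattern = re.compile(r'\b(?:' + '|'.join(map(re.escape, terms)) + r')\b')
    return lambda text: pattern.sub(lambda m: terms[m.group()], text)


def rewrite_file(src, dst, pairs, encoding='utf-8'):
    """Read src, replace the IDs, and write the result to dst."""
    replace = build_replacer(pairs)
    with open(src, encoding=encoding) as infile:
        text = infile.read()
    with open(dst, 'w', encoding=encoding) as outfile:
        outfile.write(replace(text))


result = [['e000001_kuttenberger_religionsfrieden_tschech', 'pa000001-0020'],
          ['e000001_kuttenberger_religionsfrieden_dt', 'pa000001-0021']]
replace = build_replacer(result)
sample = '<view file="e000001_kuttenberger_religionsfrieden_dt" label="Deutsche Übersetzung"/>'
print(replace(sample))
```

Calling rewrite_file('struct.xml', 'struct_new.xml', result) would then write the updated copy to a second file (the output name is only an example), which leaves the original untouched in case a replacement goes wrong.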