1

I am starting to play around with Python but have hit a wall with XML.

I am trying to edit an XML sub-element that is stored in a somewhat unconventional way: it holds many numbers as text, more like a vector, with every child element named 'double' even though the content is actually text. That is not how I was expecting the XML to be.

This is an example of such a file:

<simulation>
         <element1>'A'</element1>
         <element2>
              <subelement1>
                   <double>1</double>
                   <double>2</double>
                   <double>3</double>
                   <double>4</double>
                   <double>5</double>
              </subelement1>
              <subelement2>
                   <double>1</double>
                   <double>2</double>
                   <double>3</double>
                   <double>4</double>
                   <double>5</double>
              </subelement2>
         </element2>
</simulation>

What I want to do is change the values of all child nodes of subelement1 to, let's say, 10, 20, 30, 40, 50, so that I end up with something like this:

<simulation>
         <element1>'A'</element1>
         <element2>
              <subelement1>
                   <double>10</double>
                   <double>20</double>
                   <double>30</double>
                   <double>40</double>
                   <double>50</double>
              </subelement1>
              <subelement2>
                   <double>1</double>
                   <double>2</double>
                   <double>3</double>
                   <double>4</double>
                   <double>5</double>
              </subelement2>
         </element2>
</simulation>

I can access all the nodes that I want to change with this:

import xml.etree.ElementTree as ET

for elem in root:
    for subelem in elem.findall('.//element1/double'):
        print(subelem.attrib)
        print(subelem.text)

This shows the numbers I want to change (see below), but I could not find a way to actually change them to the ones I need.

{} 1    {} 2    {} 3    {} 4    {} 5

If I try to use it as a vector or something like this:

for elem in root:
    for subelem in elem.findall('.//element1/double'):
        subelem.text = [10,20,30,40,50]
        print(subelem.text)

I end up adding information rather than substituting it, and the results are:

{} 1 [10,20,30,40,50]
{} 2 [10,20,30,40,50]
{} 3 [10,20,30,40,50]
{} 4 [10,20,30,40,50]
{} 5 [10,20,30,40,50]

What would be a way to make the changes? Thank you very much.

1
  • 1
    I needed to tweak your example (it's "subelement1") and it printed [10,20,30,40,50] as expected, not {} 1 [10,20,30,40,50]. I think the code posted is a bit different from the code that generated the example. Commented Feb 10, 2020 at 6:55

2 Answers

2

Assigning to an element's text attribute replaces the value; it doesn't append. There must have been something wrong in your test code.

You need to make sure you assign a string. ET will accept a number, a list, or really any object as text, but serialization will fail later, typically with a TypeError. Also, there is no need to loop over the first level of elements before calling findall; .// tells it to search the entire subtree.

    import xml.etree.ElementTree as ET
    
    xmltext = """<simulation>
             <element1>'A'</element1>
             <element2>
                  <subelement1>
                       <double>1</double>
                       <double>2</double>
                       <double>3</double>
                       <double>4</double>
                       <double>5</double>
                  </subelement1>
                  <subelement2>
                       <double>1</double>
                       <double>2</double>
                       <double>3</double>
                       <double>4</double>
                       <double>5</double>
                  </subelement2>
             </element2>
    </simulation>"""
    
    root = ET.fromstring(xmltext)
   
    # to apply a function to each text node
    #for subelem in root.findall('.//subelement1/double'):
    #    subelem.text = str(int(subelem.text) * 10)

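    # to replace a known number of text nodes
    new_doubles = [10, 20, 30, 40, 50]
    # find() returns only the first <subelement1> that has <double> children
    subelem = root.find('.//subelement1[double]')
    if subelem is not None:
        for elem, dbl in zip(subelem.findall('double'), new_doubles):
            elem.text = str(dbl)  # text must be a string to serialize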

    print(ET.tostring(root, encoding="utf-8").decode('utf-8'))

Prints

    <simulation>
             <element1>'A'</element1>
             <element2>
                  <subelement1>
                       <double>10</double>
                       <double>20</double>
                       <double>30</double>
                       <double>40</double>
                       <double>50</double>
                  </subelement1>
                  <subelement2>
                       <double>1</double>
                       <double>2</double>
                       <double>3</double>
                       <double>4</double>
                       <double>5</double>
                  </subelement2>
             </element2>
    </simulation>
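
To illustrate the point about assigning strings, here is a minimal sketch. The one-element document is made up for the demonstration and is not the asker's file. Assigning an int to text is accepted without complaint, and the failure only shows up when the tree is serialized.

```python
import xml.etree.ElementTree as ET

# Hypothetical one-element document, used only for this demonstration.
root = ET.fromstring("<root><double>1</double></root>")
node = root.find("double")

# Assigning a non-string is accepted silently...
node.text = 10

# ...but serializing the tree afterwards fails.
try:
    ET.tostring(root, encoding="unicode")
    serialized_ok = True
except TypeError as exc:
    serialized_ok = False
    print("serialization failed:", exc)

# Converting to str before assigning avoids the problem.
node.text = str(10)
result = ET.tostring(root, encoding="unicode")
print(result)
```

The same applies to floats: convert with str() (or a format string, if a fixed number of decimals is wanted) before assigning.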

2 Comments

Thank you @tdelaney, but I expressed myself badly. I don't want to multiply everything by 10; I want to change the values themselves, e.g. changing 1 to 11, 2 to 0, 3 to 0.1, 4 to 40, and 5 to 10.5. I've used tree = ET.parse('my_big_file') and root = tree.getroot() to get the file and root.
@Jonathan - I updated the answer to use a known list. I also fiddled with the xpath so that you only get "double" elements under a single "subelement1" node and stop once you've processed the first set.
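
For the file-based case mentioned in the comments, a minimal sketch might look like the following. The file name my_big_file.xml, the helper replace_doubles, and the temporary directory are assumptions made so the example runs on its own; the replacement values are the ones from the comment.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def replace_doubles(path, parent_tag, new_values):
    """Overwrite the <double> children of the first <parent_tag> in the file.

    Hypothetical helper for illustration; it is not part of ElementTree.
    """
    tree = ET.parse(path)
    root = tree.getroot()
    parent = root.find(f".//{parent_tag}")
    if parent is None:
        raise ValueError(f"no <{parent_tag}> element found")
    for elem, value in zip(parent.findall("double"), new_values):
        elem.text = str(value)  # text must be a string to serialize
    tree.write(path)  # write the modified tree back to the same file
    return tree


# Small stand-in for the real file (assumed structure, shortened).
sample = """<simulation>
  <element2>
    <subelement1>
      <double>1</double><double>2</double><double>3</double><double>4</double><double>5</double>
    </subelement1>
    <subelement2>
      <double>1</double>
    </subelement2>
  </element2>
</simulation>"""

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "my_big_file.xml")
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(sample)

    replace_doubles(path, "subelement1", [11, 0, 0.1, 40, 10.5])

    # Re-read the file to confirm the change was saved.
    reloaded = ET.parse(path).getroot()
    updated = [d.text for d in reloaded.findall(".//subelement1/double")]
    untouched = [d.text for d in reloaded.findall(".//subelement2/double")]

print(updated)
print(untouched)
```

Only the children of subelement1 are rewritten; subelement2 is left as it was.
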
0

Here is a solution with iter() and pop():

from lxml import etree

xml_string = """<simulation>
             <element1>'A'</element1>
             <element2>
                  <subelement1>
                       <double>1</double>
                       <double>2</double>
                       <double>3</double>
                       <double>4</double>
                       <double>5</double>
                  </subelement1>
                  <subelement2>
                       <double>1</double>
                       <double>2</double>
                       <double>3</double>
                       <double>4</double>
                       <double>5</double>
                  </subelement2>
             </element2>
    </simulation>"""

root = etree.fromstring(xml_string)

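xpath_expr = '//subelement1/double'
new_value = ["10", "20", "30", "40", "50"]

# make the changes: evaluate the XPath once, then walk the tree in
# document order and give each matching element the next value

targets = root.xpath(xpath_expr)
for elem in root.iter():
    if elem in targets:
        elem.text = new_value.pop(0)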

etree.indent(root, space='  ')
etree.dump(root)

Output:

<simulation>
  <element1>'A'</element1>
  <element2>
    <subelement1>
      <double>10</double>
      <double>20</double>
      <double>30</double>
      <double>40</double>
      <double>50</double>
    </subelement1>
    <subelement2>
      <double>1</double>
      <double>2</double>
      <double>3</double>
      <double>4</double>
      <double>5</double>
    </subelement2>
  </element2>
</simulation>
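
One thing to watch with either approach is a mismatch between the number of matching elements and the number of new values: zip() silently stops at the shorter sequence, while pop(0) on an empty list raises IndexError. A hedged sketch that checks the counts first, using the standard library; set_texts is a hypothetical helper name and the shortened document is made up for the example.

```python
import xml.etree.ElementTree as ET


def set_texts(root, path, values):
    """Replace the text of every element matching path, in document order.

    Hypothetical helper: raises ValueError when the counts differ instead
    of updating only part of the elements.
    """
    targets = root.findall(path)
    if len(targets) != len(values):
        raise ValueError(
            f"{len(targets)} elements match {path!r}, "
            f"but {len(values)} values were given"
        )
    for elem, value in zip(targets, values):
        elem.text = str(value)


# Shortened stand-in for the document in the question.
root = ET.fromstring(
    "<simulation><element2>"
    "<subelement1><double>1</double><double>2</double><double>3</double></subelement1>"
    "<subelement2><double>1</double><double>2</double><double>3</double></subelement2>"
    "</element2></simulation>"
)

set_texts(root, ".//subelement1/double", [10, 20, 30])
first = [d.text for d in root.findall(".//subelement1/double")]
second = [d.text for d in root.findall(".//subelement2/double")]

# Passing too few values is reported instead of being applied partially.
try:
    set_texts(root, ".//subelement1/double", [1, 2])
    mismatch_caught = False
except ValueError:
    mismatch_caught = True
```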
