I would like to add a custom subelement to an XML document that is generated by my script.

The top element is AAA:

top = Element('AAA')

The collected_lines list looks like this:

[['TY', ' RPRT'], ['A1', ' Peter'], ['T3', ' Something'], ['ER', ' ']]

Then I enumerate all lines one-by-one and create a SubElement for top:

for line in enumerate(collected_lines):
 child = SubElement(top, line[0])
 child.text = line[1]

Output:

<?xml version="1.0" ?>
<AAA>
  <TY> RPRT</TY>
  <A1> Peter</A1>
  <T3> Something</T3>
  <ER> </ER>
  <TY> RPRT2</TY>
  <A1> Peter</A1>
  <T3> Something2</T3>
  <ER> </ER>
  <TY> RPRT2</TY>
  <A1> Peter</A1>
  <T3> Something2</T3>
  <ER> </ER>
</AAA>

I would like to wrap each TY...ER block in an <ART> element under the top element and then print the XML like this:

<?xml version="1.0" ?>
<AAA>
  <ART>
   <TY> RPRT</TY>
   <A1> Peter</A1>
   <T3> Something</T3>
   <ER> </ER>
  </ART>
  <ART>
   <TY> RPRT2</TY>
   <A1> Peter</A1>
   <T3> Something2</T3>
   <ER> </ER>
  </ART>
  <ART>
   <TY> RPRT2</TY>
   <A1> Peter</A1>
   <T3> Something2</T3>
  </ART>
</AAA>

I tried to do it with an if statement, like:

if "TY" in line:
 "append somehow before TY element, <ART>"
if "ER" in line:
 "append somehow after ER element, </ART>"

Is there a simple way to solve this?

1 Answer

Just reassign the top element and use insert:

import xml.etree.ElementTree as ET

top = ET.Element('AAA')
# enumerate yields (index, item) pairs, so unpack both
for i, line in enumerate(collected_lines):
    child = ET.SubElement(top, line[0])
    child.text = line[1]

art = top
art.tag = 'ART'
top = ET.Element('AAA')
top.insert(1, art)

ET.tostring(top)
'<AAA><ART><TY> RPRT</TY><A1> Peter</A1><T3> Something</T3><ER> </ER></ART></AAA>'
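
A slightly more direct variant, offered as a minimal sketch rather than the only way to do it: create ART as a child of AAA first and attach the fields to it, so nothing has to be retagged afterwards. It assumes Python 3 and the same collected_lines list as in the question; passing encoding='unicode' makes tostring return a str instead of bytes.

```python
import xml.etree.ElementTree as ET

collected_lines = [['TY', ' RPRT'], ['A1', ' Peter'], ['T3', ' Something'], ['ER', ' ']]

top = ET.Element('AAA')
# create the wrapper first, then hang every field off it
art = ET.SubElement(top, 'ART')
for tag, text in collected_lines:
    child = ET.SubElement(art, tag)
    child.text = text

xml_text = ET.tostring(top, encoding='unicode')
print(xml_text)
```

This yields the same single-record structure as the retagging approach above.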

As @twasbrillig pointed out, you don't even need enumerate; a plain for loop will do:

...
for line in collected_lines:
    child = ET.SubElement(top, line[0])
    child.text = line[1]
...
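
If collected_lines holds several records back to back, the same loop can open a new ART at each TY line and close it at each ER line. The following is a sketch under assumptions: Python 3, a hypothetical helper name build_tree, and made-up sample data in the same shape as the question's list.

```python
import xml.etree.ElementTree as ET

def build_tree(collected_lines):
    """Wrap each TY..ER run of [tag, text] pairs in its own <ART> element."""
    top = ET.Element('AAA')
    art = None
    for tag, text in collected_lines:
        if tag == 'TY' or art is None:
            # a TY line starts a new record (this also covers input not starting with TY)
            art = ET.SubElement(top, 'ART')
        child = ET.SubElement(art, tag)
        child.text = text
        if tag == 'ER':
            # ER closes the record; the next line opens a fresh <ART>
            art = None
    return top

# hypothetical sample data, two records
lines = [['TY', ' RPRT'], ['A1', ' Peter'], ['T3', ' Something'], ['ER', ' '],
         ['TY', ' RPRT2'], ['A1', ' Peter'], ['T3', ' Something2'], ['ER', ' ']]
top = build_tree(lines)
print(ET.tostring(top, encoding='unicode'))
```

Because the grouping happens while the tree is built, no second pass over the finished XML is needed.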

Another update

The question was edited to also ask how to handle multiple sections, as in the example output above. This can be done with ordinary Python logic:

import xml.etree.ElementTree as ET

s = '''<?xml version="1.0" ?>
<AAA>
  <TY> RPRT</TY>
  <A1> Peter</A1>
  <T3> Something</T3>
  <ER> </ER>
  <TY> RPRT2</TY>
  <A1> Peter</A1>
  <T3> Something2</T3>
  <ER> </ER>
  <TY> RPRT2</TY>
  <A1> Peter</A1>
  <T3> Something3</T3>
  <ER> </ER>
</AAA>'''

top = ET.fromstring(s)
# assign a new Element to replace top later on
new_top = ET.Element('AAA')
# get all indexes where TY, ER are at
ty = [i for i,n in enumerate(top) if n.tag == 'TY']
er = [i for i,n in enumerate(top) if n.tag == 'ER']
# top[x:y] slices the siblings from each TY up to, but not including, the matching ER
nodes = [top[x:y] for x,y in zip(ty,er)]

# for each group, add an ART SubElement to new_top,
# then move the group's elements into that ART
for node in nodes:
    art = ET.SubElement(new_top, 'ART')
    for each in node:
        # insert(1, ...) places each later element directly after TY;
        # use art.append(each) instead to keep the original order
        art.insert(1, each)
# replace top Element by new_top
top = new_top

# you don't need lxml, I just used it to pretty-print the XML
from lxml import etree
# you can just use ET.tostring(top) instead
print(etree.tostring(etree.fromstring(ET.tostring(top)),
                     xml_declaration=True, encoding='utf-8', pretty_print=True).decode('utf-8'))

Output:

<?xml version='1.0' encoding='utf-8'?>
<AAA>
  <ART><TY> RPRT</TY>
  <T3> Something</T3>
  <A1> Peter</A1>
  </ART>
  <ART><TY> RPRT2</TY>
  <T3> Something2</T3>
  <A1> Peter</A1>
  </ART>
  <ART><TY> RPRT2</TY>
  <T3> Something3</T3>
  <A1> Peter</A1>
  </ART>
</AAA>
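
Note that in the output above each ART lacks its ER element and A1 and T3 are swapped: top[x:y] stops just before the ER index, and insert(1, ...) places every later sibling right after TY. If ER should be kept and the original order preserved, a variation along these lines may be closer to the layout asked for. It is a sketch assuming Python 3 and only the standard library, with a shortened two-record sample input.

```python
import xml.etree.ElementTree as ET

s = '''<AAA>
  <TY> RPRT</TY>
  <A1> Peter</A1>
  <T3> Something</T3>
  <ER> </ER>
  <TY> RPRT2</TY>
  <A1> Peter</A1>
  <T3> Something2</T3>
  <ER> </ER>
</AAA>'''

top = ET.fromstring(s)
new_top = ET.Element(top.tag)
ty = [i for i, n in enumerate(top) if n.tag == 'TY']
er = [i for i, n in enumerate(top) if n.tag == 'ER']
for x, y in zip(ty, er):
    art = ET.SubElement(new_top, 'ART')
    # y + 1 makes the slice inclusive, so the ER element is kept
    for node in top[x:y + 1]:
        node.tail = None  # discard the indentation inherited from the parsed input
        art.append(node)  # append keeps document order

result = ET.tostring(new_top, encoding='unicode')
print(result)
```

Clearing tail avoids the uneven line breaks seen in the pretty-printed output above, since the whitespace between the original siblings would otherwise travel with each moved element.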

8 Comments

In fact you don't even need enumerate at all. Can just do for line in collected_lines: etc.
@twasbrillig, good call! I was just trying to stay as close to OP's code as possible. But yes, I will update the answer and provide an alternative :)
What if I have several sections starting with <TY> and ending with <ER>? How can I wrap these blocks in <ART>?
@hukiz, your scenario isn't in the original post. Could you please update the question so I can update the answer?
Thanks, I did some small modification, but it works!