I am using Python's Beautiful Soup to parse an XML file and write it to a different file after deleting certain tags. However, soup.prettify() seems to change other XML namespaces and attribute names.

with open('new.xml', 'w') as f:
    f.write(soup.prettify(formatter="xml"))

The changes are as given in sample below.

Original XML file.

<draw:control text:anchor-type="paragraph" draw:z-index="1" draw:style-name="gr1" draw:text-style-name="P2" svg:width="2.805cm" svg:height="1.853cm" svg:x="3.602cm" svg:y="0.824cm" draw:control="control2"/>

New XML file written from soup.prettify.

  <draw:control draw:control="control2" draw:style-name="gr1" draw:text-style-name="P2" draw:z-index="1" svg:height="1.853cm" svg:width="2.805cm" svg:x="3.602cm" svg:y="0.824cm" text:anchor-type="paragraph"/>

I tried passing utf-8 to prettify(), but the problem is the same. Is there another way to delete a particular tag found by searching while keeping all the other XML content in the file intact? Please suggest.

  • That's only changing the order of the attributes. It's not changing the names. Commented May 9, 2014 at 7:17
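
The point in the comment above can be checked with a short sketch. It assumes made-up namespace URIs (the question does not show the real declarations) and a shortened attribute list. It parses both orderings with the standard library and compares the results:

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs: the question's snippet does not show the real ones.
NS = ('xmlns:draw="urn:example:draw" '
      'xmlns:svg="urn:example:svg" '
      'xmlns:text="urn:example:text"')

# Same element, attributes in the original order and in the rewritten order.
original = ('<draw:control %s text:anchor-type="paragraph" draw:z-index="1" '
            'svg:width="2.805cm" draw:control="control2"/>' % NS)
rewritten = ('<draw:control %s draw:control="control2" draw:z-index="1" '
             'svg:width="2.805cm" text:anchor-type="paragraph"/>' % NS)

a = ET.fromstring(original)
b = ET.fromstring(rewritten)

# Attributes are stored in a dict, and dict equality ignores ordering.
same = (a.tag == b.tag and a.attrib == b.attrib)
print(same)
```

Attribute order carries no meaning in XML, so a conforming consumer should treat the two files as equivalent.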

1 Answer

Consider using the standard library's xml.etree.ElementTree module, which implements a simple and efficient API for parsing and creating XML data. It needs no extra install and is often simpler and faster for this kind of task.

You can remove a particular element using Element.remove().

A basic example is given here.
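
As a rough illustration of the Element.remove() approach, here is a minimal sketch. The namespace URIs, the wrapping text:p element, and the remove_controls helper are all hypothetical, chosen only to make the example self-contained. They are not taken from the asker's file.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs, used only for illustration.
DRAW_NS = "urn:example:draw"
TEXT_NS = "urn:example:text"

# Register the prefixes so output uses draw:/text: instead of ns0:/ns1:.
ET.register_namespace("draw", DRAW_NS)
ET.register_namespace("text", TEXT_NS)

xml_data = (
    '<text:p xmlns:text="urn:example:text" xmlns:draw="urn:example:draw">'
    '<draw:control draw:control="control1"/>'
    '<draw:control draw:control="control2"/>'
    '</text:p>'
)

def remove_controls(source, control_id):
    """Remove every draw:control element whose draw:control attribute matches."""
    root = ET.fromstring(source)
    wanted_tag = "{%s}control" % DRAW_NS
    wanted_attr = "{%s}control" % DRAW_NS
    # Element.remove() only removes direct children, so visit each parent.
    # Both loops iterate over copies because the tree is modified in place.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == wanted_tag and child.get(wanted_attr) == control_id:
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")

result = remove_controls(xml_data, "control2")
print(result)
```

Without the register_namespace() calls, ElementTree would write generated prefixes such as ns0 in place of the original ones, which matters if the prefixes need to survive the round trip.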

But if you would rather stay with BeautifulSoup (which can use lxml, a third-party library with an ElementTree-compatible API, as its parser), you can do something like this:

# Beautiful Soup 4 replaces the old BeautifulStoneSoup class; its "xml" parser requires lxml
from bs4 import BeautifulSoup

xml_data = """
<draw:control text:anchor-type="paragraph" draw:z-index="1" draw:style-name="gr1" draw:text-style-name="P2" svg:width="2.805cm" svg:height="1.853cm" svg:x="3.602cm" svg:y="0.824cm" draw:control="control2"/>
"""
soup = BeautifulSoup(xml_data, "xml")
print(soup.prettify())

# Placeholder name: substitute the tag you actually want to change
tag = soup.find("your-tag-name")
if tag is not None:
    tag.replace_with("whatever you want")  # or tag.decompose() to delete the tag

You can also use a for loop to edit multiple similar elements.

  • If you are satisfied with the answer, please upvote it and mark it as solved. Thanks. Commented May 9, 2014 at 8:33
  • FYI, as far as I know xml.etree doesn't have a function to prettify output. Commented Mar 17, 2016 at 17:48
