How to find the value of a particular tag element in XML using Python?

Question

I am trying to parse XML data received from a RESTful interface. In error conditions (when the query returns nothing on the server), I receive the following text. I want to parse this string and look up the value of the status attribute, which appears on the fifth line of the example below. How can I check whether status is present and, if it is, read its value?

content = """
<?xml version="1.0" encoding="UTF-8"?><?xml-stylesheet type="text/xsl" href="/3.0/style/exchange.xsl"?>
<ops:world-patent-data xmlns="http://www.epo.org/exchange" xmlns:ops="http://ops.epo.org" xmlns:xlink="http://www.w3.org/1999/xlink">
    <ops:meta name="elapsed-time" value="3"/>
    <exchange-documents>
        <exchange-document system="ops.epo.org" country="US" doc-number="20060159695" status="not found">
            <bibliographic-data>
                <publication-reference>
                    <document-id document-id-type="epodoc">
                        <doc-number>US20060159695</doc-number>
                    </document-id>
                </publication-reference>
                <parties/>
            </bibliographic-data>
        </exchange-document>
    </exchange-documents>
</ops:world-patent-data>
"""
import xml.etree.ElementTree as ET
root = ET.fromstring(content)
res = root.iterfind(".//{http://www.epo.org/exchange}exchange-documents[@status='not found']/..")

peluzza · Accepted Answer · 2013-12-16 22:57:19Z

Just use BeautifulSoup:

from BeautifulSoup import BeautifulSoup
soup = BeautifulSoup(open('xml.txt', 'r'))

print soup.find('exchange-document')["status"]

#> not found

If you store all the XML output in a single file, it is useful to iterate over the matching elements:

from BeautifulSoup import BeautifulSoup
soup = BeautifulSoup(open('xml.txt', 'r'))

for tag in soup.findAll('exchange-document'):
    print tag["status"]

#> not found

This prints the status attribute of every exchange-document element.

If you only want statuses other than "not found", compare the value with != (a substring test with "not in" would also skip values such as "found"):

for tag in soup.findAll('exchange-document'):
    if tag["status"] != "not found":
        print tag["status"]
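
For comparison, here is a minimal sketch of the same check using the standard library's xml.etree.ElementTree, which the question started from. It assumes Python 3 and that the response is available as a string. The helper name find_statuses and the "ex" prefix are hypothetical, and the sample is a shortened copy of the response in the question. Two details matter: the string in the question begins with a newline before the XML declaration, which the parser rejects, and status sits on exchange-document, an element in the default namespace http://www.epo.org/exchange.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the response from the question, used only for illustration.
content = """
<?xml version="1.0" encoding="UTF-8"?>
<ops:world-patent-data xmlns="http://www.epo.org/exchange" xmlns:ops="http://ops.epo.org">
    <exchange-documents>
        <exchange-document system="ops.epo.org" country="US" doc-number="20060159695" status="not found"/>
    </exchange-documents>
</ops:world-patent-data>
"""

# "ex" is an arbitrary (hypothetical) prefix bound to the document's default namespace.
NS = {"ex": "http://www.epo.org/exchange"}


def find_statuses(xml_text):
    """Return the status attribute (or None if absent) of every exchange-document."""
    # Strip leading whitespace: nothing may precede the XML declaration.
    # Bytes are passed so the parser applies the encoding named in the declaration.
    root = ET.fromstring(xml_text.lstrip().encode("utf-8"))
    # Element.get returns None when the attribute is missing.
    return [doc.get("status") for doc in root.iterfind(".//ex:exchange-document", NS)]


print(find_statuses(content))  # ['not found']
```

A None entry in the result means the element has no status attribute, which answers the "is it present" part of the question without raising an exception.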

Red Alert · Accepted Answer · 2013-12-16 22:37:53Z

Try this:

from xml.dom.minidom import parse
xmldoc = parse(filename)  # filename: path to the XML file
elementList = xmldoc.getElementsByTagName(tagName)  # tagName: e.g. 'exchange-document'

elementList will contain all elements with the tag name you specify; you can then iterate over them.
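
As a rough illustration of that iteration, the sketch below parses from a string with parseString instead of from a file and reads the status attribute of each match. It assumes Python 3. The function name statuses_by_tag, the simplified sample without namespaces, and the second element with doc-number "1" are hypothetical, added only to show the case where status is missing.

```python
from xml.dom.minidom import parseString

# Simplified, hypothetical sample: the second element has no status attribute.
xml_text = """<exchange-documents>
    <exchange-document doc-number="20060159695" status="not found"/>
    <exchange-document doc-number="1"/>
</exchange-documents>"""


def statuses_by_tag(text, tag_name="exchange-document"):
    """Map each element's doc-number to its status, or None when status is absent."""
    doc = parseString(text)
    result = {}
    for element in doc.getElementsByTagName(tag_name):
        # getAttribute returns "" for a missing attribute, so test presence first.
        if element.hasAttribute("status"):
            status = element.getAttribute("status")
        else:
            status = None
        result[element.getAttribute("doc-number")] = status
    return result


print(statuses_by_tag(xml_text))  # {'20060159695': 'not found', '1': None}
```

Checking hasAttribute first keeps an absent status distinguishable from an empty one.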
