Reading an XML file using ElementTree

Question

I have an XML file. It looks like this:

<root>
  <Group>    
    <ChapterNo>1</ChapterNo>    
    <ChapterName>A</ChapterName>    
    <Line>1</Line>    
    <Content>zfsdfsdf</Content>    
    <Synonyms>fdgd</Synonyms>    
    <Translation>assdfsdfsdf</Translation>    
  </Group>    
  <Group>    
    <ChapterNo>1</ChapterNo>    
    <ChapterName>A</ChapterName>    
    <Line>2</Line>    
    <Content>ertreter</Content>    
    <Synonyms>retreter</Synonyms>    
    <Translation>erterte</Translation>    
  </Group>    
  <Group>    
    <ChapterNo>2</ChapterNo>    
    <ChapterName>B</ChapterName>    
    <Line>1</Line>    
    <Content>sadsafs</Content>
    <Synonyms>sdfsdfsd</Synonyms>
    <Translation>sdfsdfsd</Translation>
  </Group>
  <Group>
    <ChapterNo>2</ChapterNo>
    <ChapterName>B</ChapterName>
    <Line>2</Line>
    <Content>retete</Content>
    <Synonyms>retertret</Synonyms>
    <Translation>retertert</Translation>
  </Group>
</root>

I tried it this way:

from xml.etree import ElementTree

root = ElementTree.parse('data.xml').getroot()
ChapterNo = root.find('ChapterNo').text
ChapterName = root.find('ChapterName').text
GitaLine = root.find('Line').text
Content = root.find('Content').text
Synonyms = root.find('Synonyms').text
Translation = root.find('Translation').text

But it shows an error:

ChapterNo=root.find('ChapterNo').text 
AttributeError: 'NoneType' object has no attribute 'text'

Now I want to get all the ChapterNo, ChapterName, etc. values separately using ElementTree, and I want to insert this data into a database. Can anyone help me?

Rgds,

Nimmy

I tried: root = ElementTree.parse('data.xml').getroot() ChapterNo=root.find('ChapterNo').text ChapterName=root.find('ChapterName').text GitaLine=root.find('Line').text Content=root.find('Content').text Synonyms=root.find('Synonyms').text Translation=root.find('Translation').text But it shows an error: "ChapterNo=root.find('ChapterNo').text AttributeError: 'NoneType' object has no attribute 'text'"
– Nimmy, Commented Feb 1, 2011 at 10:02
Add that to your question; it's hard to read in a comment.
– Lennart Regebro, Commented Feb 1, 2011 at 10:03
root.find('GitaLine'): there is no text "GitaLine" in your example.
– Lennart Regebro, Commented Feb 1, 2011 at 10:04

John Machin · Accepted Answer · 2011-02-01 11:06:05Z

To parse your simple two-level data structure and assemble a dict for each group, all you need to do is this:

>>> # what you did to get `root`
>>> from pprint import pprint as pp
>>> for group in root:
...     d = {}
...     for elem in group:
...         d[elem.tag] = elem.text
...     pp(d) # or whack it into a database
...
{'ChapterName': 'A',
 'ChapterNo': '1',
 'Content': 'zfsdfsdf',
 'Line': '1',
 'Synonyms': 'fdgd',
 'Translation': 'assdfsdfsdf'}
{'ChapterName': 'A',
 'ChapterNo': '1',
 'Content': 'ertreter',
 'Line': '2',
 'Synonyms': 'retreter',
 'Translation': 'erterte'}
{'ChapterName': 'B',
 'ChapterNo': '2',
 'Content': 'sadsafs',
 'Line': '1',
 'Synonyms': 'sdfsdfsd',
 'Translation': 'sdfsdfsd'}
{'ChapterName': 'B',
 'ChapterNo': '2',
 'Content': 'retete',
 'Line': '2',
 'Synonyms': 'retertret',
 'Translation': 'retertert'}
>>>

Look, Ma, no xpath!
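For the "whack it into a database" step, here is a minimal sketch using the standard-library sqlite3 module. The table name `chapter_lines`, its columns, the helper `groups_to_dicts`, and the shortened inline XML are all made up for illustration; the real file has more fields per Group.

```python
import sqlite3
from xml.etree import ElementTree

# Shortened stand-in for data.xml: two groups, three fields each.
XML = """<root>
  <Group>
    <ChapterNo>1</ChapterNo>
    <ChapterName>A</ChapterName>
    <Line>1</Line>
  </Group>
  <Group>
    <ChapterNo>1</ChapterNo>
    <ChapterName>A</ChapterName>
    <Line>2</Line>
  </Group>
</root>"""


def groups_to_dicts(root):
    """Return one {tag: text} dict per child element of root."""
    return [{elem.tag: elem.text for elem in group} for group in root]


root = ElementTree.fromstring(XML)
rows = groups_to_dicts(root)

# Hypothetical table; the named placeholders match the dict keys (the XML tags).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE chapter_lines "
    "(chapter_no INTEGER, chapter_name TEXT, line INTEGER)"
)
conn.executemany(
    "INSERT INTO chapter_lines VALUES (:ChapterNo, :ChapterName, :Line)",
    rows,
)
conn.commit()

stored = conn.execute(
    "SELECT chapter_no, chapter_name, line FROM chapter_lines ORDER BY line"
).fetchall()
print(stored)
```

Because the dict keys are the tag names, the same dicts can be passed straight to executemany without any reshaping.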

Daniel Roseman · Accepted Answer · 2011-02-01 10:42:00Z

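ChapterNo is not a direct child of root (it sits inside Group), and find only searches direct children, so root.find('ChapterNo') returns None. That None is what raises the AttributeError when you ask for .text. You'll need to use ElementTree's XPath-style syntax to find the data.

Also, there are multiple occurrences of ChapterNo, ChapterName, etc., so you should use findall and iterate through the results to get the text for each one.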

chapter_nos = [e.text for e in root.findall('.//ChapterNo')]

and so on.
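Spelled out for two of the fields, a small sketch might look like this. The inline XML is a trimmed stand-in for data.xml, and pairing the lists with zip assumes every Group contains both children.

```python
from xml.etree import ElementTree

# Trimmed stand-in for data.xml.
XML = """<root>
  <Group>
    <ChapterNo>1</ChapterNo>
    <ChapterName>A</ChapterName>
  </Group>
  <Group>
    <ChapterNo>2</ChapterNo>
    <ChapterName>B</ChapterName>
  </Group>
</root>"""

root = ElementTree.fromstring(XML)

# find() only looks at direct children of root, which are all <Group> elements.
missing = root.find('ChapterNo')
print(missing)

# './/' searches all descendants, so the nested elements are found.
chapter_nos = [e.text for e in root.findall('.//ChapterNo')]
chapter_names = [e.text for e in root.findall('.//ChapterName')]

# Pair the parallel lists back up, one tuple per Group.
chapters = list(zip(chapter_nos, chapter_names))
print(chapters)
```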

Robert Rossney, commented over a year ago:

Note that on a large XML document, /root/Group/ChapterNo will be faster than //ChapterNo.
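With ElementTree specifically, paths are relative to the element you call findall on, so from root the explicit form would be written 'Group/ChapterNo'. As far as I know, a path with a leading '/' is rejected when used on an element. A rough sketch comparing the two forms on a trimmed stand-in for data.xml:

```python
from xml.etree import ElementTree

# Trimmed stand-in for data.xml.
XML = """<root>
  <Group>
    <ChapterNo>1</ChapterNo>
  </Group>
  <Group>
    <ChapterNo>2</ChapterNo>
  </Group>
</root>"""

root = ElementTree.fromstring(XML)

# Explicit path: only <ChapterNo> children of root's <Group> children.
explicit = [e.text for e in root.findall('Group/ChapterNo')]

# Descendant search: walks the whole subtree under root.
anywhere = [e.text for e in root.findall('.//ChapterNo')]

print(explicit, anywhere)
```

On this document both forms return the same elements; the explicit path just avoids visiting every descendant.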

nosklo · Accepted Answer · 2011-02-01 10:45:28Z

Here's a small example using SQLAlchemy to define an object that will extract and store the data in a SQLite database.

# Uses the SQLAlchemy 1.4+ API
from sqlalchemy import create_engine, Unicode, Integer, Column, UnicodeText
from sqlalchemy.orm import declarative_base, sessionmaker

# echo=True logs the generated SQL
engine = create_engine('sqlite:///chapters.sqlite', echo=True)
Base = declarative_base()

class ChapterLine(Base):
    __tablename__ = 'chapterlines'
    # chapter_no and line together form the composite primary key
    chapter_no = Column(Integer, primary_key=True)
    chapter_name = Column(Unicode(200))
    line = Column(Integer, primary_key=True)
    content = Column(UnicodeText)
    synonyms = Column(UnicodeText)
    translation = Column(UnicodeText)

    @classmethod
    def from_xmlgroup(cls, element):
        """Build a ChapterLine from one <Group> element."""
        row = cls()
        row.chapter_no = int(element.find('ChapterNo').text)
        row.chapter_name = element.find('ChapterName').text
        row.line = int(element.find('Line').text)
        row.content = element.find('Content').text
        row.synonyms = element.find('Synonyms').text
        row.translation = element.find('Translation').text
        return row

Base.metadata.create_all(engine) # creates the table

Here's how to use it:

from xml.etree import ElementTree as etree

Session = sessionmaker(bind=engine)
session = Session()
doc = etree.parse('myfile.xml').getroot()
for group in doc.findall('Group'):
    # one ChapterLine row per <Group> element
    session.add(ChapterLine.from_xmlgroup(group))

session.commit()

I have tested this code on your XML data and it works fine, inserting everything into the database.
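One thing to watch: element.find(...).text raises the same AttributeError as in the question if a Group is missing one of its children. If that can happen in your data, a possible variation is the standard findtext method, which returns None for a missing child. The helper `group_to_row` and the second Group without a Content element are made up for illustration.

```python
from xml.etree import ElementTree

# Stand-in data; the second Group deliberately has no <Content> child.
XML = """<root>
  <Group>
    <ChapterNo>1</ChapterNo>
    <Line>1</Line>
    <Content>zfsdfsdf</Content>
  </Group>
  <Group>
    <ChapterNo>1</ChapterNo>
    <Line>2</Line>
  </Group>
</root>"""


def group_to_row(group):
    """Extract one Group's fields; a missing Content becomes None."""
    return {
        # ChapterNo and Line are treated as required here, so int() is applied directly.
        'chapter_no': int(group.findtext('ChapterNo')),
        'line': int(group.findtext('Line')),
        # findtext returns None when the child element is absent.
        'content': group.findtext('Content'),
    }


root = ElementTree.fromstring(XML)
rows = [group_to_row(g) for g in root.findall('Group')]
print(rows)
```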
