I have this XML file and I'd like to read some data out of it using Python's xml.etree:

<a>
   <b>
      <AuthorName>
         <GivenName>John</GivenName> 
         <FamilyName>Smith</FamilyName>
      </AuthorName>
      <AuthorName>
         <GivenName>Saint</GivenName> 
         <GivenName>Patrick</GivenName>
         <FamilyName>Thomas</FamilyName>
      </AuthorName>
   </b>
</a>

The result I would like to get is this:

John Smith
Saint Patrick Thomas

The catch, as you may have noticed, is that an AuthorName sometimes has one GivenName tag and sometimes two.

What I did was this:

from xml.etree import ElementTree as ET
xx = ET.parse('file.xml')
authorName = xx.findall('.//AuthorName')
for name in authorName:
    print(name[0].text + " " + name[1].text)

It works fine with one GivenName tag, but with two it prints only the two given names and drops the FamilyName.

What can I do?

Thanks!

2 Answers

Try this:

from xml.etree import ElementTree as ET
xx = ET.parse('file.xml')
authorName = xx.findall('.//AuthorName')
for name in authorName:
    nameStr = ' '.join([child.text for child in name])
    print(nameStr)

You have to look at all the child tags inside each AuthorName, take their text, and then join those pieces into nameStr.
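If you want to try the idea without file.xml on disk, here is a minimal sketch that assumes the XML from the question is held in a string. The helper name author_names is made up for illustration. It also skips child tags that have no text, since join raises a TypeError when it is handed None.

```python
from xml.etree import ElementTree as ET

# The question's XML, inlined so the sketch runs without file.xml.
XML = """<a>
   <b>
      <AuthorName>
         <GivenName>John</GivenName>
         <FamilyName>Smith</FamilyName>
      </AuthorName>
      <AuthorName>
         <GivenName>Saint</GivenName>
         <GivenName>Patrick</GivenName>
         <FamilyName>Thomas</FamilyName>
      </AuthorName>
   </b>
</a>"""


def author_names(xml_text):
    """Return one space-joined name string per AuthorName element."""
    root = ET.fromstring(xml_text)
    names = []
    for author in root.findall('.//AuthorName'):
        # Take every child's text in document order; skip empty tags (text is None).
        parts = [child.text.strip() for child in author if child.text]
        names.append(' '.join(parts))
    return names


for full_name in author_names(XML):
    print(full_name)
```

Because the children are read in document order, this works for any number of GivenName tags, as long as FamilyName comes last in the file.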

1 Comment

Note, the thing inside the join that reads [child.text for child in name] is called a list comprehension and deserves to be read up on. They are incredibly useful.

It appears that you aren't really making use of your loop. Something like this might work a bit better for you:

from xml.etree import ElementTree as ET
xx = ET.parse('file.xml')
authorName = xx.findall('.//AuthorName')

for name in authorName:
    # Collect the text of every child tag (GivenName, FamilyName) of this author
    nameParts = []
    for child in name:
        nameParts.append(child.text)

    fullName = ' '.join(nameParts)
    print(fullName)

Now, one more thing that you can do here to make your life a bit easier is learn about list comprehensions. For example, the above can be reduced to:

from xml.etree import ElementTree as ET
xx = ET.parse('file.xml')
authorName = xx.findall('.//AuthorName')

for name in authorName:
    # Join the text of this author's child tags in one expression
    fullName = ' '.join(child.text for child in name)
    print(fullName)

Note: This has not actually been tested to run. There may be typos.
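To show the list-comprehension idea on its own, here is a small sketch with made-up data standing in for the .text values of one author's child tags. The variable names are only for illustration.

```python
# Made-up stand-ins for child.text values, with stray whitespace.
texts = [' Saint ', 'Patrick\n', 'Thomas']

# Explicit loop: build the list step by step.
parts_loop = []
for t in texts:
    parts_loop.append(t.strip())

# List comprehension: the same list in a single expression.
parts_comp = [t.strip() for t in texts]

# Generator expression: join consumes the items directly,
# so no intermediate list has to be named.
joined = ' '.join(t.strip() for t in texts)

print(parts_loop == parts_comp)
print(joined)
```

All three forms do the same work. The comprehension just moves the loop inside the expression.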

1 Comment

Ah, you beat me to it! But note that there are two loops to iterate through: one over the AuthorName tags, and then one over the children of each AuthorName tag.
