
I have an XML file that I am trying to read and extract data from. Below is an extract from the file:

<Variable name="Inboard_ED_mm" state="Output" type="double[]">17.154, 17.154, 17.154, 17.154, 17.154, 17.154, 17.154, 17.154, 17.154, 17.154, 17.154, 17.154, 17.154, 17.154<Properties><Property name="index">25</Property><Property name="description"></Property><Property name="upperBound">0</Property><Property name="hasUpperBound">false</Property><Property name="lowerBound">0</Property><Property name="hasLowerBound">false</Property><Property name="units"></Property><Property name="enumeratedValues"></Property><Property name="enumeratedAliases"></Property><Property name="validity">true</Property><Property name="autoSize">true</Property><Property name="userSlices"></Property></Properties></Variable>

I am trying to extract the following, 17.154, 17.154, 17.154, 17.154, 17.154, 17.154, 17.154, 17.154, 17.154, 17.154, 17.154, 17.154, 17.154, 17.154

I have worked through the example in the xml.etree.ElementTree documentation (The ElementTree XML API) and can get it to work, but when I modify the code for the above XML, it returns nothing!

Here is my code,

import xml.etree.ElementTree as ET
work_dir = r"C:\Temp\APROCONE\Python"

with open("model.xml", 'rt') as f:
    tree = ET.parse(f)
    root = tree.getroot()

for Variable in root.findall('Variable'):
    type = Variable.find('type').text
    name = Variable.get('name')
    print(name, type)

Any ideas? Thanks in advance for any help.

Edit: Thanks to everyone who has commented. With your advice I have had a play and a search and arrived at the following code:

import os

with open(os.path.join(work_dir, "output.txt"), "w") as f:
    # Iterate over each element directly; getchildren() was removed in Python 3.9
    for child1Tag in root:
        for child2Tag in child1Tag:
            for child3Tag in child2Tag:
                for child4Tag in child3Tag:
                    for child5Tag in child4Tag:
                        name = child5Tag.get('name')
                        if name == "Inboard_ED_mm":
                            print(child5Tag.attrib, file=f)
                            print(name, file=f)
                            print(child5Tag.text, file=f)

To return the following,

{'name': 'Inboard_ED_mm', 'state': 'Output', 'type': 'double[]'}
Inboard_ED_mm
17.154, 17.154, 17.154, 17.154, 17.154, 17.154, 17.154, 17.154, 17.154, 17.154, 17.154, 17.154, 17.154, 17.154

I know it is not the best code in the world; any ideas on how to streamline it would be very welcome.

2
  • 1
    "below is an extract from the xml file" - The problem could be that Variable is in a default namespace. Do you have a xmlns="???" anywhere in the XML that's not shown? Commented Oct 18, 2018 at 15:49
  • @Daniel Haley, thanks for responding. Sorry, but no, I cannot find 'xmlns' in the file. Commented Oct 19, 2018 at 7:54
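
For readers who do find an xmlns declaration in their file, here is a minimal sketch of the problem raised in the first comment. The document and the namespace URI are made up for illustration and are not taken from the asker's file; the point is only that a plain tag name stops matching once a default namespace is in effect.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the namespace URI and the <Model> wrapper are
# assumptions made for this example only.
src = """<Model xmlns="http://example.com/model">
  <Variable name="Inboard_ED_mm">17.154, 17.154</Variable>
</Model>"""
root = ET.fromstring(src)

# The plain name finds nothing, because the element's real tag is
# '{http://example.com/model}Variable'.
plain = root.findall("Variable")

# Mapping a prefix of our own choosing to the URI makes the search match.
ns = {"m": "http://example.com/model"}
qualified = root.findall("m:Variable", ns)

print(len(plain), len(qualified))
print(root.tag)
```

Printing root.tag is a quick way to check for this: a namespaced document shows the URI in braces in front of the tag name.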

2 Answers

3

You say the above is an "extract" of the XML file. The structure of the XML is important. Does the above just sit inside the root node?

for Variable in root.findall('Variable'):
    print(Variable.get('name'), Variable.text)

Or does it exist somewhere deeper in the XML tree structure, at a known level?

for Variable in root.findall('Path/To/Variable'):
    print(Variable.get('name'), Variable.text)

Or does it exist at some unspecified deeper level in the XML tree structure?

for Variable in root.findall('.//Variable'):
    print(Variable.get('name'), Variable.text)

Demonstrating the last two:

>>> import xml.etree.ElementTree as ET
>>> src = """
<root>
 <SubNode>
  <Variable name='x'>17.154, ..., 17.154<Properties>...</Properties></Variable>
  <Variable name='y'>14.174, ..., 15.471<Properties>...</Properties></Variable>
 </SubNode>
</root>"""
>>> root = ET.fromstring(src)
>>> for Variable in root.findall('SubNode/Variable'):
        print(Variable.get('name'), Variable.text)


x 17.154, ..., 17.154
y 14.174, ..., 15.471
>>>
>>> for Variable in root.findall('.//Variable'):
        print(Variable.get('name'), Variable.text)


x 17.154, ..., 17.154
y 14.174, ..., 15.471

Update

Based on your new/clearer/updated question, you are looking for:

for child in root.findall("*/*/*/*/Variable[@name='Inboard_ED_mm']"):
    print(child.attrib, file=f)
    print(child.get('name'), file=f)
    print(child.text, file=f)

or

for child in root.findall(".//Variable[@name='Inboard_ED_mm']"):
    print(child.attrib, file=f)
    print(child.get('name'), file=f)
    print(child.text, file=f)

If we knew the exact tag names of levels 1 through 4, we could give you a more exact XPath instead of relying on */*/*/*/.
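
If the goal is the numbers themselves rather than the raw string, the text can be split on commas once the element is found. The following is only a sketch: the <Model> and <Assembly> wrapper tags and the second variable are invented to make it self-contained, and the value list is shortened.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature of the file; the wrapper tag names are assumptions.
src = """<Model><Assembly>
<Variable name="Inboard_ED_mm" state="Output" type="double[]">17.154, 17.154, 17.154<Properties><Property name="index">25</Property></Properties></Variable>
<Variable name="Other" state="Output" type="double[]">1.0, 2.0<Properties/></Variable>
</Assembly></Model>"""
root = ET.fromstring(src)

# find() returns the first match, or None if nothing matches.
var = root.find(".//Variable[@name='Inboard_ED_mm']")

# .text holds only the text before the first child element (<Properties>),
# so the property values are not mixed in.
values = [float(v) for v in var.text.split(",")]

print(var.get("type"), values)
```

Note that type is an attribute of Variable, so it is read with get('type'); find('type') would look for a child element named type instead.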


2 Comments

Thanks for your response. After a bit of playing and searching, it turns out it sits deep inside the root node! I have added the code to the original question.
Ah - you are looking for the Variable tag with an exact name attribute. There is an XPath for that. See update.
1

Your root node is already the Variable tag, so you won't find anything with a Variable tag with findall, which can only search for child nodes. You should simply output the text attribute of the root node instead:

print(root.text)
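
A small sketch of that situation, assuming the extract really were the entire file (the asker's later edit suggests it is not): findall searches below the element it is called on and never returns that element itself, so checking root.tag first shows which case applies.

```python
import xml.etree.ElementTree as ET

# Assumption for this example: the <Variable> element is the document root.
src = '<Variable name="Inboard_ED_mm">17.154, 17.154<Properties/></Variable>'
root = ET.fromstring(src)

# Neither search can return the root itself.
children = root.findall("Variable")        # direct children only
descendants = root.findall(".//Variable")  # all levels below the root

# Read the text straight from the root when it is the Variable element.
text = root.text if root.tag == "Variable" else None

print(root.tag, children, descendants)
print(text)
```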

1 Comment

Thanks for your response, your answer has helped me find a solution.
