I'm given an XML file as follows.

<?xml version="1.0" encoding="UTF-8"?>
<A value="?">
    <B value="?">
        <C value="10"/>
        <C value ="20"/>
    </B>
    <B value="?">
        <C value = "5" />
        <C value = "10" />
    </B>
</A>

How can I recursively set each parent's value to the sum of its children's values, so that the result looks like this?

<?xml version="1.0" encoding="UTF-8"?>
<A value="45">
    <B value="30">
        <C value="10"/>
        <C value ="20"/>
    </B>
    <B value="15">
        <C value = "5" />
        <C value = "10" />
    </B>
</A>

2 Answers


The following code was run unchanged with Python 3.1.3 (shown) and Python 2.7.1 (not shown). The function which does all the work is version independent. You may want to change the other twiddly bits (parsing from a file instead of from a string, importing some other ElementTree implementation, etc) to suit yourself.

    >>> xml_in = """
    ... <A value="?">
    ...     <B value="?">
    ...         <C value="10"/>
    ...         <C value ="20"/>
    ...     </B>
    ...     <B value="?">
    ...         <C value = "5" />
    ...         <C value = "10" />
    ...     </B>
    ... </A>
    ... """
    >>> import xml.etree.ElementTree as et
    >>> def updated_value(elem):
    ...     value = elem.get('value')
    ...     if value != '?': return int(value)
    ...     total = sum(updated_value(child) for child in elem)
    ...     elem.set('value', str(total))
    ...     return total
    ...
    >>> root = et.fromstring(xml_in)
    >>> print("grand total is", updated_value(root))
    grand total is 45
    >>> import sys; nbytes = sys.stdout.write(et.tostring(root) + '\n')
    <A value="45">
        <B value="30">
            <C value="10" />
            <C value="20" />
        </B>
        <B value="15">
            <C value="5" />
            <C value="10" />
        </B>
    </A>
    >>>

2 Comments

If there are elements without a value attribute, then updated_value() won't work, because int(None) raises TypeError.
@J.F. Sebastian: I was kinda hoping that the OP could make the necessary adjustments if his sample data differed from the real world. In fact it "won't work" if an element has no value attribute or if it contains other than "?" or an int-worthy string. The OP gave no indication of what action to take in such contingencies; possibilities for each case include (1) exception (2) return 0 immediately (3) continue to the sum(children) phase (which would "fix" the offending element).
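
As a rough illustration of option (3) from the comment above, here is a minimal sketch in which any missing, "?" or otherwise non-integer value is recomputed from the element's children. The sample data (a B element with no value attribute and a C element with value "oops") is made up for the example, and choosing option (3) is an assumption, not something the question asked for. It targets current Python 3, where tostring() returns bytes unless encoding="unicode" is passed.

```python
import xml.etree.ElementTree as ET

def updated_value(elem):
    """Return elem's total, recomputing any value that is missing or not an integer."""
    raw = elem.get("value")
    try:
        return int(raw)
    except (TypeError, ValueError):
        # TypeError: attribute missing (raw is None).
        # ValueError: '?' or any other non-integer string.
        # Either way, fall through and rebuild the value from the children.
        pass
    total = sum(updated_value(child) for child in elem)
    elem.set("value", str(total))
    return total

# Hypothetical input: one B has no value attribute, one C has a bad value.
xml_in = """
<A value="?">
    <B>
        <C value="10"/>
        <C value="20"/>
    </B>
    <B value="?">
        <C value="5"/>
        <C value="oops"/>
    </B>
</A>
"""

root = ET.fromstring(xml_in)
grand_total = updated_value(root)
xml_out = ET.tostring(root, encoding="unicode")
print("grand total is", grand_total)
print(xml_out)
```

Note the side effect of this policy: a leaf with an unusable value has no children to sum, so it is silently rewritten to 0. Raising an exception instead (option 1) would be the safer choice if bad data should not pass unnoticed.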

If you specifically need a recursive solution, then @John Machin's answer is fine. But you could also do it iteratively:

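from xml.etree import cElementTree as etree # removed in Python 3.9; use xml.etree.ElementTree there

for ev, el in etree.iterparse('your_file.xml'):
    # iterparse reports "end" events by default, so children are complete before their parent
    if el.get('value') == '?':
        el.set('value', str(sum(int(n.get('value')) for n in el)))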

print(etree.tostring(el))

Output

<A value="45">
    <B value="30">
        <C value="10" />
        <C value="20" />
    </B>
    <B value="15">
        <C value="5" />
        <C value="10" />
    </B>
</A>
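
For readers on current Python 3, the same iterative idea might look like the sketch below. It reads from an in-memory buffer instead of a file so that it is self-contained; the variable names (xml_bytes, root, xml_out) are chosen for the example only. It relies on the fact that the last "end" event reported by iterparse belongs to the root element.

```python
import io
import xml.etree.ElementTree as ET

xml_bytes = b"""<A value="?">
    <B value="?">
        <C value="10"/>
        <C value="20"/>
    </B>
    <B value="?">
        <C value="5"/>
        <C value="10"/>
    </B>
</A>"""

root = None
for event, el in ET.iterparse(io.BytesIO(xml_bytes), events=("end",)):
    # An "end" event fires only after all children have been seen,
    # so each child's value is already filled in at this point.
    if el.get("value") == "?":
        el.set("value", str(sum(int(child.get("value")) for child in el)))
    root = el  # after the loop, this is the root element

# encoding="unicode" makes tostring() return str rather than bytes.
xml_out = ET.tostring(root, encoding="unicode")
print(xml_out)
```

As in the original snippet, only elements marked with "?" are recomputed, and only from their immediate children; a child without an integer value still raises an exception.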

2 Comments

Your answer "won't work" either: if n.get('value') returns anything other than an int-worthy string, an exception will be raised. If a non-leaf element has no value attribute, it will fail silently.
@John Machin: my answer sums only the immediate children of an element with a 'value="?"' attribute. It should fail if a value is not int-worthy. That is not the same as requiring all elements to have a valid value attribute, as in your case. Anyway, the question smells like homework, so to what end are we discussing this?
