I am trying to run my code on all the XML files in a folder. I get a few errors when I run it, and it generates some of the output files but not all of them.

Here is my code:

import xml.etree.ElementTree as ET
import os
import glob
path = 'C:/xml/'

for infile in glob.glob(os.path.join(path, '*.xml')):
    tree = ET.parse(infile)
    root = tree.getroot()
    with open(infile+'new.csv','w') as outfile:
        for elem in root.findall('.//event[@type="MEDIA"]'):
            mediaidelem = elem.find('./mediaid')
            if mediaidelem is not None:
                outfile.write("{}\n".format(mediaidelem.text))

Here is the error log:

Traceback (most recent call last):
  File "C:\xml\2.py", line 8, in <module>
    tree = ET.parse(infile)
  File "C:\Python34\lib\xml\etree\ElementTree.py", line 1187, in parse
    tree.parse(source, parser)
  File "C:\Python34\lib\xml\etree\ElementTree.py", line 598, in parse
    self._root = parser._parse_whole(source)
  File "<string>", line None
xml.etree.ElementTree.ParseError: no element found: line 1, column 0
  • Please post at least the errors (traceback). And since you're not doing any exception handling, the code will very obviously crash on the first error, which is why it does not process all files... Commented Oct 29, 2015 at 12:36
  • You are correct: one of the files didn't have the tag, and the script stopped as soon as it tried to process it. I'm trying to work out where to add the if statement to prevent this. Commented Oct 29, 2015 at 13:05
  • According to the error, you apparently have some empty files. As @brunodesthuilliers already stated, you have no error handling. You could add some and, depending on the error, just skip that file (and also print a warning message). Commented Oct 29, 2015 at 13:06

1 Answer

Judging by the error message ("no element found: line 1, column 0"), you may have some empty (or malformed) files.

I would add error handling here to warn the user about such an error and then skip the file. Something like:

for infile in glob.glob(os.path.join(path, '*.xml')):
    try:
        tree = ET.parse(infile)
    except ET.ParseError as e:
        # Empty or malformed file: report it and move on to the next one.
        print(infile, e)
        continue
    ...

I did not try to reproduce it here; it is just a guess.
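
For reference, here is a minimal sketch of how the skip logic could be combined with the extraction loop from the question. The function name `extract_media_ids` and the returned `skipped` list are hypothetical, added only for illustration; the rest follows the question's code. It assumes Python 3, which the `C:\Python34` paths in the traceback suggest.

```python
import glob
import os
import xml.etree.ElementTree as ET


def extract_media_ids(path):
    """Write one CSV per parseable XML file in `path`; return the files skipped.

    Hypothetical helper: the name and the return value are not from the question.
    """
    skipped = []
    for infile in glob.glob(os.path.join(path, '*.xml')):
        try:
            tree = ET.parse(infile)
        except ET.ParseError as e:
            # Empty or malformed file: warn and move on to the next one.
            print('Skipping {}: {}'.format(infile, e))
            skipped.append(infile)
            continue
        root = tree.getroot()
        # Only reached for files that parsed, so no CSV is created for bad input.
        with open(infile + 'new.csv', 'w') as outfile:
            for elem in root.findall('.//event[@type="MEDIA"]'):
                mediaidelem = elem.find('./mediaid')
                if mediaidelem is not None:
                    outfile.write('{}\n'.format(mediaidelem.text))
    return skipped


if __name__ == '__main__':
    extract_media_ids('C:/xml/')
```

Because the `try` block wraps only `ET.parse`, other problems (for example, a permission error when opening the CSV) would still surface as normal exceptions rather than being silently skipped.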
