
I have the following XML structure (this is only part of it). The full file contains the element types 'TVEpisode', 'TVShow', 'Movie', 'TVSeries' and 'TVSeason'. I need to go through the XML file and check each of these elements for a Description element. If it is missing, I need to add a Description element under that element (Movie, TVSeries, etc.) and insert the Title of the movie, TV episode, etc. as the description text.

<TVSeries>
<Provider>xxx</Provider>
<Title>The World's Fastest Indian</Title>
<Description> The World's Fastest Indian </Description>
<SortTitle>World's Fastest Indian, The</SortTitle>
</TvSeries>

<Movies>
<Provider>xxx</Provider>
<Title>The World's Fastest Indian</Title>
<Description> The World's Fastest Indian </Description>
<SortTitle>World's Fastest Indian, The</SortTitle>
</Movies>

<TVShow>  
<Provider>xxx</Provider>
<Title>The World's Fastest Indian</Title>
<SortTitle>World's Fastest Indian, The</SortTitle>
</TvShow>

Under TVShow there is no Description element, so I need to insert the following into it:

<Description> The World's Fastest Indian </Description>

Part of the xml file:

<Feed xml:base="http://schemas.yyyy.com/xxxx/2011/06/13/ingestion"  xmlns="http://schemas.yyy.com/xxxx/2011/06/13/ingestion">
<Movie>
<Provider>xxx2</Provider>
<Title>The World's Fastest Indian</Title>
<SortTitle>World's Fastest Indian, The</SortTitle>
</Movie>
<TVSeries>
<Provider>xxx</Provider>
<Title>The World's Fastest Indian</Title>
<Description> The World's Fastest Indian </Description>
<SortTitle>World's Fastest Indian, The</SortTitle>
</TvSeries>

I need to walk through the XML file and insert a Description element wherever one is not present, and also add some text to that new element.

This is what I have done so far. It gives me the titles that don't have a Description, but when I try to insert the element into the structure I get the following error:

  File "/usr/lib/python2.4/site-packages/elementtree/ElementTree.py", line 293, in insert
   assert iselement(element)
   AssertionError

Code:

import elementtree.ElementTree as ET
import sys
import re
output_namespace='http://schemas.yyy.com/xxx/2011/06/13/ingestion'

types_to_remove=['TVEpisode','TVShow','Movie','TVSeries','TVSeason']

if ET.VERSION[0:3] == '1.2':
    # in ET < 1.3, this is a workaround for suppressing prefixes
    def fixtag(tag, namespaces):
        import string
        # given a decorated tag (of the form {uri}tag), return prefixed
        # tag and namespace declaration, if any
        if isinstance(tag, ET.QName):
            tag = tag.text
        namespace_uri, tag = string.split(tag[1:], "}", 1)
        prefix = namespaces.get(namespace_uri)
        if namespace_uri not in namespaces:
            prefix = ET._namespace_map.get(namespace_uri)
            if namespace_uri not in ET._namespace_map:
                prefix = "ns%d" % len(namespaces)
            namespaces[namespace_uri] = prefix
            if prefix == "xml":
                xmlns = None
            else:
                if prefix is not None:
                    nsprefix = ':' + prefix
                else:
                    nsprefix = ''
                xmlns = ("xmlns%s" % nsprefix, namespace_uri)
        else:
            xmlns = None
        if prefix is not None:
            prefix += ":"
        else:
            prefix = ''

        return "%s%s" % (prefix, tag), xmlns

    ET.fixtag = fixtag
    ET._namespace_map[output_namespace] = None
else:
    # For ET >= 1.3, use the register_namespace function
    ET.register_namespace('', output_namespace)


def descriptionAdd(root,type):
    for child in root.findall('.//{http://schemas.yyy.com/xxx/2011/06/13/ingestion}%s' % type):
        title=child.find('.//{http://schemas.yyy.com/xxx/2011/06/13/ingestion}Title').text
        try:
            if child.find('.//{http://schemas.yyy.com/xxx/2011/06/13 /ingestion}Description').text=="":
                print("")
        except:
            print ' %s - couldn\'t find description' % (title)
            print(child.tag)
            child.insert(2,"Description")

#### Do the actual work and write the changes to the new XML file.

tree = ET.parse(sys.argv[1])
root = tree.getroot()
for type in types_to_remove:
    descriptionAdd(root,type)

tree.write(sys.argv[2])

1 Answer

Updated

I see what you want now, I think. Below is how I would do it. Note that you will need to apply this to the parent element that contains a movie, TV show, etc. Also note that case matters (see note in code below).

First, the function:

def insert_description(element):
    '''Inserts the Title as a Description if Description is not present.'''
    for sub_e in element:
        if sub_e.find('Description') is None:
            title = sub_e.find('Title').text
            new_desc = ET.Element('Description')
            new_desc.text = title
            sub_e.insert(2, new_desc)
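
As an aside, the difference from the code in the question is that insert() is given an element object rather than the string "Description". Here is a minimal sketch of that point. It assumes the standard-library xml.etree.ElementTree on Python 3 rather than the older elementtree package used in the question, so the failure may surface as a TypeError instead of an AssertionError; the one-element document is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical one-element document, only for illustration.
show = ET.fromstring("<TVShow><Provider>xxx</Provider><Title>T</Title></TVShow>")

# Passing a plain string is rejected: insert() only accepts Element objects.
try:
    show.insert(2, "Description")
    rejected = False
except (TypeError, AssertionError):
    rejected = True

# Build a real Element, set its text, then insert it.
desc = ET.Element("Description")
desc.text = show.find("Title").text
show.insert(2, desc)

result = ET.tostring(show, encoding="unicode")
```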

Now to test the function:

>>> xml = '''
<Root>
 <Movie>
  <Provider>xxx2</Provider>
  <Title>The World's Fastest Indian</Title>
  <SortTitle>World's Fastest Indian, The</SortTitle>
 </Movie>
 <TVSeries>
  <Provider>xxx</Provider>
  <Title>The World's Fastest Indian</Title>
  <Description> The World's Fastest Indian </Description>
  <SortTitle>World's Fastest Indian, The</SortTitle>
  </TVSeries> // note that I changed the v to an upper-case V
</Root>'''
>>> root = ET.fromstring(xml)
>>> insert_description(root)
>>> print ET.tostring(root)
<Root>
 <Movie>
  <Provider>xxx2</Provider>
  <Title>The World's Fastest Indian</Title>
  <Description>The World's Fastest Indian</Description>
  <SortTitle>World's Fastest Indian, The</SortTitle>
 </Movie>
 <TVSeries>
  <Provider>xxx</Provider>
  <Title>The World's Fastest Indian</Title>
  <Description> The World's Fastest Indian </Description>
  <SortTitle>World's Fastest Indian, The</SortTitle>
 </TVSeries> // note that I changed the v to an upper-case V
</Root>

I formatted the latter output with indentation to make what has happened clearer.
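
The test above uses a document without a namespace, while the Feed element in the question declares a default one. In that case the tag names need the {uri} prefix, otherwise find() will not match anything. Below is a rough sketch of how the same idea could be adapted. It assumes the standard-library xml.etree.ElementTree on Python 3, and the namespace URI is copied from the (anonymised) Feed element in the question, so it would need to be replaced with the real one. The name add_missing_descriptions and the choice to insert right after Title instead of at a fixed index are mine, not part of the original code.

```python
import xml.etree.ElementTree as ET

# Assumption: placeholder URI copied from the Feed element in the question.
NS = "http://schemas.yyy.com/xxxx/2011/06/13/ingestion"
TYPES = ["TVEpisode", "TVShow", "Movie", "TVSeries", "TVSeason"]


def add_missing_descriptions(root, ns=NS, types=TYPES):
    """Copy Title into a new Description for each listed element lacking one.

    Returns the number of Description elements that were added.
    """
    added = 0
    for type_name in types:
        # list() so the tree is not modified while it is being walked
        for item in list(root.iter("{%s}%s" % (ns, type_name))):
            if item.find("{%s}Description" % ns) is not None:
                continue  # already has a Description
            title = item.find("{%s}Title" % ns)
            if title is None:
                continue  # no Title to copy from
            desc = ET.Element("{%s}Description" % ns)
            desc.text = title.text
            desc.tail = title.tail  # keep the same whitespace as after Title
            # Insert directly after Title rather than at a hard-coded index.
            item.insert(list(item).index(title) + 1, desc)
            added += 1
    return added


sample = """<Feed xmlns="%s">
<Movie><Provider>xxx2</Provider><Title>The World's Fastest Indian</Title><SortTitle>World's Fastest Indian, The</SortTitle></Movie>
<TVSeries><Title>A</Title><Description>B</Description></TVSeries>
</Feed>""" % NS

root = ET.fromstring(sample)
count = add_missing_descriptions(root)
```

With a real file, the same function could be called on tree.getroot() after ET.parse(...), followed by tree.write(...) as in the question. On ElementTree 1.3 or later, ET.register_namespace('', NS) before writing should keep the output free of ns0: prefixes.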


6 Comments

Thanks for the response, but it is not working. It gives the following error: AttributeError: 'str' object has no attribute 'makeelement'
@Tharanga Abeyseela Whoops, sorry. That was silly of me. I put a string in the OTHER wrong place to put a string. I updated the answer.
@Tharanga Abeyseela Updated one more time. I had your variables confused.
Sorry, my explanation was not very good; I just edited it. The script needs to go through the XML and check for a Description under TVShow, Movie, TVSeries, TVSeason, etc. If the Description is not there, it needs to add the Description element and then add the element's Title as the description text.
@Tharanga Abeyseela Oh, I see. I have updated my answer in light of your updates, and I have provided a function that works well for me. If you choose to implement this function, you will just need to make sure you enter the parent element that holds Movie, TVShow, etc. as the parameter.