Here is my XML:

<?xml version="1.0" encoding="UTF-16" standalone="no"?>
<table rows="3" cols="8" style="" render="1" datatype="2" tabletype="dynamictable1int" primary="" assetClass="Fixed Income" bookName="2015-05-20 LM BW GFI_CN" languageCode="12" languageWord="Chinese">
    <tbody>
        <row condition="TCH">
            <entry></entry>
            <entry> 1 个月 </entry>
            <entry> 3 个月 </entry>
            <entry> 1 年 </entry>
            <entry> 3 年 </entry>
            <entry> 5 年 </entry>
            <entry> 10 年 </entry>
            <entry> 从创建之日起 </entry>
        </row>
   </tbody>
</table>

When I run this XML through my parser, these are the values I get:

['1 个月', '3 个月', '1 年', '3 年', '5 年', '10 年', '从创建之日起']

I am using Python 3:

table = XMLTable(xml, text_columns='ALL')
categoryNames = row2[0][1:]

When I checked the type of categoryNames, it returned <class 'str'>.

I want these characters to appear in Chinese, because I use them as category names for my bar chart, which I generate with reportlab. The chart renders the same values as the ones shown above. When I run the same code under Python 2.7, it renders perfectly.

Here is the code for XMLTable:

class XMLTable:
    error = object()    #unique

    @verboseError
    def __init__(self, xml, text_columns=(), floatDefault=0):
        # Strip inline formatting tags before parsing (at most 1000 of each, as before).
        for tag in ('emphasis', 'para', 'sub', 'sup', 'superscript', 'sbr'):
            xml = xml.replace('<%s>' % tag, '', 1000)
            xml = xml.replace('</%s>' % tag, '', 1000)
        self.xml = xml
        self.data = []

        #print xml
        #pdb.set_trace()
        self.tree = pyRXPU.Parser().parse(xml)
        self.tags = dict()
        self.tags.update(self.tree[1])
        self.tableTag = NonEscapingTagWrapper(self.tree)
        if hasattr(self.tableTag, 'assetClass'):
            self.assetClass = self.tableTag.assetClass
        else:
            self.assetClass = None
        for i, row in enumerate(self.tableTag.tbody):
            if hasattr(row, 'selrow') and row.selrow == '0': continue
            newRow = []
            for col_no, entry in enumerate(row):
                if hasattr(entry, 'selcol') and entry.selcol == '0':
                    continue
                value = stripTags(tt2xml(entry._children))
                if text_columns != 'ALL':
                    if col_no not in text_columns:
                        v = re.sub(r"[^-\d\.]",'', value)
                        try:
                            value = float(v)
                        except ValueError:
                            if floatDefault is XMLTable.error:
                                annotateException('\nrow=%d col=%d value=%r cannot be converted to float' % (i,col_no,value))
                            else:
                                value = floatDefault
                newRow.append(value)
            self.data.append(newRow)

    @verboseError
    def getFormat(self, row, col):
        value = str(self.tableTag.tbody[row][col])
        return getNumberFormat(value)
    @verboseError
    def __len__(self):
        return len(self.data)

    @verboseError
    def __iter__(self):
        return self.data.__iter__()

    @verboseError
    def __getitem__(self, key):
        if isinstance(key, int):
            return self.data[key]
        elif isinstance(key, slice):
            return self.data[key]
        else:
            return self.tags[key]

    @verboseError
    def __setitem__(self, key, item):
        if isinstance(key, int):
            self.data[key] = item
        elif isinstance(key, slice):
            self.data[key] = item
        else:
            self.tags[key] = item

    @verboseError
    def __delitem__(self, key):
        if isinstance(key, int):
            del self.data[key]
        elif isinstance(key, slice):
            del self.data[key]
        else:
            del self.tags[key]

    @verboseError
    def keys(self):
        return self.tags.keys()

    @verboseError
    def items(self):
        return self.tags.items()

    @verboseError
    def get(self, key, default=None):
        if key in self.tags.keys():
            return self.tags[key]
        # Compare against None so falsy defaults such as 0 or '' are still returned.
        if default is not None:
            return default
        raise KeyError('there is no such key %s'%key)
    @verboseError
    def __str__(self):
        return self.xml

    @verboseError
    def __repr__(self):
        return str(self.tree)

    @verboseError
    def clean(self, remove_if_zeros=None):
        # Rebuild the list in place; removing items while iterating skips rows.
        self.data[:] = [row for row in self.data if row[remove_if_zeros] != 0]
        return self
  • Where are you getting your XMLTable() from? Commented Sep 19, 2018 at 10:37
  • 1
    XMlTable is my own parser script. I am adding the script as well in the question itself. Commented Sep 19, 2018 at 10:49
  • I hope this helps: stackoverflow.com/questions/3883573/… Commented Sep 19, 2018 at 10:57
  • and this: stackoverflow.com/questions/2688020/… Commented Sep 19, 2018 at 10:58
  • In both of those Stack Overflow answers they decode the string, but we can't decode the string here. Commented Sep 19, 2018 at 12:35

1 Answer

Have you tried opening the XML file with an explicit encoding? For example, for a file named abc.xml:

xml = open("abc.xml", encoding="utf-8").read()

This worked for me.
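
To illustrate why the explicit encoding matters, here is a minimal sketch. It assumes the file on disk is UTF-8; the sample string and the temporary abc.xml file are made up for the demonstration and are not the asker's real data. Reading with the matching encoding returns the original characters, while a mismatched one (latin-1 here, standing in for a wrong platform default) silently produces garbage.

```python
import os
import tempfile

# Hypothetical sample content standing in for the real table XML.
sample = "<row><entry> 1 个月 </entry><entry> 从创建之日起 </entry></row>"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "abc.xml")

    # Write the file as UTF-8 so the on-disk encoding is known.
    with open(path, "w", encoding="utf-8") as f:
        f.write(sample)

    # Matching encoding: the text decodes back to the original characters.
    with open(path, encoding="utf-8") as f:
        good = f.read()

    # Mismatched encoding: no error is raised, but the characters are wrong.
    with open(path, encoding="latin-1") as f:
        bad = f.read()

print(good == sample)
print(bad == sample)
```

If the file really is stored as UTF-16, as the XML declaration in the question suggests, the same idea applies with encoding="utf-16"; which one is correct depends on how the file was actually saved.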

1 Comment

It worked. I was getting the same error while writing to XML.