
I have characters \u002d, \u2019, \u2022, \u25ba, \u2013, etc. coming in my data, and I have to do json.loads(data).

I tried doing

data1 = data.encode('utf-8')
json.loads(data1)

I still get an error.

I also tried the below, but it also ended in an error:

b1 = data.encode('ascii', 'ignore')
b2 = json.loads(b1)

It works if I replace the characters in my data, e.g. '\u002d' with '-', but I do not know what other characters might creep in, so I am looking for a solution that handles these characters in general.

1 Answer


There is no need to encode the data.

Feed it directly to json.loads(); the JSON standard uses \u.... escape codes to denote unicode values too.
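
For example, a quick interpreter check (the sample JSON string below is made up for illustration; the doubled backslashes make \u2019 etc. literal six-character JSON escapes rather than Python escapes):

>>> import json
>>> json.loads('"\\u2019 \\u2022 \\u002d"')
u'\u2019 \u2022 -'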

The values are not encoded in UTF-8; the Python json module will handle them for you.

Even if the data were encoded in UTF-8, the json module would handle that for you as well. And even if it didn't, you'd use str.decode(), not str.encode().
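
As a small illustration of that (assuming Python 2.7, as in the question's traceback), json.loads() also accepts a UTF-8 encoded byte string directly:

>>> import json
>>> json.loads(u'{"quote": "\u2019"}'.encode('utf8'))
{u'quote': u'\u2019'}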

UTF-8 data looks different as well; the U+2019 codepoint looks like:

>>> u'\u2019'.encode('utf8')
'\xe2\x80\x99'

when encoded to UTF-8.


2 Comments

Yes, it is working. But now I am not able to write it to a file. It says:

Traceback (most recent call last):
  File "C:\Python27\AureusBAXProject.py", line 202, in <module>
    outfile.writerows(outlist)
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2022' in position 0: ordinal not in range(128)
@user1946217: Then use io.open() to open your output file too. Your unicode data needs to be encoded in that case; what encoding you use depends on what you need to do with the output CSV.
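
For reference, a minimal sketch of that suggestion (the filename and the sample text are placeholders, and Python 2 is assumed as in the traceback):

import io

# io.open() takes an explicit encoding, so unicode values such as
# u'\u2022' are encoded on write instead of hitting the ASCII default.
with io.open('output.txt', 'w', encoding='utf-8') as outfile:
    outfile.write(u'\u2022 example line\n')

If the rows go through csv.writer on Python 2, the values would instead need to be encoded to UTF-8 byte strings before calling writerows(), since that module does not accept unicode input directly.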
