0

How would I properly encode the following:

# -*- coding: utf-8 -*-

>>> 'What\x80\x99s Up: Balloon to the Rescue!'.encode('utf-8')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0x80 in position 4: ordinal not in range(128)
>>> 'What\x80\x99s Up: Balloon to the Rescue!'.decode('utf-8')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0x80 in position 4: invalid start byte

2 Answers

3

You've got two issues here. First, your UTF-8 byte sequence is wrong: it should be \xe2\x80\x99, so the leading \xe2 byte is missing. Second, you are using the wrong function: a byte string has to be decoded from UTF-8, not encoded:

>>> print 'What\xe2\x80\x99s Up: Balloon to the Rescue!'.decode('utf-8')
What’s Up: Balloon to the Rescue!
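As a rough sketch of the difference between the two byte sequences, the snippet below decodes both. It assumes, as the answer does, that the intended character is the right single quotation mark (U+2019); the variable names are only illustrative, and the b'' and u'' literals are used so the code runs on Python 2.7 as well as Python 3.

```python
# Complete UTF-8 sequence for U+2019: \xe2\x80\x99
good = b'What\xe2\x80\x99s Up: Balloon to the Rescue!'
# Sequence from the question: the \xe2 lead byte is missing
bad = b'What\x80\x99s Up: Balloon to the Rescue!'

# Decoding the complete sequence yields a unicode string.
text = good.decode('utf-8')

# Decoding the truncated sequence raises UnicodeDecodeError,
# because 0x80 cannot start a UTF-8 character.
try:
    bad.decode('utf-8')
    decoded_ok = True
except UnicodeDecodeError:
    decoded_ok = False

# If the damaged bytes cannot be fixed at the source, the 'replace'
# error handler substitutes U+FFFD for the invalid bytes instead of raising.
lossy = bad.decode('utf-8', 'replace')
```

The 'replace' handler only hides the damage; the original character is not recovered, so fixing the data where it is written is still the better option.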

3 Comments

Did you just guess which character OP meant?
@beerbajay Yes. Given two of the bytes are the same and it makes perfect sense in the context, I think I'm probably right. ;)
@spencercw Yeah, this seems to be a problem upstream. That was the value I had in the database, so I think I need to encode to UTF-8 before it gets stored.
0
>>> type('What\x80\x99s Up: Balloon to the Rescue!')
<type 'str'>

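So you can't encode it directly: it is a byte string (str), not a unicode object. In Python 2, calling .encode() on a str first decodes it implicitly with the ASCII codec, which is why you get a UnicodeDecodeError on byte 0x80.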

What is your Unicode input?
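To illustrate, here is a minimal sketch that performs the ASCII decode explicitly, which is the step Python 2 runs behind the scenes when .encode() is called on a str, and then shows the encode direction starting from a real unicode string. The names are hypothetical, the text assumes the character was meant to be U+2019, and the literals are chosen so the code also runs on Python 3.

```python
raw = b'What\x80\x99s Up: Balloon to the Rescue!'

# Explicit version of the implicit step: decode the bytes as ASCII.
# It fails at the first byte outside the 0-127 range.
try:
    raw.decode('ascii')
    failed_at = None
    bad_byte = None
except UnicodeDecodeError as exc:
    failed_at = exc.start                    # index of the offending byte
    bad_byte = exc.object[exc.start:exc.end]

# Encoding works when the input is a unicode string to begin with.
source = u'What\u2019s Up: Balloon to the Rescue!'
encoded = source.encode('utf-8')
```

Here failed_at is 4, matching "position 4" in the traceback from the question, and encoded contains the full \xe2\x80\x99 sequence.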
