I have seen this question, but I still have doubts: how can I convert a variable to Unicode at run time? Is it right to use the unicode function? Are there other ways to convert a string at run time?

print(u'Cami\u00f3n')  # prints the special character correctly

name = unicode('Cami\u00f3n')
print(name)  # wrong: prints Cami\u00f3n

name.encode('latin1')
print(name.decode('latin1'))  # wrong: prints Cami\u00f3n

encoded_id = u'abcd\xc3\x9f'
encoded_id.encode('latin1').decode('utf8')
print(encoded_id.encode('latin1').decode('utf8'))  # prints correctly

I have read a lot of Python Unicode questions on Stack Overflow, but I can't understand this behaviour.

  • What are you trying to do? What data are you trying to convert? Where is it from? What does "on running time" mean? Commented Jun 16, 2015 at 11:39
  • \uhhhh escape sequences only work in Python unicode literals. If you have data with such escape sequences, you may well have JSON data instead, which uses the same syntax. If so, use a JSON parser for that data. Commented Jun 16, 2015 at 11:41
  • You can ask Python to interpret such sequences with a special codec, but that is usually the wrong interpretation of your data. Please share a sample of your actual data so we can help you with that. Commented Jun 16, 2015 at 11:42
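
As a minimal sketch of the JSON route suggested in the comments, assuming the data really is JSON: a JSON parser turns \uhhhh escapes into the characters they stand for. The sample string and the variable names below are made up for illustration and are not the asker's actual data.

```python
import json

# Hypothetical sample: JSON text in which the non-ASCII character
# is written as a \uXXXX escape (the backslash is a literal character here).
raw = '{"name": "Cami\\u00f3n"}'

data = json.loads(raw)
name = data["name"]

# The escape has been turned into the single character U+00F3.
print(len(name))                 # 6
print(name == u"Cami\u00f3n")    # True
```

If the data is not JSON, this is the wrong tool, which is why the comments ask for a sample of the real input first.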

1 Answer

This happens because, if you don't specify an encoding for the unicode function, then:

unicode() will mimic the behaviour of str() except that it returns Unicode strings instead of 8-bit strings. More precisely, if object is a Unicode string or subclass it will return that Unicode string without any additional decoding applied.

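In a Python 2 str literal, \u00f3 is not an escape sequence, so the string holds a literal backslash followed by u00f3. unicode() converts those characters as they are, and the backslash stays in the result: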

>>> name=unicode('Cami\u00f3n')
>>> print(name)
Cami\u00f3n
>>> name
u'Cami\\u00f3n'
       ^ 

To get rid of this problem, you can pass 'unicode-escape' as the encoding, so that the escape sequences in the string are interpreted while it is decoded:

>>> name=unicode('Cami\u00f3n','unicode-escape')
>>> name
u'Cami\xf3n'
>>> print(name)
Camión
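
The same decoding can be sketched without the Python 2 only unicode() function. The following is an illustration, not the only way to do it: it assumes the input is a byte string containing a literal backslash, and it uses codecs.decode, which accepts the unicode_escape codec on both Python 2 and Python 3. The sample data and variable names are assumptions.

```python
import codecs

# Assumed sample data: bytes holding a literal backslash followed by "u00f3",
# as they might arrive from a file or a database.
raw = b"Cami\\u00f3n"

# Plain ASCII decoding keeps the backslash sequence as six separate characters.
plain = raw.decode("ascii")

# The unicode_escape codec interprets the sequence as one character, U+00F3.
decoded = codecs.decode(raw, "unicode_escape")

print(len(plain))    # 11
print(len(decoded))  # 6
```

The lengths make the difference visible without depending on how the console renders non-ASCII output.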

3 Comments

Works like a charm. I tried to use it to print data from a database with special characters for code tests. Thanks!
Note that unicode-escape interprets more than just the \uhhhh escapes. If there are other `\` backslash-escapes in the text, those too will be interpreted, and may not be what you expected.
@Ulyarez You're welcome. Also take note of Martijn's comment!
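
The caveat raised in the comments can be illustrated with a small sketch. The input below is invented, and decode_u_escapes is a hypothetical helper, not a library function. It is written for Python 3 and handles only simple four-digit \uXXXX sequences (no surrogate pairs, no escaped backslashes).

```python
import codecs
import re

# Hypothetical sample: one \uXXXX escape plus an unrelated backslash before "team".
raw = b"Cami\\u00f3n\\team"

# unicode_escape interprets every backslash escape, so "\t" becomes a tab.
decoded_all = codecs.decode(raw, "unicode_escape")


def decode_u_escapes(text):
    """Replace only \\uXXXX sequences (hypothetical helper, not a library API)."""
    return re.sub(
        r"\\u([0-9a-fA-F]{4})",
        lambda match: chr(int(match.group(1), 16)),  # use unichr on Python 2
        text,
    )


# Only the \u00f3 sequence is converted; the other backslash is left alone.
decoded_u_only = decode_u_escapes(raw.decode("ascii"))

print("\t" in decoded_all)     # True
print("\\" in decoded_u_only)  # True
```

Whether the narrower replacement is appropriate depends on where the data comes from, so treat it as one possible approach rather than a general fix.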