Decode Unicode string in Python

Question

I'd like to decode the following string:

t\u028c\u02c8m\u0251\u0279o\u028a\u032f

It should be the IPA of 'tomorrow' as given in a JSON string from http://rhymebrain.com/talk?function=getWordInfo&word=tomorrow

My understanding is that it should be something like:

x = 't\u028c\u02c8m\u0251\u0279o\u028a\u032f'
print x.decode()

I have tried the solutions from several related questions (and several others that more or less apply), as well as several permutations of their parts, but I can't get it to work.

Thank you

Justin O Barber · Accepted Answer · 2014-03-15 01:29:22Z

You need a u before your string (in Python 2.x, which you appear to be using) to indicate that this is a unicode string:

>>> x = u't\u028c\u02c8m\u0251\u0279o\u028a\u032f'  # note the u
>>> print x
tʌˈmɑɹoʊ̯

If you have already stored the string in a variable, you can use the following constructor to convert the string into unicode:

>>> s = 't\u028c\u02c8m\u0251\u0279o\u028a\u032f'  # your string has a unicode-escape encoding but is not unicode
>>> x = unicode(s, encoding='unicode-escape')
>>> print x
tʌˈmɑɹoʊ̯
>>> x
u't\u028c\u02c8m\u0251\u0279o\u028a\u032f'  # a unicode string
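
The examples above are Python 2. As a rough sketch of the same idea in Python 3 (an assumption on my part, since the question appears to use Python 2), the escaped text can be decoded with the unicode_escape codec or, because it comes from a JSON response, parsed as a JSON string literal. The variable names here are illustrative only, and the raw literal stands in for whatever escaped text you actually received:

```python
import json

# Raw text with literal backslash-u escapes, as it might arrive undecoded.
# The r prefix keeps the backslashes instead of interpreting them.
raw = r't\u028c\u02c8m\u0251\u0279o\u028a\u032f'

# Option 1: go through bytes and apply the unicode_escape codec.
# This assumes the raw text is pure ASCII apart from the escapes.
decoded = raw.encode('ascii').decode('unicode_escape')

# Option 2: wrap the text in quotes and let the JSON parser handle the escapes.
# This assumes the text contains no unescaped double quotes.
from_json = json.loads('"' + raw + '"')

# Both approaches should yield the same 9-character string.
print(decoded == from_json, len(decoded))
```

On a UTF-8 terminal, print(decoded) should display the IPA text directly. If you are parsing the whole JSON response with json.loads anyway, the escapes are already decoded for you and no extra step should be needed.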

edited Mar 15, 2014 at 1:29

answered Mar 15, 2014 at 1:15

Justin O Barber


1 Comment

aspasia Over a year ago

your edit was exactly what I was trying to figure out just now, thanks for the forethought.
