
added 522 characters in body


edited Jul 11, 2010 at 22:11


This will be safe for unicode(s):

s = s.decode('utf-8')

EDIT. It looks like your string is encoded in such a way that “ (LEFT DOUBLE QUOTATION MARK) becomes \x93 and ” (RIGHT DOUBLE QUOTATION MARK) becomes \x94. A number of code pages have this mapping, and CP1250 is one of them, so you can use this:

s = s.decode('cp1250')
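
A minimal sketch of the difference between the two decode calls, written for Python 3, where the same `decode` method lives on `bytes`. The sample value `raw` is hypothetical and is not the asker's actual data:

```python
# Hypothetical sample: a word wrapped in the two bytes discussed above.
raw = b"\x93quoted\x94"

try:
    text = raw.decode("utf-8")
except UnicodeDecodeError:
    # 0x93 cannot start a UTF-8 sequence, so UTF-8 decoding fails;
    # fall back to the single-byte code page instead.
    text = raw.decode("cp1250")

print(repr(text))
```

Trying UTF-8 first is only a convenience for this illustration; if you already know the data is in CP1250, decode with it directly.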

For all the codepages which map “ to \x93 see here (all of them also map ” to \x94, which can be verified here).
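
If the linked tables are not at hand, the same check can be sketched locally in Python 3. The candidate list and the helper name `codecs_mapping_quotes` are assumptions made for illustration; the list is not exhaustive:

```python
# Assumed, non-exhaustive list of codec names to test.
CANDIDATES = [
    "cp1250", "cp1251", "cp1252", "cp1253", "cp1254",
    "cp1255", "cp1256", "cp1257", "cp1258", "latin-1", "cp437",
]


def codecs_mapping_quotes(candidates=CANDIDATES):
    """Return the candidates that decode \\x93\\x94 to the curly double quotes."""
    matches = []
    for name in candidates:
        try:
            decoded = b"\x93\x94".decode(name)
        except (UnicodeDecodeError, LookupError):
            # Codec is unknown here or leaves these bytes undefined: skip it.
            continue
        if decoded == "\u201c\u201d":
            matches.append(name)
    return matches


print(codecs_mapping_quotes())
```

Several code pages will usually match, so this narrows the choice without settling it; other characters in the string have to decide between them.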


answered Jul 11, 2010 at 20:08

Bolo


This will be safe for unicode(s):

s = s.decode('utf-8')
