Django: tracking down DjangoUnicodeDecodeError error

Question

While attempting to render a template I'm getting the following error:

DjangoUnicodeDecodeError: 'utf8' codec can't decode bytes in position 26-27: invalid data. You passed in '\xce\x88\xce\xbe\xce\xbf\xce\xb4\xce\xb1 \xcf\x83\xcf\x84\xce\xb7\xce\xbd \xce\xb5\xcf\x81\xce\xb3\xce...' (<type 'str'>)

The template is fairly large and complex, so I'm hoping for some tips on how to track down where exactly this is coming from.

A few facts that might be helpful:

The template is generally unicode friendly; we display a fair amount of unicode data through it
The MySQL table the data is coming from has utf8 encoding
This is a strange one: the error doesn't show up on my staging server when using the same code base and the same production data. The setup is very similar to the production server: Python 2.5.1, Django 1.1.1, MySQL 5.0.38, Ubuntu.

I'm not sure where exactly to look for the badly encoded data; any hints or pointers would be appreciated.

Ignacio Vazquez-Abrams · Accepted Answer · 2010-11-24 18:37:16Z

4

Somewhere you're truncating a string, but you're doing it on a str instead of a unicode so you end up splitting a UTF-8 character sequence in half. Always perform text operations on unicode, never str.
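
To see the failure in isolation, here is a minimal sketch. It uses Python 3 naming, where the question's str corresponds to bytes and unicode corresponds to str. The sample reuses the bytes shown in the error message; the letters after the first 14 characters are a guess, added only so the string is long enough to truncate.

```python
# Sample text: the first 14 characters match the bytes in the error message;
# the remaining letters are an assumed continuation, for illustration only.
text = u"\u0388\u03be\u03bf\u03b4\u03b1 \u03c3\u03c4\u03b7\u03bd \u03b5\u03c1\u03b3\u03b1\u03c3\u03af\u03b1"
encoded = text.encode("utf-8")  # UTF-8 byte string

# Truncating the byte string: byte 26 is the lead byte of a two-byte
# character, so the slice ends in the middle of that character.
cut_bytes = encoded[:27] + b"..."
try:
    cut_bytes.decode("utf-8")
    bytes_ok = True
except UnicodeDecodeError as exc:
    bytes_ok = False
    bad_position = exc.start  # index of the first undecodable byte

# Truncating the decoded text instead: slices always fall between characters.
cut_text = encoded.decode("utf-8")[:14] + u"..."
text_ok = cut_text.encode("utf-8").decode("utf-8") == cut_text

print(bytes_ok, bad_position, text_ok)
```

The byte slice fails to decode at the same position the exception in the question reports, while the text slice encodes and decodes without error.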


3 Comments

Parand Over a year ago

Aaah, excellent suggestion, will try it when I get back in front of the computer.

Parand Over a year ago

Ignacio was dead on. I'd written a tag to truncate titles and had used str() instead of unicode() to convert the tag parameters to strings. Switched it over to unicode() and the problems went away.

Alper Over a year ago

This just bit me too. Should we then ever again use str()?

John Machin · Accepted Answer · 2010-11-24 19:56:11Z

What is reported by the exception is 26 bytes of valid UTF-8 followed by '\xce...'

It looks very much to me as though some piece of software, either in your code or in Django's code, is doing something like this:

def too_big_display(strg, maxlen):
    return strg[:maxlen-3] + "..."

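and in your case calling it with too_big_display(your_Greek_text_encoded_in_utf8, 30)

and so you are seeing a secondary error: with maxlen 30 the slice keeps 27 bytes, the last of which (\xce) is the first byte of a two-byte character, and \xce followed by . is not valid UTF-8.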

I suggest that you look very carefully through the traceback (which you should have shown us, and still can, by editing your question) to see whether there is any evidence of a primary error. If not, scrutinise your code for such a truncation.
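
One way to make such a helper safe is to decode to text before slicing. The sketch below is only an assumption about how a fix could look, not code from the question or from Django. The name too_big_display_text is hypothetical, byte input is assumed to be UTF-8, and the naming is Python 3 (bytes for the question's str, str for unicode).

```python
def too_big_display_text(value, maxlen):
    """Return value as text, cut to maxlen characters including a trailing '...'.

    Expects maxlen >= 3. Values that already fit are returned unchanged.
    """
    # Decode byte strings first so that slicing counts characters, not bytes.
    if isinstance(value, bytes):
        value = value.decode("utf-8")  # assumes the bytes are valid UTF-8
    if len(value) <= maxlen:
        return value
    # Reserve three characters for the ellipsis.
    return value[:maxlen - 3] + u"..."


# Sample text; the letters after the first 14 characters are an assumption.
greek = u"\u0388\u03be\u03bf\u03b4\u03b1 \u03c3\u03c4\u03b7\u03bd \u03b5\u03c1\u03b3\u03b1\u03c3\u03af\u03b1"
shortened = too_big_display_text(greek.encode("utf-8"), 10)

# The result is text, so encoding it to UTF-8 round-trips cleanly.
assert shortened.encode("utf-8").decode("utf-8") == shortened
```

Unlike the function above, this version leaves short values untouched instead of always appending the ellipsis; that is a design choice of the sketch, not something the original requires.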

quanalyst · Accepted Answer · 2013-02-27 22:16:58Z

0

In case somebody runs into a situation similar to mine: I recently changed a MySQL table to use the utf8_bin collation and hit the same problem. I found out that staging had MySQL-python 1.2.3; upgrading to 1.2.4 solved the problem for me. I am using Python 2.7 and Django 1.4.2.

