I have a script that stores the content of arbitrary web pages in a MySQL database (via MySQLdb). For some of the pages, I get:

...
  File "/usr/lib/python2.7/dist-packages/MySQLdb/connections.py", line 264, in literal
    return self.escape(o, self.encoders)
  File "/usr/lib/python2.7/dist-packages/MySQLdb/connections.py", line 202, in unicode_literal
    return db.literal(u.encode(unicode_literal.charset))
UnicodeEncodeError: 'latin-1' codec can't encode character u'\u203a' in position 172550: ordinal not in range(256)

When I used sqlite3, I had no problems with that.

I tried creating the database with a UTF-8 character set, without success:

CREATE DATABASE the_base CHARACTER SET utf8

Question: How do I encode/decode the data correctly so that it is stored in the DB without any problems?

P.S. Character encoding under python is a never ending story...

Solved

I added the charset to the connect() call:

MySQLdb.connect( ... charset='utf8', use_unicode=True )
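
As a minimal sketch of that fix, the snippet below wraps the two keyword arguments in a small helper. The helper names (utf8_connect_kwargs, connect_utf8) and the example host/user/database values are hypothetical and only for illustration; charset and use_unicode are the parameters from the call above.

```python
def utf8_connect_kwargs(**params):
    """Return connection keyword arguments with UTF-8 defaults filled in.

    Hypothetical helper: it only adds the two settings from the fix above
    (charset='utf8', use_unicode=True) unless the caller already set them.
    """
    kwargs = dict(params)
    kwargs.setdefault("charset", "utf8")      # 'utf8', not 'utf-8'
    kwargs.setdefault("use_unicode", True)    # pass/receive unicode objects
    return kwargs


def connect_utf8(**params):
    """Open a MySQLdb connection that talks UTF-8 to the server."""
    import MySQLdb  # imported lazily so the helper above works without it
    return MySQLdb.connect(**utf8_connect_kwargs(**params))


# Example (assumed values, adjust to your setup):
# conn = connect_utf8(host="localhost", user="me", passwd="secret", db="the_base")
```

Note that MySQL's utf8 character set stores at most three bytes per character, so characters outside the Basic Multilingual Plane would need utf8mb4 on both the table and the connection; the character in the traceback (u'\u203a') fits in utf8.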

1 Answer

You can't store arbitrary Unicode text in an encoding that has only 256 code points, such as latin-1. Change the encoding used by your database to one that covers all of Unicode, e.g. UTF-8, and you should be good to go.
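
The limitation can be reproduced without a database. The sketch below tries to encode the character from the traceback (u'\u203a') with both codecs; can_encode is a hypothetical helper name used only for this illustration.

```python
# -*- coding: utf-8 -*-

def can_encode(text, codec):
    """Return True if `text` can be encoded with `codec`, else False."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


ch = u"\u203a"  # the character reported in the traceback

latin1_ok = can_encode(ch, "latin-1")  # code point 0x203A is above 0xFF
utf8_ok = can_encode(ch, "utf-8")      # UTF-8 covers all of Unicode
utf8_bytes = ch.encode("utf-8")        # three bytes: E2 80 BA
```

Here latin1_ok is False and utf8_ok is True, which matches the 'ordinal not in range(256)' message: the driver was encoding the query as latin-1 before sending it.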

3 Comments

I just tried CREATE DATABASE the_base CHARACTER SET utf8, but I still get the same error.
Make sure you have charset='utf-8' in your connection parameters (especially if your database and server have different encodings).
Works! Actually, it needs to be utf8; utf-8 throws an error.