
I'm working on a Python script and have a raw string with UTF-8 encoding. First I decode it from UTF-8, then some processing is done, and at the end I encode it back to UTF-8 and insert it into a MySQL database, but the characters in the database are not displayed correctly.

str = '<term>Beiträge</term>'
str = str.decode('utf8')
...
...
...
str = str.encode('utf8')

After that, the string appears correctly in a text file, but in the MySQL database it looks like this:

 <term>"Beiträge</term>

Any idea why this happened? :-(

  • Check your DB connection charset. Commented May 17, 2011 at 13:11
  • str is not a good variable name. It hides the built-in function str(). Also, use different variable names for values of different types. Commented May 17, 2011 at 13:11

2 Answers

1

Assuming you are using the MySQLdb library, you need to create connections using the keyword arguments:

use_unicode: If True, text-like columns are returned as unicode objects using the connection's character set. Otherwise, text-like columns are returned as normal strings. Unicode objects will always be encoded to the connection's character set regardless of this setting.

and

charset: If supplied, the connection character set will be changed to this character set (MySQL-4.1 and newer). This implies use_unicode=True.

You should also check the encoding of your db tables.
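As a rough sketch of what passing those keyword arguments could look like: the helper name `utf8_connect_kwargs` and the host, user, password, and database values are made up for illustration, and only `charset` and `use_unicode` come from the documentation quoted above. MySQLdb is imported lazily so the snippet can still be loaded on a machine without the library.

```python
def utf8_connect_kwargs(host, user, passwd, db):
    """Build keyword arguments for MySQLdb.connect() with a UTF-8 connection.

    Hypothetical helper: only charset and use_unicode are taken from the
    answer; the remaining parameters are placeholders.
    """
    return {
        "host": host,
        "user": user,
        "passwd": passwd,
        "db": db,
        "charset": "utf8",     # connection character set
        "use_unicode": True,   # text columns come back as unicode objects
    }


def connect_utf8(**kwargs):
    """Open a connection; requires the MySQLdb package to be installed."""
    import MySQLdb  # imported here so the module loads without MySQLdb
    return MySQLdb.connect(**kwargs)


# Usage (placeholder credentials, needs a reachable MySQL server):
# conn = connect_utf8(**utf8_connect_kwargs("localhost", "user", "secret", "mydb"))
```

If the connection already uses utf8, as the asker reports, the table and column character sets are the next thing to check.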


2 Comments

The connection's charset is utf8, and I'm using MySQLdb.
Try utf8_bin (or whatever the UTF-8 binary encoding is called in MySQL; I didn't look it up).
0

To make a string a Unicode string, use the string prefix 'u'. See also http://docs.python.org/reference/lexical_analysis.html#literals

Maybe your example works by just adding the prefix in the initial assignment.
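A minimal sketch of that idea, using the string from the question. The variable names are illustrative, and the last step is only an assumption about what may be going wrong: it shows how correctly encoded UTF-8 bytes turn into garbage when some layer reads them as Latin-1, which is a common cause of this kind of symptom.

```python
# -*- coding: utf-8 -*-

# Unicode literal: the 'u' prefix makes this a text string, not a byte string.
text = u'<term>Beiträge</term>'

# Encode once, right before writing to a file or sending to the database.
encoded = text.encode('utf8')

# Decoding with the same codec gives back the original text.
roundtrip = encoded.decode('utf8')

# Assumed failure mode: the same bytes interpreted as Latin-1 instead of UTF-8.
# The two bytes of 'ä' (0xC3 0xA4) become two separate characters.
misread = encoded.decode('latin-1')

print(roundtrip == text)   # True
print(misread == text)     # False
```

The string itself is fine after the round trip, so if the database shows something else, the mismatch is more likely in the connection or table character set than in the encode/decode calls.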
