I want to copy data from one database to another, so I wrote a Python script to do it.

The names are in German, but I don't think that will get in the way of understanding my question.

The script does the following:

db = mysql.connect(db='', charset="utf8", use_unicode=True, **v.MySQLServer[server]);
...
cursor = db.cursor();

cursor.execute('select * from %s.%s where %s = %d;' % (eingangsDatenbankName, tabelle, syncFeldname, v.NEU))
daten = cursor.fetchall()

for zeile in daten:
    sql = 'select * from %s.%s where ' % (hauptdatenbankName, tabelle)
    ...
    for i in xrange(len(spalten)):
        sql += " %s, " % db_util.formatierFeld(unicode(str(zeile[i]), "utf-8"), feldTypen[i])

The function "db_util.formatierFeld" looks like this:

def formatierFeld(inhalt, feldTyp):

    if inhalt.lower() == "none":
        return "NULL"
    # String types
    if "char" in feldTyp.lower() or "text" in feldTyp.lower() or "blob" in feldTyp.lower() or "date" in feldTyp.lower() or "time" in feldTyp.lower():
        return '"%s"' % inhalt
    else:
        return '%s' % inhalt

To some of you this will look quite odd, but I can assure you I MUST do it this way, so please no discussion about style etc.

When I run this code, I get the following error message as soon as it hits a word with umlauts.

Traceback (most recent call last):
  File "db_import.py", line 222, in <module>
    main()
  File "db_import.py", line 219, in main
    importieren(server, lokaleMaschine, dbEingang, dbHaupt)
  File "db_import.py", line 145, in importieren
    sql += " %s, " %  db_util.formatierFeld(unicode(str(zeile[i]), "utf-8"), feldTypen[i])
UnicodeEncodeError: 'ascii' codec can't encode character u'\xfc' in position 1: ordinal not in range(128)

I do not understand why this string can't be built that way. In my opinion it should work, since I explicitly tell the program to use unicode here.

Does anybody have a guess as to what is going wrong?

1 Answer

The deep nesting of expressions makes this error harder to interpret.

In the line

sql += " %s, " % db_util.formatierFeld(unicode(str(zeile[i]), "utf-8"), feldTypen[i])

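where does the exception come from? With this much nesting it is hard to say for certain, but I would guess it comes from str(zeile[i]). If zeile[i] is a unicode object containing non-ASCII characters, you cannot convert it to a byte string using str, because in Python 2 str implicitly encodes with the default codec, which is ASCII. That also fits the traceback, which reports an encode error rather than a decode error. Instead, you must encode it explicitly using a codec that can represent all of the characters it contains.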
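To make that failing step visible on its own, here is a minimal sketch. It assumes the cell holds a value like u"f\xfcr" (a made-up example with "ü" at position 1, as in the traceback). The explicit .encode("ascii") call stands in for what str() does implicitly in Python 2, so the snippet also runs on Python 3.

```python
# -*- coding: utf-8 -*-
# Hypothetical cell value; u"\xfc" is "ü", sitting at position 1.
wert = u"f\xfcr"

# In Python 2, str(wert) implicitly performs wert.encode("ascii").
# The explicit call below reproduces that step.
try:
    wert.encode("ascii")
    fehler = None
except UnicodeEncodeError as e:
    fehler = e

# The exception type matches the one in the traceback.
print(type(fehler).__name__)

# Encoding with a codec that covers the character works.
kodiert = wert.encode("utf-8")
```

The names wert, fehler and kodiert are only illustrative and do not appear in the original script.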

However...

unicode(str(zeile[i]), "utf-8")

This is pointless if zeile[i] is already a unicode string: first you try to encode it to a byte string, then you try to decode it straight back into a unicode string. You could skip all that and just use zeile[i]. formatierFeld doesn't really matter here, because execution never gets that far.
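If the row can also contain byte strings, numbers or None, one way to avoid the round trip is a small conversion helper. The sketch below is only an illustration: als_text is a hypothetical name, and it assumes that any byte strings in the row are UTF-8 encoded.

```python
# -*- coding: utf-8 -*-

def als_text(wert, kodierung="utf-8"):
    """Return wert as a text (unicode) object without an implicit ASCII step.

    Hypothetical helper; assumes byte strings are encoded with `kodierung`.
    """
    text_typ = type(u"")  # unicode on Python 2, str on Python 3
    if isinstance(wert, bytes):
        # Byte strings are decoded explicitly with the given codec.
        return wert.decode(kodierung)
    if isinstance(wert, text_typ):
        # Already text: return unchanged, no encode/decode round trip.
        return wert
    # Numbers, None, etc.: build the text representation directly.
    return text_typ(wert)
```

With such a helper, the call in the loop could read db_util.formatierFeld(als_text(zeile[i]), feldTypen[i]). None becomes u"None", which the existing "none" check in formatierFeld still catches.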
