1

I am selecting values from a MySQL / MariaDB database that uses the latin1 charset with the latin1_swedish_ci collation. The data may contain characters from different European languages, such as Spanish ñ, German ä, or Norwegian ø.

I get the data with

#!/usr/bin/env python3
# coding: utf-8

...
sql.execute("SELECT name FROM myTab")
for row in sql:
    print(row[0])

This raises an error: UnicodeEncodeError: 'ascii' codec can't encode character '\xf1'. So I changed my print to

print(str(row[0].encode('utf8')))

and the result looks like this: b'\xc3\xb1'. I looked at "Working with utf-8 encoding in Python source", but I have already declared the header. Also, decode('utf8').encode('cp1250') does not help.

2 Comments
  • Thanks for the support. This returns UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf1 in position 0 Commented Jun 19, 2017 at 23:23
  • Possible duplicate of How to set sys.stdout encoding in Python 3? Commented Jun 26, 2017 at 21:23

2 Answers

3

Okay, the encoding issue has finally been solved. Coldspeed gave an important hint about the locale, so all kudos to him! Unfortunately, it was not that easy.

I found a workaround that fixes the problem.

import sys
sys.stdout = open(sys.stdout.fileno(), mode='w', encoding='utf8', buffering=1)

The solution is from Jack O'Connor, posted in this answer:
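
To see why re-opening stdout with an explicit encoding helps, the failure can be reproduced without a web server. This is only a minimal sketch: it assumes an in-memory stream can stand in for the stdout that Apache provides, and `make_stream` is a hypothetical helper, not part of the original workaround.

```python
import io


def make_stream(encoding):
    """Hypothetical helper: return (byte buffer, text stream writing into it)."""
    raw = io.BytesIO()
    text = io.TextIOWrapper(raw, encoding=encoding, line_buffering=True)
    return raw, text


# An ASCII-only stream behaves like stdout when the locale is not UTF-8
# (ANSI_X3.4-1968 is another name for ASCII).
_, ascii_out = make_stream("ascii")
try:
    print("\xf1", file=ascii_out)  # '\xf1' is the character ñ
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# A UTF-8 stream accepts the same character and writes it as two bytes.
raw, utf8_out = make_stream("utf-8")
print("\xf1", file=utf8_out)
utf8_out.flush()
utf8_bytes = raw.getvalue()
```

On Python 3.7 and later, `sys.stdout.reconfigure(encoding='utf-8')` should achieve the same effect as re-opening the file descriptor, and setting the `PYTHONIOENCODING=utf-8` environment variable for the server process is another option worth trying.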


2 Comments

+1 as this has allowed me to move forward. However, shouldn't this be written in flashing lights at the top of somewhere like docs.python.org/3/howto/unicode.html? My issue relates to using a jinja2 template. Where the template doesn't contain any unicode everything is OK, however, once there is a single unicode character somewhere in the template it breaks. My system locale is 'en_US.UTF-8' and no amount of encode/decode solved the problem. But the above just feels like such a fundamental thing that it cannot be the "correct way"?
A thousand times this! How is this not the default in 2018 :/
1

Python 3 picks the encoding for printed text automatically, based on your locale settings. If your locale doesn't match the encoding of the string, you get garbled text, or it doesn't work at all. You can try forcibly encoding the string as latin-1 and then decoding it as cp1252 (this seems to be the string's encoding).

print(row[0].encode('latin-1').decode('cp1252'))
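
It may help to see what this kind of conversion does in isolation. The following is a small sketch with made-up sample values, assuming `row[0]` arrives either as a correctly decoded `str` or as UTF-8 bytes that were wrongly decoded as latin-1; the variable names are illustrative only.

```python
# Case 1: the string is already correct ('\xf1' is the character ñ).
name = "\xf1"
round_trip = name.encode("latin-1").decode("cp1252")
# latin-1 and cp1252 agree on this byte, so nothing changes.
unchanged = round_trip == name

# Case 2: UTF-8 bytes were wrongly decoded as latin-1 (shows up as 'Ã±').
garbled = "\xc3\xb1"
repaired = garbled.encode("latin-1").decode("utf-8")
```

Both conversions only change the `str` object itself. Neither changes the encoding that `print` uses for stdout, so a `UnicodeEncodeError` raised by an ASCII stdout would likely persist either way.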

5 Comments

It seems the hint about the locale points toward the goal. Unfortunately, your approach still does not produce the correct result. But with the locale I am getting closer.
@JoePlatano what about row[0].encode('latin-1').decode('utf-8')?
No, that does not work. Well, it does in the shell: if I run the script as python script.py, it works. On the web server it does not. I added the lines print(sys.stdout.encoding) and print(sys.getdefaultencoding()). In the shell, both report utf-8. If I run the script through the browser, sys.stdout.encoding is ANSI_X3.4-1968 and sys.getdefaultencoding() is utf-8. I think there is some locale issue on Apache.
@JoePlatano Oh, I see... afraid I'm at a loss here. Hope you figure it out! You should try different encodings and see which works.
Yeah, thanks anyway for pushing me in a good direction! Hence the upvote. Thanks, buddy.
