I've got an application where you can log in via SAML2. I'm using the Apache Mellon module and reading the data like this:
name = request.environ['MELLON_name']
email = request.environ['MELLON_mail']
From that data I create a JWT using the flask_jwt_simple library. Then I call get_jwt_identity(), but the name in the response has the wrong encoding: it shows up as JiÅÃ Manes instead of Jiří Manes (Czech). How can I solve this problem?
Edit #1: locale command output
LANG=en_US.utf8
LANGUAGE=
LC_CTYPE="en_US.utf8"
LC_NUMERIC="en_US.utf8"
LC_TIME="en_US.utf8"
LC_COLLATE="en_US.utf8"
LC_MONETARY="en_US.utf8"
LC_MESSAGES="en_US.utf8"
LC_PAPER="en_US.utf8"
LC_NAME="en_US.utf8"
LC_ADDRESS="en_US.utf8"
LC_TELEPHONE="en_US.utf8"
LC_MEASUREMENT="en_US.utf8"
LC_IDENTIFICATION="en_US.utf8"
LC_ALL=en_US.utf8
Edit #2: I solved it on my VPS with the following Python code:
name = bytearray(request.environ['MELLON_name'], 'iso-8859-1').decode('utf-8')
But I would like a more universal solution :-/
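A minimal sketch of how that one-off workaround could be wrapped in a reusable helper. It assumes the server hands over UTF-8 bytes that were decoded as ISO-8859-1, which is the convention PEP 3333 describes for WSGI environ strings. The name fix_environ_text is hypothetical and not part of Flask or Mellon.

```python
def fix_environ_text(value: str) -> str:
    """Undo a UTF-8-read-as-Latin-1 mix-up; return the input unchanged if it does not apply."""
    try:
        # Recover the original bytes, then decode them as the UTF-8 they really are.
        return value.encode("iso-8859-1").decode("utf-8")
    except UnicodeError:
        # Either the text contains characters outside Latin-1 (so it was already
        # decoded correctly) or the bytes are not valid UTF-8; leave it alone.
        return value


# Simulate what the application receives: UTF-8 bytes decoded as ISO-8859-1.
received = "Jiří Manes".encode("utf-8").decode("iso-8859-1")
name = fix_environ_text(received)  # hypothetical usage; in the app, pass request.environ['MELLON_name']
```

The fallback keeps the helper safe to call on values that are already correct, but it is a heuristic: a genuine Latin-1 string whose bytes happen to form valid UTF-8 would still be converted.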
The values in request.environ are passed through environment variables by, presumably, Apache/Mellon. It's storing UTF-8, but apparently Python/Flask doesn't know that, so it assumes environment variables are in your default locale, which appears to be Latin-1. So, you need to read them as raw bytes (so you can explicitly decode('utf-8') them), or you need to configure Flask to override the default encoding, or you need to configure your system to en_US.UTF-8 or something else appropriate. I'm not sure how you do the first two, but I'm sure it's in the Flask docs.

Consider adding the flask tag to attract the resident Flask experts (and make it clear exactly how your server is getting launched/dispatched, or whatever else seems relevant).

If you call sys.getdefaultencoding(), what is returned?

If print(request.environ['MELLON_name']) prints out raw bytes, it must be a bytestring. Is it prefixed with a b, like b'Ji\xc3\x85\xc2\x99\xc3\x83\xc2\xad Manes'? In that case something is not right, because you can't pass a bytestring together with an encoding: bytearray(request.environ['MELLON_name'], 'iso-8859-1') should throw an exception.

The output of print(type(request.environ['MELLON_name'])) and print(repr(request.environ['MELLON_name'])) would be helpful.
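The bytestring quoted in the comments is consistent with UTF-8 text that was decoded as ISO-8859-1 into a str and then encoded as UTF-8 again. A short sketch that reproduces it under that assumption (the variable names are illustrative only):

```python
original = "Jiří Manes"

# What the identity provider sends: the name encoded as UTF-8.
utf8_bytes = original.encode("utf-8")
print(utf8_bytes)  # b'Ji\xc5\x99\xc3\xad Manes'

# What the application sees if those bytes are decoded as ISO-8859-1.
misread = utf8_bytes.decode("iso-8859-1")
print(type(misread).__name__)  # str

# Encoding the misread text as UTF-8 again yields the doubled byte sequence.
double_encoded = misread.encode("utf-8")
print(double_encoded)  # b'Ji\xc3\x85\xc2\x99\xc3\x83\xc2\xad Manes'
```

If type() reports str for the real value, that would explain why the bytearray(..., 'iso-8859-1') call in the question works rather than raising.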