Recently, I have been reading about the Python source code encoding, especially PEP 263 and PEP 3120.

I have the following code:

# coding:utf-8

s = 'abc∂´ƒ©'
ƒ = 'My name is'
ß = '˚ß˙ˆ†ˆ∆ ßå®åø©ˆ'
print('s =', s)
print('ƒ =', ƒ, 'ß =', ß)

This code works fine in Python 3 but results in a SyntaxError in Python 2.7.
I understand that this may have nothing to do with source code encoding.
So I would like to know whether there is a way to support Unicode variable names in Python 2.

More generally, I am having a hard time figuring out what practical problem the PEPs aim to solve, and how (and where) I can take advantage of the proposed solutions. I have read a few discussions on the topic, but they explain the correct syntax rather than answer my question.

2 Answers

No, Python 2 only supports ASCII names. From the language reference:

identifier ::=  (letter|"_") (letter | digit | "_")*
letter     ::=  lowercase | uppercase
lowercase  ::=  "a"..."z"
uppercase  ::=  "A"..."Z"
digit      ::=  "0"..."9"

Compare that to the much longer Python 3 version, which does support full Unicode names.
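To make the difference concrete, here is a minimal sketch that checks a few names against both rules. The regular expression is my own transcription of the Python 2 grammar quoted above, and the helper names are made up for illustration. It needs Python 3 to run, because it relies on `str.isidentifier()`, and it ignores reserved keywords.

```python
import re

# ASCII-only identifier rule, transcribed from the Python 2 grammar:
# a letter or underscore, followed by letters, digits or underscores.
ASCII_IDENTIFIER = re.compile(r'[A-Za-z_][A-Za-z0-9_]*\Z')


def is_python2_identifier(name):
    """Return True if `name` fits the ASCII-only Python 2 grammar."""
    return ASCII_IDENTIFIER.match(name) is not None


def is_python3_identifier(name):
    """Return True if Python 3 accepts `name` as an identifier (keywords aside)."""
    return name.isidentifier()


# '\u0192' is the letter f with hook, '\u00df' is the sharp s from the question.
for name in ['s', '_x1', '\u0192', '\u00df', '1abc']:
    # ascii() keeps the output printable on any terminal encoding.
    print(ascii(name), is_python2_identifier(name), is_python3_identifier(name))
```

The two non-ASCII names pass the Python 3 check and fail the Python 2 one, which is why the question's code is a SyntaxError in 2.7 regardless of the coding header.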

The practical problem the PEPs solve is this: before them, if a byte over 127 appeared in a source file (say, inside a unicode string), Python had no way of knowing which character it stood for, because the file could have been saved in any encoding. PEP 263 lets you declare the encoding in a header comment like the one in your code, and since PEP 3120 (Python 3) the source is read as UTF-8 by default when no such header is present.
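As a small illustration of that ambiguity, the sketch below decodes the same single byte under three encodings. The byte value and the helper `try_decode` are chosen just for this example.

```python
raw = b'\xe9'  # one byte with value 233, outside the ASCII range 0-127


def try_decode(data, encoding):
    """Decode `data` with `encoding`, or return None if the bytes are invalid."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


for encoding in ('latin-1', 'cp1251', 'utf-8'):
    # ascii() shows the resulting code point without depending on the terminal.
    print(encoding, ascii(try_decode(raw, encoding)))
```

Latin-1 reads the byte as U+00E9 (e with acute accent), cp1251 reads it as U+0439 (a Cyrillic letter), and on its own it is not valid UTF-8 at all. Without a declared or default encoding, the interpreter would have to guess between these.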

3 Comments

Sorry, but I don't understand what "a byte over 127" means. Do you mean that the ASCII code of a character is over 127?
Yes. ASCII defines the meanings of bytes 0 to 127. Almost all encodings you'll see encode those values the same as ASCII. But values over 127 are not ASCII and are usually completely different characters in different encodings.
This is the classic article: joelonsoftware.com/2003/10/08/… .

I don't think those two PEPs are about encoding in the sense of a variable name being a symbol such as ß; they concern the encoding of the characters inside variable values.

So if you change your code to use ASCII variable names, as in this example, it runs in Python 2.7:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

a = 'abc∂´ƒ©'
b = 'My name is'
c = '˚ß˙ˆ†ˆ∆ ßå®åø©ˆ'
# print is a statement in Python 2.7; print('a =', a) would print a tuple, parentheses included
print 'a =', a
print 'b =', b, 'c =', c

Hope that answers your question

Greetings Frame

2 Comments

This would be a hack around the problem rather than a solution. BTW, my problem here is interoperability between Python2 and Python3.
@KshitijSaraogi you can't expect perfect interoperability between the versions, there are things you can do in Python 3 that you simply can't do in Python 2. Special characters for variable names is one of those things.
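For what it's worth, one way to keep a single file running on both versions is to keep identifiers ASCII and move the non-ASCII names into data. This is only a sketch under that assumption: `variables`, `render` and `report` are hypothetical names, and the `u''` prefix is used because it is accepted by Python 2 and by Python 3.3 and later.

```python
# -*- coding: utf-8 -*-
# Identifiers stay ASCII; the Unicode "names" become dictionary keys.
variables = {
    u'ƒ': u'My name is',
    u'ß': u'˚ß˙ˆ†ˆ∆ ßå®åø©ˆ',
}


def render(mapping):
    """Return one 'name = value' line per entry, sorted by name."""
    return u'\n'.join(u'%s = %s' % (key, mapping[key]) for key in sorted(mapping))


report = render(variables)
```

Calling `print(report)` with a single argument behaves the same way in both versions, provided the terminal can display the characters.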
