0

This question aims at the following two scenarios:

  1. You want to assign a string containing special characters to a variable:

    special_char_string = "äöüáèô"

  2. You want to allow special characters in comments.

    # This is a comment with special characters in it: äöà etc.

At the moment I handle it this way:

# -*- encoding: utf-8 -*-
special_char_string = "äöüáèô".decode('utf8')
# This is a comment with special characters in it: äöà etc.

Works fine.

Is this the recommended way? Or is there a better solution for this?

2
  • 1
    Why .decode() ? Define a unicode string! u"äöüáèô" Commented Jun 22, 2011 at 12:15
  • @phant0m, I am confused. Are you suggesting that u"äöüáèô" is the same as "äöüáèô".decode('utf8') in every case? Commented Jun 22, 2011 at 12:21

2 Answers

4

Python will check the first or second line for an emacs/vim-like encoding specification.

More precisely, the first or second line must match the regular expression "coding[:=]\s*([-\w.]+)". The first group of this expression is then interpreted as encoding name. If the encoding is unknown to Python, an error is raised during compilation.

Source: PEP 263
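As a rough illustration of the rule quoted above, the pattern can be tried out with the re module. This is only a sketch in Python 3 syntax: declared_encoding is a hypothetical helper, not something Python exposes, and the real interpreter is stricter in that the match has to sit in a comment line.

```python
import re

# Pattern as quoted from PEP 263; only the first two lines are considered.
CODING_RE = re.compile(r"coding[:=]\s*([-\w.]+)")


def declared_encoding(source_text):
    """Return the encoding named on line 1 or 2 of source_text, or None."""
    for line in source_text.splitlines()[:2]:
        match = CODING_RE.search(line)
        if match:
            return match.group(1)
    return None


# "encoding" works because the pattern only looks for the substring "coding".
print(declared_encoding("# -*- encoding: utf-8 -*-\nx = 1\n"))  # utf-8
print(declared_encoding("#!/usr/bin/env python\n# vim: set fileencoding=latin-1 :\n"))  # latin-1
# A declaration on the third line is too late and is ignored.
print(declared_encoding("x = 1\n\n# coding: utf-8\n"))  # None
```

This also shows why the "encoding:" spelling used in the question is accepted even though "coding:" is the more common form.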

(A BOM would also make Python interpret the source as UTF-8.)
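For what it's worth, Python 3's standard library exposes this detection through tokenize.detect_encoding, which looks at both the BOM and the declaration. The sketch below assumes Python 3 (the function does not exist in Python 2), and detect is just a hypothetical wrapper name.

```python
import codecs
import io
import tokenize


def detect(source_bytes):
    """Return the source encoding Python 3 would infer for these bytes."""
    encoding, _lines = tokenize.detect_encoding(io.BytesIO(source_bytes).readline)
    return encoding


with_bom = codecs.BOM_UTF8 + b"x = 1\n"
with_declaration = b"# -*- coding: latin-1 -*-\nx = 1\n"

print(detect(with_bom))          # utf-8-sig
print(detect(with_declaration))  # a normalized name for Latin-1
```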

I would recommend that you use this instead of .decode('utf8'):

# -*- encoding: utf-8 -*-
special_char_string = u"äöüáèô"

Either way, special_char_string then holds a unicode object rather than a str. For a file that is saved and declared as UTF-8, the two spellings produce equal values:

>>> u"äöüáèô" == "äöüáèô".decode('utf8')
True

And the reverse:

>>> u"äöüáèô".encode('utf8')
'\xc3\xa4\xc3\xb6\xc3\xbc\xc3\xa1\xc3\xa8\xc3\xb4'
>>> "äöüáèô"
'\xc3\xa4\xc3\xb6\xc3\xbc\xc3\xa1\xc3\xa8\xc3\xb4'

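There is a technical difference, however: u"something" tells the parser that this is a unicode literal, so the text is decoded once at compile time using the declared source encoding, whereas "something".decode('utf8') builds a str first and decodes it at run time. The literal should therefore be a bit faster.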
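One caveat regarding the "in every case?" comment on the question: .decode('utf8') only gives the right result if the bytes in the file really are UTF-8. A small sketch in Python 3 syntax, where bytes plays the role of Python 2's str; decodes_as_utf8 is a hypothetical helper used only for illustration.

```python
text = "äöüáèô"

# The same text as it would be stored in a UTF-8 file and in a Latin-1 file.
utf8_bytes = text.encode("utf-8")
latin1_bytes = text.encode("latin-1")


def decodes_as_utf8(raw):
    """Report whether raw bytes decode as UTF-8 back to the original text."""
    try:
        return raw.decode("utf-8") == text
    except UnicodeDecodeError:
        return False


print(decodes_as_utf8(utf8_bytes))    # True
print(decodes_as_utf8(latin1_bytes))  # False
```

A u"..." literal avoids this mismatch because it is decoded with whatever encoding the file declares.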

2

Yes, this is the recommended way for Python 2.x; see PEP 263. In Python 3.x, the default source encoding is UTF-8 rather than ASCII, so the declaration is not needed there for files saved as UTF-8. See PEP 3120.
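A quick way to check the Python 3 behaviour is to compile UTF-8 source bytes that carry no declaration at all. This is only a sketch for Python 3; run_source and the "<example>" filename are hypothetical names.

```python
# UTF-8 encoded source with no encoding declaration and no BOM.
SOURCE = 'special_char_string = "äöüáèô"\n'.encode("utf-8")


def run_source(source_bytes):
    """Compile and execute source bytes, returning the resulting namespace."""
    namespace = {}
    exec(compile(source_bytes, "<example>", "exec"), namespace)
    return namespace


result = run_source(SOURCE)["special_char_string"]
print(type(result).__name__, len(result))  # str 6
```

The six characters come back as a six-character str, with no u prefix and no .decode() call needed.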
