Python string and UTF-8 problems

Question

I am writing a script that fetches some data from my website using an HTTP GET request.

My problem is that I have to pass Unicode characters to the website.

I am reading a file that contains these characters, and then I try to build a URL in order to make the request.

The file is UTF-8 encoded, and I use this to read from it:

import codecs
f = codecs.open("values.txt", encoding='utf-8')

Then I read the first line of the file and concatenate the value with the URL:

sUrl = "http://example.com?word="
value = f.readline()
visitUrl = sUrl + value

If I use print visitUrl, the output is correct, e.g. http://example.com?word=π

How can I use visitUrl without destroying my special characters? I tried to encode the string to ASCII, but that doesn't work for all characters.

Emil Ivanov · Accepted Answer · 2011-08-05 11:27:32Z

3

Quote the URL:

import urllib
s = u'Здравей'
urllib.quote(s.encode('utf-8'))
# %D0%97%D0%B4%D1%80%D0%B0%D0%B2%D0%B5%D0%B9

Or use urlencode directly to build the query part of the URL:

urllib.urlencode({'data': s.encode('utf-8')})
# 'data=%D0%97%D0%B4%D1%80%D0%B0%D0%B2%D0%B5%D0%B9'
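
For readers on Python 3, here is a rough equivalent of the two calls above. It is a sketch, assuming Python 3, where these functions live in urllib.parse and accept str directly (encoding it as UTF-8 by default), so the explicit .encode('utf-8') step is not needed. The variable names quoted and query are only illustrative.

```python
from urllib.parse import quote, urlencode

s = 'Здравей'

# quote() percent-encodes a str as UTF-8 by default in Python 3
quoted = quote(s)
print(quoted)

# urlencode() builds the whole query string from key/value pairs
query = urlencode({'data': s})
print(query)
```

Both calls should produce the same percent-encoded bytes as the Python 2 version, since the underlying UTF-8 encoding is identical.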


2 Comments

kechap Over a year ago

Should I choose urllib or urllib2?

Emil Ivanov Over a year ago

@messkech: Those functions are in urllib. Don't let the name urllib2 mislead you into thinking it's an alternative library; it's actually an extension of urllib, and both libraries have been merged in Python 3.

Wooble · Accepted Answer · 2011-08-05 11:26:45Z

1

Build the URL with urllib.urlencode rather than trying to construct it by concatenating strings. Non-ASCII characters in a URL need to be URL encoded.
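
As a minimal sketch of that approach applied to the question's URL, assuming Python 3 (where urlencode lives in urllib.parse): build_url is a hypothetical helper, not part of any library, and the "π\n" literal stands in for a line returned by readline(), which keeps its trailing newline.

```python
from urllib.parse import urlencode

def build_url(base, word):
    # Hypothetical helper: strip the newline that readline() leaves on the
    # value, then let urlencode() percent-encode it as UTF-8.
    return base + "?" + urlencode({"word": word.strip()})

# "π\n" stands in for the first line read from values.txt
visit_url = build_url("http://example.com", "π\n")
print(visit_url)  # http://example.com?word=%CF%80
```

Stripping the value first matters here: without it, the newline would be encoded into the query as %0A.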
