I'm trying to write the HTML source of a Google search results page to a file in Python 3.4:

#coding=utf-8
try:
    from urllib.request import Request, urlopen  # Python 3
except ImportError:
    from urllib2 import Request, urlopen  # Python 2

useragent = 'Mozilla/5.0 (Windows NT 6.3; rv:36.0) Gecko/20100101 Firefox/36.0'

#Generate URL
url = 'https://www.google.com.tw/search?q='
query = str(input('Google It! :'))
full_url = url+query


#Request Data
data = Request(full_url)
data.add_header('User-Agent', useragent)
dataRequested = urlopen(data).read()
dataRequested = str(dataRequested.decode('utf-8'))


print(dataRequested)

#Write Data Into File
file = open('Google - '+query+'.html', 'w')
file.write(dataRequested)

It prints the string correctly, but when it writes to the file, it fails with:

file.write(dataRequested)
UnicodeEncodeError: 'cp950' codec can't encode character '\u200e' in position 97658: illegal multibyte sequence

I tried changing how the data is decoded, but that doesn't work. I also tried replacing \u200e, but then other characters raise the same kind of encode error.

1 Answer

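The first thing I noticed is this line:

dataRequested = str(dataRequested.decode('utf-8'))

Is there a reason to wrap the decoded UTF-8 in str()? decode() already returns a text string, so the call is redundant. But that is not all. The write fails because open() in text mode encodes with your locale's default codec (cp950 on your system), and cp950 has no mapping for '\u200e'. When you get bytes from the Internet they should be decoded, and when you save the string it should be encoded. Some people don't get it. They either decode or encode. It doesn't work this way.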
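To make the decode/encode point concrete, here is a minimal sketch that reproduces the failure in memory, without any network access or file. The sample bytes and the variable names (raw, text, cp950_ok, utf8_bytes) are made up for illustration; only the '\u200e' character and the cp950 codec come from your traceback.

```python
# Hypothetical sample: UTF-8 bytes as they might arrive from the network.
# The last three bytes (E2 80 8E) are U+200E, the left-to-right mark.
raw = b'Google\xe2\x80\x8e'

# Incoming bytes are decoded into text.
text = raw.decode('utf-8')

# Encoding that text with cp950 fails, because cp950 cannot represent U+200E.
try:
    text.encode('cp950')
    cp950_ok = True
except UnicodeEncodeError:
    cp950_ok = False

# Encoding with UTF-8 always succeeds and gives back the original bytes.
utf8_bytes = text.encode('utf-8')

print(cp950_ok)
print(utf8_bytes == raw)
```

This is the same thing that happens inside file.write() when the file was opened in text mode with cp950 as the default encoding.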

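I altered your code a bit: the response is decoded once, the file is opened in binary mode, and the string is encoded as UTF-8 before it is written. It works fine for me on both Python 2.7 and Python 3.4.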

dataRequested = dataRequested.decode('utf-8')


print(dataRequested)

#Write Data Into File
with open('Google - '+query+'.html', 'wb') as file:
    file.write(dataRequested.encode('utf-8'))
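An alternative to encoding by hand is to keep the file in text mode and name the encoding explicitly. The following is only a sketch under assumptions: save_html is a hypothetical helper, and the sample string and temporary path are invented so the example runs without a network request. io.open is used because it accepts the encoding argument on both Python 2.7 and Python 3 (on Python 3 it is the same function as the built-in open).

```python
import io
import os
import tempfile


def save_html(path, text):
    """Write text to path as UTF-8, regardless of the locale's default codec.

    Hypothetical helper, not part of the original code.
    """
    with io.open(path, 'w', encoding='utf-8') as f:
        f.write(text)


# Invented sample containing the character from the traceback.
sample = u'<p>left-to-right mark: \u200e</p>'
path = os.path.join(tempfile.mkdtemp(), 'sample.html')
save_html(path, sample)

# Read the file back with the same encoding to check the round trip.
with io.open(path, encoding='utf-8') as f:
    read_back = f.read()

print(read_back == sample)
```

In your script this would replace the last two lines with something like save_html('Google - '+query+'.html', dataRequested). Either approach avoids cp950; the difference is only whether you or the file object does the encoding.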

2 Comments

Come on! I just misspelled a couple of words.
Aha. I understand now, thanks. For the life of me I could not see that this is what you meant! The juxtaposition of "get a string" and "don't get it" was what looked confusing. Sorry. It would read better as "Some people don't get it", though. Thanks for fixing.
