I can't save already-encoded data to a CSV file. I could decode the CSV file afterwards, but I would rather do all the data cleaning beforehand. I managed to save the text on its own, but as soon as I add a timestamp it fails.
What am I doing wrong? I read that if str() and .encode() don't work I should try .join instead, but that didn't help either.
error:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 4: ordinal not in range(128)
code:
def on_data(self, data):
    try:
        #print data
        tweet = data.split(',"text":"')[1].split('","source')[0]
        x = tweet.encode('utf-8')
        y = x.decode('unicode-escape')
        print y
        saveThis = y
        #saveThis = str(time.time())+'::' + tweet.decode('ascii', 'ignore')
        #saveThis = u' '.join((time.time()+'::'+tweet)).encode('utf-8')
        saveFile = open('twitDB.csv', 'a')
        saveFile.write(saveThis)
        saveFile.write('\n')
        saveFile.close()
        return True
    except BaseException, e:
        print 'fail on data,', str(e)
        time.sleep(5)

def on_error(self, status):
    print status
json module instead; e.g. tweet = json.loads(data)['text']. This produces a Unicode value.
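As a rough sketch of how that could fit the handler above: parse the payload with json.loads, build the whole line as Unicode, and let the file object do the UTF-8 encoding on write. The helper names format_tweet_line and save_tweet are made up for illustration, and the '::' separator and the twitDB.csv filename are simply carried over from the question. It also assumes data is a single JSON object with a top-level "text" key.

```python
import io
import json
import time


def format_tweet_line(data, timestamp=None):
    """Return u'<timestamp>::<text>' for one raw JSON tweet string."""
    # json.loads decodes \uXXXX escapes and returns the text as Unicode,
    # so no manual .encode()/.decode('unicode-escape') round trip is needed.
    tweet = json.loads(data)['text']
    if timestamp is None:
        timestamp = time.time()
    # Keep the whole line Unicode; %s converts the number to text.
    return u'%s::%s' % (timestamp, tweet)


def save_tweet(data, path='twitDB.csv'):
    """Append one formatted tweet line to the file at `path`."""
    line = format_tweet_line(data)
    # io.open encodes to UTF-8 when writing, on both Python 2 and 3.
    with io.open(path, 'a', encoding='utf-8') as save_file:
        save_file.write(line + u'\n')
```

Inside on_data this would be called as save_tweet(data). The original error most likely comes from calling .encode() on a byte string in Python 2, which first decodes it implicitly as ASCII; keeping everything Unicode until the final write avoids that step.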