I am failing to save already-encoded data to a CSV file. I could decode the CSV file afterwards, but I would rather do all the data cleaning beforehand. I managed to save the text on its own, but as soon as I add a timestamp it fails.

What am I doing wrong? I read that if str() and .encode() do not work I should try .join instead, but that does not help either.

error:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 4: ordinal not in range(128) 

code:

def on_data(self, data):
    try:
        #print data
        tweet = data.split(',"text":"')[1].split('","source')[0]

        x = tweet.encode('utf-8')
        y = x.decode('unicode-escape')
        print y

        saveThis = y
        #saveThis = str(time.time())+'::' + tweet.decode('ascii', 'ignore')
        #saveThis = u' '.join((time.time()+'::'+tweet)).encode('utf-8')

        saveFile = open('twitDB.csv', 'a')
        saveFile.write(saveThis)
        saveFile.write('\n')
        saveFile.close()
        return True
    except BaseException, e:
        print 'fail on data,', str(e)
        time.sleep(5) 
def on_error(self, status):
    print status
Why are you manually splitting out JSON data? Use the json module instead; e.g. tweet = json.loads(data)['text']. This produces a Unicode value. Commented Jul 15, 2014 at 13:45

1 Answer

First of all, make sure you handle your JSON data properly, using the json module.
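As a rough illustration of the difference, here is a minimal sketch that compares the manual split from the question with json.loads. The `sample` payload is made up for this example and is much shorter than real stream data; the code is written to run on both Python 2.7 and Python 3.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function
import json

# Hypothetical payload for illustration only; real stream data has many more keys.
sample = '{"id":1,"text":"caf\\u00e9 \\"quoted\\"","source":"web"}'

# Manual splitting keeps the JSON escapes (\u00e9 and \") as literal characters.
raw = sample.split(',"text":"')[1].split('","source')[0]

# The json module resolves the escapes and returns a Unicode string.
tweet = json.loads(sample)['text']

print(repr(raw))
print(repr(tweet))
```

With the parsed value there is no need for the encode('utf-8') / decode('unicode-escape') round trip, and the result does not depend on which key happens to follow "text" in the payload.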

Next, don't catch BaseException; you have no reason to catch memory errors or keyboard interrupts here. Catch more specific exceptions instead.

Next, encode your data before writing:

import json

def on_data(self, data):
    try:
        tweet = json.loads(data)['text']
    except (ValueError, KeyError):
        # Not JSON or no text key
        print 'fail on data {}'.format(data)
        return

    # Encode the Unicode text to UTF-8 bytes before writing (Python 2)
    with open('twitDB.csv', 'a') as save_file:
        save_file.write(tweet.encode('utf8') + '\n')
    return True
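The question also asks for a timestamp on each row. One possible sketch follows, assuming the `::` separator from the commented-out line in the question. The helper names `format_row` and `append_row` are hypothetical, and io.open is used here (instead of a manual .encode() call) so that the same code runs on Python 2.7 and Python 3. The idea is to build the whole row as Unicode and encode only when writing to the file.

```python
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import io
import json
import time


def format_row(data, now=None):
    """Return 'timestamp::text' as Unicode, or None if data has no usable text.

    Hypothetical helper; the '::' separator is taken from the question.
    """
    try:
        tweet = json.loads(data)['text']
    except (ValueError, KeyError, TypeError):
        # Not JSON, no 'text' key, or not a JSON object at all
        return None
    if now is None:
        now = time.time()
    # Build the row entirely as Unicode; never mix in byte strings.
    return '{}::{}'.format(now, tweet)


def append_row(path, row):
    """Append one Unicode row to path, encoded as UTF-8 (hypothetical helper)."""
    # io.open encodes on write, so the row stays Unicode until it reaches the file.
    with io.open(path, 'a', encoding='utf-8') as save_file:
        save_file.write(row + '\n')
```

Inside on_data this could be used as `row = format_row(data)`, followed by `append_row('twitDB.csv', row)` when `row` is not None. Note that tweet text may itself contain newlines or the separator, so the csv module may be a better fit if the file has to be parsed back reliably.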