I'm building a Twitter grabber application using Python with the Tweepy and MySQLdb modules.
It will fetch millions of tweets, so performance is an issue. I want to check whether a tweet_id already exists in the table before adding it, and do both in the same query.
The table schema is:
*id* | tweet_id           | text
_____|____________________|_________________________
1    | 259327533444925056 | sample tweet1
2    | 259327566714923333 | this is a sample tweet2
The code I tried is below, but it runs two queries per tweet:
# Check that the tweet doesn't exist first
q = "SELECT COUNT(*) FROM tweets WHERE tweet_id = %s"
cur.execute(q, (tweet.id,))
result = cur.fetchone()
found = result[0]
if found == 0:
    # Insert the new tweet, letting the driver escape the values
    q = "INSERT INTO lexicon_nwindow (tweet_id, text) VALUES (%s, %s)"
    cur.execute(q, (tweet.id, tweet.text))
Would making tweet_id unique and simply inserting every tweet raise an exception on duplicates, and would that be inefficient as well?
So what is the best-performing way to achieve this with a single query?
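For reference, the unique-key idea mentioned above could be sketched roughly as follows. This is only a minimal sketch under assumptions: it assumes the table is named `tweets`, that a unique index has been added on `tweet_id`, and that MySQL's `INSERT IGNORE` is acceptable for skipping duplicates. The helper name `insert_tweet` and the index name `uq_tweet_id` are hypothetical.

```python
# Assumed one-time schema change (hypothetical index name):
#   ALTER TABLE tweets ADD UNIQUE KEY uq_tweet_id (tweet_id);

# With a unique key on tweet_id, INSERT IGNORE skips rows whose
# tweet_id already exists instead of raising a duplicate-key error.
INSERT_SQL = "INSERT IGNORE INTO tweets (tweet_id, text) VALUES (%s, %s)"


def insert_tweet(cur, tweet_id, text):
    """Try to insert one tweet with a single query.

    `cur` is any DB-API cursor (for example one from MySQLdb).
    Returns True if a row was inserted, False if it was skipped
    because the tweet_id was already present.
    """
    cur.execute(INSERT_SQL, (tweet_id, text))
    # rowcount is 1 for a new row and 0 for an ignored duplicate.
    return cur.rowcount == 1
```

It would be called as `insert_tweet(cur, tweet.id, tweet.text)` inside the fetching loop, followed by a commit on the connection at whatever interval suits the workload. Depending on the MySQLdb version, an ignored duplicate may surface as a Python warning rather than an exception.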