I am working on a project where a user can input different criteria which will be used to fetch tweets; let's call this action a `TweetAnalysis`. These tweets will then be sent to another internal system (a REST API) to do some calculation and return results. Each tweet will have a unique result from the REST API. For each `TweetAnalysis` created by users, there could be millions of tweets, and each tweet has its own result returned from the API. (Only 2 values from each result are aggregatable; every other value of the result is unique between tweets.)
How would I design such a system?
What I was thinking is:

- User creates a `TweetAnalysis` (let's call it `TA`) and it is stored in the DB.
- A separate process picks up a `TA` and retrieves all the respective tweets for it. These tweets can be dumped into `S3` objects? While doing so, the `S3` objects will be unique for each `TA` and can be broken down into chunks of 1000 tweets?
- A separate process can pick up those `S3` objects, gather their respective info from the `REST API` system, and persist the values in the DB?
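To make the question concrete, here is a minimal sketch of the chunking and worker steps I have in mind. The object store and the internal API are replaced with hypothetical in-memory stand-ins (`object_store`, `score_tweet`) purely for illustration; the key-naming scheme (`ta/<id>/chunk-NNNN.json`) is also just an assumption:

```python
import json

CHUNK_SIZE = 1000  # tweets per S3 object, as proposed above

def chunk_keys(ta_id, tweets, chunk_size=CHUNK_SIZE):
    """Split a TA's tweets into chunks, yielding (object_key, chunk) pairs.

    Keys are namespaced by TA id so each TA's objects are unique,
    e.g. 'ta/42/chunk-0003.json'.
    """
    for i in range(0, len(tweets), chunk_size):
        key = f"ta/{ta_id}/chunk-{i // chunk_size:04d}.json"
        yield key, tweets[i:i + chunk_size]

# Hypothetical in-memory stand-in for S3.
object_store = {}

def dump_tweets(ta_id, tweets):
    """Producer step: persist chunked tweets to the object store."""
    keys = []
    for key, chunk in chunk_keys(ta_id, tweets):
        object_store[key] = json.dumps(chunk)
        keys.append(key)
    return keys

def score_tweet(tweet):
    """Hypothetical stand-in for the internal REST API call."""
    return {"tweet_id": tweet["id"], "score": len(tweet["text"])}

def process_chunk(key):
    """Consumer step: load one chunk, call the API per tweet,
    and return the per-tweet result rows to persist in the DB."""
    tweets = json.loads(object_store[key])
    return [score_tweet(t) for t in tweets]
```

In this shape, `dump_tweets` and `process_chunk` could each run as independent workers (e.g. triggered from a queue), so a `TA` with millions of tweets becomes thousands of small, retryable chunk jobs.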