Tech tools are cool, but without a concrete use case it's hard to put them into perspective and understand how different tools interact with each other, help solve a problem, or open up new use cases.
So, to educate and motivate our technical buyers, sellers, and customers, I put together a demo use case: ingesting live tweets from Twitter and applying sentiment analysis to them. For this demo I used the following tools:
**Twitter API**: Real-time streaming data source (see the ingest sketch just after this list)
**Red Hat AMQ Streams**: Apache Kafka cluster that buffers the real-time streaming data coming into the system
**MongoDB**: Schema-less NoSQL database used for long-term persistence of tweets consumed from Kafka
**Red Hat OpenShift Container Storage**: Provides RWO (used in this project), RWX, and object storage persistence for the Kafka and MongoDB apps running on OpenShift
**Red Hat OpenShift Container Platform**: Enterprise-grade Kubernetes distribution for container apps
**Aylien**: Sentiment analysis backend
**Python**: Backend API app that triggers data sourcing from Twitter, moves data from Kafka to MongoDB, and serves data to the frontend app
**Frontend**: Basic HTML, CSS, and JavaScript frontend to plot some graphs
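To make the ingest path concrete, here is a minimal sketch of how tweets can flow from the Twitter API into Kafka. This is not the exact code from the repo; it assumes tweepy 3.x and kafka-python, and the broker address, topic name, keywords, and credentials are all placeholders.

```python
# Minimal ingest sketch: Twitter streaming API -> Kafka topic.
# Assumes tweepy 3.x and kafka-python; all names and credentials are placeholders.
import json

import tweepy
from kafka import KafkaProducer

KAFKA_BOOTSTRAP = "my-cluster-kafka-bootstrap:9092"  # assumed AMQ Streams bootstrap service
TOPIC = "tweets"                                     # hypothetical topic name

producer = KafkaProducer(
    bootstrap_servers=KAFKA_BOOTSTRAP,
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

class TweetListener(tweepy.StreamListener):
    def on_status(self, status):
        # Publish each incoming tweet onto the Kafka topic
        producer.send(TOPIC, {"id": status.id_str, "text": status.text})

    def on_error(self, status_code):
        # Disconnect if Twitter signals rate limiting
        if status_code == 420:
            return False

auth = tweepy.OAuthHandler("CONSUMER_KEY", "CONSUMER_SECRET")
auth.set_access_token("ACCESS_TOKEN", "ACCESS_SECRET")

stream = tweepy.Stream(auth, TweetListener())
stream.filter(track=["openshift"])  # example keyword filter; blocks and streams indefinitely
```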
This slide deck should give you a glimpse of what the demo looks like (YouTube and GitHub links below).
And here is the actual demo recording, where I explain how these components work together and make this a viable solution if you have a real-world use case along the same lines.
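To complement the recording, here is the other half of the pipeline sketched in code: a consumer that reads tweets off Kafka, scores them with Aylien, and persists them to MongoDB. Again, this is a minimal sketch, assuming kafka-python, pymongo, and the aylien-apiclient SDK; the topic, connection strings, and database/collection names are placeholders, not the values from the repo.

```python
# Minimal consumer sketch: Kafka topic -> Aylien sentiment -> MongoDB.
# Assumes kafka-python, pymongo, and aylien-apiclient; all names are placeholders.
import json

from aylienapiclient import textapi
from kafka import KafkaConsumer
from pymongo import MongoClient

consumer = KafkaConsumer(
    "tweets",                                          # hypothetical topic name
    bootstrap_servers="my-cluster-kafka-bootstrap:9092",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)
collection = MongoClient("mongodb://mongodb:27017/")["tweetdb"]["tweets"]  # assumed names
aylien = textapi.Client("APP_ID", "APP_KEY")           # Aylien Text API credentials

for message in consumer:
    tweet = message.value
    # Score the tweet text, then persist tweet + sentiment together
    sentiment = aylien.Sentiment({"text": tweet["text"]})
    tweet["polarity"] = sentiment["polarity"]          # positive / negative / neutral
    collection.insert_one(tweet)
```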
If you are interested in running this demo yourself, you can find the code in my GitHub repo: https://github.com/ksingh7/twitter_streaming_app_on_openshift_OCS
Happy Analysing Live Tweets!
Latest comments (6)
Awesome project @ksingh7! A thoughtful combination of tools 💪
@andypiper I recently published a Twitter Search API on RapidAPI; it uses v2 of the Twitter API as well (with all the streaming filters). No Twitter developer account needed either.
rapidapi.com/microworlds/api/twitt...
Hope that helps 😊
Interesting, how does that correspond with the Twitter Developer Policy if individuals don’t need developer accounts? Also how does it handle rate and volume limits?
It's totally a hobbyist project (which makes it unreliable), so yes, for anything serious, using the official version is highly recommended. Regarding the dev policy, I'd say it doesn't fare so well (another reason the solution might not be so great). Each search returns at most ~3,200 results, as it scrapes public data. Users are responsible for their use of the scraped data.
This is cool, thanks for sharing! Have you looked into updating this to use the Twitter API v2 - there are much more powerful streaming filters available there.
Great to hear that, all the best
I’d definitely be interested if you do build a v2 version - I’m not sure whether there are Kafka examples for that yet.