When I set out to create the Twitter tweep blocker bot, I was open to using any stack. I learn by doing, so even if Perl had been the most efficient solution, I’d have learnt Perl for the purpose of the project.
After hours of googling and tweets from folks like @theshalvah, I understood that for the bot to block on behalf of a user, the user has to authorize the bot, and that authorization is a 3-legged process.
Some parts of the Twitter API docs were probably written for geniuses and I’m not one, so I needed an example implementation of the 3-legged authentication. That was the hardest part of the bot I was building (or so I thought).
I searched Google and GitHub, including the twitter-dev GitHub repositories, until I eventually found a Python implementation on GitHub. I quickly cloned the repo for testing (I used keys generated from my personal account). After some trial and error (you know Python), it worked. Problem solved, right? No! I was faced with challenge 2.
I’d used Python’s Tweepy before for tweeting (text and media), but I wasn’t sure it covers blocking. So, I googled for a Python package for blocking and found the python-twitter package. That meant I’d be using 2 Twitter packages at once. Once I had the package, I was able to easily act on behalf of the authenticated user.
I’d decided to deploy to Heroku. Heroku uses gunicorn to start a Flask app, and I needed the streaming to start when the Flask app starts too. The Procfile only allows one command per process type as far as I know. After some head “cracking”, I decided to start the streaming as a thread when the Flask app starts.
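A minimal sketch of the thread idea, using only the standard library. The names `start_streaming_in_background` and `stream_fn` are made up for illustration; `stream_fn` stands for whatever function opens the Twitter stream in the real bot.

```python
import threading

# Hypothetical module-level handle so a second call does not start a second thread.
_stream_thread = None


def start_streaming_in_background(stream_fn):
    """Run stream_fn in a daemon thread so it does not block the web server."""
    global _stream_thread
    if _stream_thread is not None and _stream_thread.is_alive():
        # A stream thread is already running in this process; reuse it.
        return _stream_thread
    _stream_thread = threading.Thread(
        target=stream_fn, name="twitter-stream", daemon=True
    )
    _stream_thread.start()
    return _stream_thread
```

Calling this at import time in the module gunicorn loads (right after the Flask app object is created) would start the stream alongside the web process. One caveat: the guard only works within a single process, and by default every gunicorn worker imports the app separately, so more than one worker would mean more than one stream thread.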
So, I created a new Twitter account for the bot and applied for a developer account. While waiting for Twitter to approve it, I decided to update the bot’s profile. I saw the date of birth field and decided to set the bot’s date of birth to December 2019 so users would know when it started.
Guess what? Twitter suspended the account immediately on the grounds that they only allow users above the age of 13 (and I was claiming to be a few days old!).
I’ve had a suspension issue with Twitter before, and I know it’s almost never resolved. So after waiting a few hours with no response, I decided to use a bot account I had created some months back for an experiment.
So, all was set. The bot was ready and deployed to Heroku. I shared the new development on Twitter and it was well received. Suddenly, the bot began to reply to itself, and @theshalvah called my attention to the fact that since the bot was not listening for any keyword, it would keep getting triggered anytime it was quoted or mentioned.
@dara_tobi suggested a few fixes too. After trying different combinations, I realised it’s best for the bot to listen for a keyword (as do most Twitter bots I’ve seen). I deployed this change and it fixed the issue.
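The keyword check can be as small as the sketch below. The keyword `block` and the handle `blockerbot` are placeholders I picked for the example, not the bot’s real values, and ignoring the bot’s own tweets is an extra guard I’m assuming here.

```python
TRIGGER_KEYWORD = "block"       # hypothetical keyword the bot listens for
BOT_SCREEN_NAME = "blockerbot"  # hypothetical handle of the bot account


def should_respond(text, author_screen_name):
    """Return True only for tweets from other users that contain the keyword."""
    # Never react to the bot's own tweets, otherwise it replies to itself.
    if author_screen_name.lower() == BOT_SCREEN_NAME.lower():
        return False
    # Compare whole words, ignoring case and trailing punctuation.
    words = [word.strip(".,!?") for word in text.lower().split()]
    return TRIGGER_KEYWORD in words
```

With this in place, a plain mention or quote without the keyword is ignored, and so is anything the bot itself posts.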
Even though people were trying to use the bot, they couldn’t authenticate because the app was breaking. @bigbrutha_ called my attention to the auth issue. Turns out I forgot to set the Heroku URL as a callback. So, I set the Heroku URL as the callback on the Twitter app as well as in my environment variable.
However, the problem persisted. Then I saw that I had used http in my environment variable, but https in the callback URL set on the Twitter app. I decided to register both the http and https versions on the Twitter app.
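A mismatch like this is easy to miss by eye, so a small startup check could catch it. This is only a sketch: `callback_matches` is a made-up helper, the URLs in the usage note are placeholders, and it assumes the callback has to match a registered URL on scheme, host and path.

```python
from urllib.parse import urlsplit


def callback_matches(env_callback, registered_callbacks):
    """Return True if env_callback matches a registered URL on scheme, host and path."""

    def key(url):
        parts = urlsplit(url)
        # Scheme and host are case-insensitive; ignore a trailing slash on the path.
        return (parts.scheme.lower(), parts.netloc.lower(), parts.path.rstrip("/"))

    return key(env_callback) in {key(url) for url in registered_callbacks}
```

For example, `callback_matches("http://example.herokuapp.com/callback", ["https://example.herokuapp.com/callback"])` returns `False` because only the scheme differs, which is exactly the situation described above.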
I used SQLite for saving data on the app since it’s just a file. But I noticed that each time I deployed the app, it asked me to authenticate again, meaning the previous authentication was lost. Then @dara_tobi called my attention to a limitation of using SQLite on Heroku, i.e. the DB gets flushed regularly. @theshalvah suggested I use PostgreSQL as there’s a free tier on Heroku. In my own search, I found that MySQL is supported and there’s a free tier too. So, MySQL it is! I had to modify the queries a little since there are slight differences between how you write SQLite queries and MySQL queries.
So, MySQL was up and running. There shouldn’t be any data loss, right? I wish! I noticed the app kept breaking on step 2 with a MySQL error saying MySQL was unavailable. After googling, I realised it was because the connection had been lost. I had created a single connection and used that connection throughout the app. Turns out, the connection was gone after it was used the first time. This issue made me realise why I love frameworks (Laravel comes to mind).
After some trial and error, I realised the working solution was to connect to the DB each time I needed to interact with it. So, I created a DB connection function and called it before every SQL query. There might be a more efficient fix, but my current fix works, so…
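The connect-per-query pattern looks roughly like this. To keep the sketch runnable without a MySQL server it uses the standard library’s `sqlite3` as a stand-in; in the real app `get_connection` would call the MySQL driver’s connect function instead, and the placeholders would typically be `%s` rather than `?`. The names `DB_PATH`, `get_connection` and `run_query` are made up for the example.

```python
import sqlite3
from contextlib import closing

DB_PATH = "bot_example.db"  # hypothetical path, stand-in for the MySQL settings


def get_connection(db_path=DB_PATH):
    """Open a brand-new connection; nothing is shared between calls."""
    return sqlite3.connect(db_path)


def run_query(sql, params=(), db_path=DB_PATH):
    """Connect, run one statement, commit, and always close the connection."""
    with closing(get_connection(db_path)) as conn:
        with conn:  # commits on success, rolls back if the statement raises
            return conn.execute(sql, params).fetchall()
```

Opening a connection per query costs a little on every call, but a connection that was dropped earlier can no longer break a later request. A connection pool would likely be the more efficient version of the same idea.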
The bot was working fine, but at some point Twitter started returning an error that the tweet I was trying to post was a duplicate. Then I remembered a time @theshalvah and someone else were discussing using random texts when a bot tweets. So, I googled “hello in various languages” to randomize the texts. I ended up with 10 randomized texts to pick from each time the bot needs to post.
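Something along these lines is enough to vary the text. The greetings below are my own illustrative list, not necessarily the 10 the bot uses, and `build_reply` is a hypothetical helper.

```python
import random

# Illustrative greetings; the bot's actual list may differ.
GREETINGS = [
    "Hello", "Hola", "Bonjour", "Ciao", "Hallo",
    "Olá", "Namaste", "Salaam", "Jambo", "Konnichiwa",
]


def build_reply(screen_name, message):
    """Prefix the message with a random greeting so repeated replies differ."""
    greeting = random.choice(GREETINGS)
    return f"@{screen_name} {greeting}! {message}"
```

With 10 options, two consecutive replies can still come out identical now and then, so appending something unique (a timestamp, for instance) would be a stricter guard if duplicates keep showing up.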
I wanted the bot to reply to the tweet that mentioned it, without mentioning the user. Turns out it’s not possible (with Tweepy at least). When I tried it, the bot started posting tweets that were not replies to any tweet, so the user that mentioned the bot couldn’t see the reply (because it was a standalone tweet, not a reply).
This issue frustrated me a lot. Everything else on the app worked fine, but whenever the bot replied to a tweet, it posted 2 replies instead of one. Checking the logs, I also saw that the server started streaming twice.
The first thing I tried was to wrap the streaming call in a try block and exit the script when the exception indicated that streaming had already started. This didn’t solve the problem.
I noticed that the server creates 2 processes whenever the Flask app launches, and googling informed me that the second process is for auto-reload. So I added a check to not start the streaming if it’s a reload. Didn’t work.
I added a sleep to the tweet function, hoping it would prevent the duplicate tweet. Didn’t work.
Tried finding all the streaming processes and killing them before creating a new one. Didn’t work.
Tried calling the script in the Procfile. Didn’t work (it errored).
Tried saving the process ID (PID) of the streaming process in a file and checking whether that file exists before launching streaming.
What worked, for now, is creating a lock file when streaming starts and refusing to start streaming if that file already exists.
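Here is a rough sketch of a lock file check. The path `stream.lock` and the function name are placeholders. It uses `os.O_CREAT | os.O_EXCL` so that creating the file and checking for it happen in one step, which avoids two processes both seeing "no file" at the same moment; whether the bot does it exactly this way is an assumption on my part.

```python
import os

LOCK_PATH = "stream.lock"  # hypothetical lock file location


def acquire_stream_lock(path=LOCK_PATH):
    """Create the lock file; return False if it already exists."""
    try:
        # O_EXCL makes the call fail if the file exists, so check and
        # create happen as a single atomic operation.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as lock_file:
        lock_file.write(str(os.getpid()))  # record who holds the lock
    return True
```

The streaming code would then start only when `acquire_stream_lock()` returns `True`. The trade-off is that a lock file left behind by a crashed process keeps blocking new streams until the file is removed.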
So, I noticed inactivity with the bot and realised that the dyno (server) sleeps after some minutes of inactivity. So, if the bot is mentioned while the dyno is asleep, it doesn’t respond. The quick fix for this is to ping the website at intervals to keep the server alive. Heroku says free dynos sleep after 30 minutes of inactivity, so I set up an external cron job to ping the server every 15 minutes.
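The ping itself is just an HTTP GET. A sketch of what a scheduled job could run is below; `ping` is a made-up helper, and the URL would be the bot’s Heroku address (not shown here).

```python
import urllib.error
import urllib.request


def ping(url, timeout=10):
    """Send a GET request; return the HTTP status code, or None on failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except (urllib.error.URLError, OSError):
        # Covers refused connections, timeouts and HTTP error responses.
        return None
```

Scheduling stays outside the script: the cron job calls it every 15 minutes, which is comfortably inside the 30-minute window before the dyno goes to sleep.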