
Using the Twitter API to make your commute easier

Aurelia Specker ・11 min read

The code for this tutorial is available on GitHub.

I use London’s TfL Metropolitan line on a daily basis to commute to and from work. Being aware of disruptions as they occur helps me save time on a daily basis. Luckily, the official Twitter account of the Metropolitan line (@metline) sends accurate information about the status of the line.

I had originally enabled notifications for the @metline account, so that each time @metline sent a new Tweet, I would receive a push notification on my phone. But I ended up receiving too many notifications and many weren’t relevant to me.

I needed a way to filter Tweets of interest and only receive notifications for information that is relevant to my personal commuting pattern.

Enter... 🥁🥁🥁 the Twitter API!

We’ll be using two different endpoints:

  • The Premium 30-Day Search API (you can use the Sandbox version for free) to find the Tweets that we’re interested in.

  • The POST statuses/update endpoint to create a new Tweet that will notify you of potential disruptions. This endpoint is available for free.

Set up

In order to reproduce this application, you will need the following:

  • A Twitter account that Tweets information of interest.
    In this example, we will use @metline for Tube line information.

  • Twitter account(s) of the notification recipient(s), in other words the commuter(s). For this, I used my primary Twitter account (@AureliaSpecker).

  • A Twitter account for the notification sender. For this, I used a secondary Twitter account (@maddie_testing).

  • A Twitter Developer account: if you don’t have one already, you can apply for one. The developer account should be linked to the Twitter account of the notification sender (in this example: @maddie_testing).

  • A Twitter Developer app.

  • Access keys and tokens for the app you created in the step above.

  • A dev environment for "Search Tweets: 30-Day Sandbox", set up in your developer dashboard and labelled something like dev or prod.

  • pip installed on your machine.

  • Python version 3.6 or above (in your command line, you can check which version of Python is installed by running $ python --version).

From your command line, create a new directory for your project.

$ mkdir commute-alerts

Navigate into it and create the following two files:

$ cd commute-alerts
$ touch alerts.py
$ touch credentials.yaml

At this point, your project tree should look something like this (you can install tree with $ sudo apt install tree on Linux, or $ brew install tree on macOS, and then run $ tree to get the following output):

.
├── alerts.py
└── credentials.yaml

If you plan on pushing your project to GitHub, you should also add a .gitignore file in your project directory:

$ touch .gitignore

It is very important to guard your access keys and tokens carefully, as they represent your unique access to the Twitter API. Keeping them in a separate file (credentials.yaml) that is excluded from source control ensures that no one can steal them from your repository. Make sure to add the following line to your .gitignore file:

credentials.yaml

Check that you have the necessary dependencies installed, by running the following in your command line:

$ pip install searchtweets
$ pip install requests_oauthlib
$ pip install pandas

Open alerts.py in your preferred code editor (I use Visual Studio Code) and add the following lines of code to import the required modules:

from searchtweets import ResultStream, gen_rule_payload, load_credentials, collect_results
from requests_oauthlib import OAuth1Session
import yaml
import json
import datetime as dt
import pandas as pd

Handling credentials

Now open credentials.yaml and add your access keys and tokens in the format below. Replace {ENV} with the dev environment name you previously created and insert your access keys and tokens for each field accordingly.

search_tweets_api:
  account_type: premium
  endpoint: https://api.twitter.com/1.1/tweets/search/30day/{ENV}.json
  consumer_key: XXXXXXXXXX
  consumer_secret: XXXXXXXXXX
  access_token: XXXXXXXXXX
  access_token_secret: XXXXXXXXXX

Back in alerts.py, use load_credentials from the Search Tweet Python wrapper. This will automatically handle credentials for you:

creds = load_credentials(filename="./credentials.yaml",
                        yaml_key="search_tweets_api",
                        env_overwrite=False)

For the POST statuses/update endpoint, we will manually load the credentials and store them into a new set of variables.

with open('./credentials.yaml') as file:
    data = yaml.safe_load(file)

consumer_key = data["search_tweets_api"]["consumer_key"]
consumer_secret = data["search_tweets_api"]["consumer_secret"]
access_token = data["search_tweets_api"]["access_token"]
access_token_secret = data["search_tweets_api"]["access_token_secret"]

We will then use OAuth1Session to manage credentials:

oauth = OAuth1Session(
    consumer_key,
    client_secret=consumer_secret,
    resource_owner_key=access_token,
    resource_owner_secret=access_token_secret,
)

Using the Premium 30-Day Search API

This endpoint lets you access the last 30 days of Twitter data. You can find out more about this endpoint in the Twitter Developer documentation.

This endpoint takes four different parameters:

  • query
  • fromDate
  • toDate
  • maxResults (in other words, the number of results returned per call).

We will use Python’s datetime module to generate a fromDate and a toDate, establishing the timeframe in which we want to search for Tweets from @metline. In our case, we want all Tweets created in the past two hours. Note that, with the Twitter API, times are in UTC. Add the following lines of code to alerts.py:

# toDate: the current UTC time, minus a one-minute buffer so the timestamp is safely in the past
utc = dt.datetime.utcnow() + dt.timedelta(minutes=-1)
utc_time = utc.strftime("%Y%m%d%H%M")
print("toDate:", utc_time)

# fromDate: two hours before toDate
two_hours = dt.datetime.utcnow() + dt.timedelta(hours=-2, minutes=-1)
two_hours_prior = two_hours.strftime("%Y%m%d%H%M")
print("fromDate:", two_hours_prior)

This will return something like this when you run $ python alerts.py in the command line (which is the format required for the toDate and fromDate parameters):

toDate: 201910071658
fromDate: 201910071458
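The same window computation can be wrapped in a small helper function (my own refactoring, not part of the original script), which makes the buffer and the window length explicit:

```python
import datetime as dt

def search_window(hours=2, buffer_minutes=1):
    """Return (fromDate, toDate) strings in the YYYYMMDDhhmm format
    that the premium search endpoint expects, covering the last
    `hours` hours, with a small buffer so toDate is safely in the past."""
    fmt = "%Y%m%d%H%M"
    to_dt = dt.datetime.utcnow() - dt.timedelta(minutes=buffer_minutes)
    from_dt = to_dt - dt.timedelta(hours=hours)
    return from_dt.strftime(fmt), to_dt.strftime(fmt)

from_date, to_date = search_window()
print("fromDate:", from_date)
print("toDate:", to_date)
```

Because both strings share the same fixed-width format, fromDate always sorts before toDate, which makes sanity checks easy.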

The Search Tweets Python wrapper lets us use the 30-Day Search endpoint without much setup on our side. In this case, we want all Tweets from @metline that do not mention any other Twitter accounts. Our query is therefore "from:metline -has:mentions". You can read more about building queries in the Twitter Dev documentation. We also want to return no more than 100 Tweets per call (the maximum allowed with the free Sandbox access).

rule = gen_rule_payload("from:metline -has:mentions",
                        from_date=str(two_hours_prior),
                        to_date=str(utc_time),
                        results_per_call=100)
print("rule:", rule)

Here, the print function will return something like this:

rule: {"query": "from:metline -has:mentions", "maxResults": 100, "toDate": "201910071658", "fromDate": "201910071458"}

We store the Tweets returned by this query in a new variable, tweets, and then print the date and text of the first 10 Tweets returned by our query:

tweets = collect_results(rule,
                         max_results=100,
                         result_stream_args=creds)

for tweet in tweets[0:10]:
    print(tweet.created_at_datetime, tweet.all_text, end='\n\n')

This will return something like the below. Note that if no Tweets were created in the previous two hours, nothing will be returned.

2019-10-07 05:06:17 Minor delays between Uxbridge and Harrow-on-the-Hill, southbound only, due to a temporary shortage of train operators. Good service on the rest of the line.

2019-10-07 04:36:30 Minor delays between Harrow-on-the-Hill and Watford / Rayners Lane northbound only due to the temporary unavailability of train operators. Good service on the rest of the line.

2019-10-06 22:27:02 A good service is now operating to all destinations.

2019-10-06 21:49:52 Minor delays between Chesham/Amersham and Rickmansworth, southbound only, due to a signal failure at Chalfont & Latimer.

Setting up trigger words and notifications

We need to establish what words will trigger a notification for the commuter(s). We will use Python sets to do this.

In this example, there are two commuters: David and Aurelia. Certain words will trigger a notification for both commuters, whereas other words will only trigger a notification for one of the commuters.

all_trigger = {'closure', 'wembley', 'delays', 'disruption', 'cancelled', 'sorry', 'stadium'}

david_trigger = {'hillingdon', 'harrow'}

aurelia_trigger = {'baker'}

Note that if you want trigger phrases made up of two or more words, you will have to add an extra level of text processing with regular expressions (this is not covered in this tutorial).
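As a minimal sketch of what that extra processing could look like (the phrases below are hypothetical and this helper is not part of the original tutorial), multi-word triggers can be matched as substrings after normalising case and whitespace with Python's re module:

```python
import re

# Hypothetical multi-word trigger phrases (not from the tutorial above)
phrase_triggers = {"severe delays", "part closure"}

def matches_phrase(text, phrases):
    """Return True if any trigger phrase appears in the text,
    ignoring case and collapsing runs of whitespace."""
    normalised = re.sub(r"\s+", " ", text.lower())
    return any(phrase in normalised for phrase in phrases)

print(matches_phrase("SEVERE DELAYS between Aldgate and Baker Street",
                     phrase_triggers))  # True
```

Substring matching is cruder than word-boundary matching, but for short status Tweets it is usually good enough.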

Next, we create two new lists in which to store the Tweet dates and Tweet text returned by the call above:

tweet_text = []
tweet_date = []

We will also want all of the words included in the Tweets from the previous two hours to be concatenated in one string:

combined_tweet_text = ''

We then use a for loop to populate the lists we created with the Tweet text and Tweet dates, and to concatenate the text from the different Tweets into one string. Note the space appended after each Tweet, so that the last word of one Tweet does not fuse with the first word of the next:

for tweet in tweets:
    tweet_text.append(tweet.all_text)
    tweet_date.append(tweet.created_at_datetime)
    combined_tweet_text += tweet.all_text + ' '

The following line creates a new set containing all of the words in lowercase from Tweets created in the past two hours:

tweet_words = set(combined_tweet_text.lower().split())
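One caveat with a plain split(): punctuation stays attached to words, so a Tweet containing "delays," would not match the trigger word delays. A small sketch (my own addition, not part of the original script) that strips surrounding punctuation from each token first:

```python
import string

def tokenise(text):
    """Lowercase the text and strip punctuation from each token,
    so that 'Delays,' matches the trigger word 'delays'."""
    words = text.lower().split()
    return {word.strip(string.punctuation) for word in words}

print(sorted(tokenise("Minor delays, sorry!")))  # ['delays', 'minor', 'sorry']
```

Swapping this in for the split() call above makes the trigger matching noticeably more reliable on real Tweets.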

We then use an if statement, together with Python’s len() function and the set intersection() method, to decide whether a notification needs to be sent:

if len(tweet_words.intersection(all_trigger)) != 0:
    message = "@AureliaSpecker & @_dormrod 👋 check https://twitter.com/metline for possible delays, [{}]".format(utc_time)
elif len(tweet_words.intersection(david_trigger)) != 0:
    message = "@_dormrod 👋 Check https://twitter.com/metline for possible delays, [{}]".format(utc_time)
elif len(tweet_words.intersection(aurelia_trigger)) != 0:
    message = "@AureliaSpecker 👋 Check https://twitter.com/metline for possible delays, [{}]".format(utc_time)
else:
    message = "There are no delays"

You can print the message output, to check that the correct information is returned when you run $ python alerts.py:

print("Message:", message)
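The decision logic can also be isolated in a function, so it can be unit-tested without calling the API. This refactoring is my own and is not part of the original script; it mirrors the if/elif chain above:

```python
def build_message(tweet_words, timestamp,
                  all_trigger, david_trigger, aurelia_trigger):
    """Pick a notification message based on which trigger set
    intersects the words seen in recent Tweets."""
    if tweet_words & all_trigger:
        return ("@AureliaSpecker & @_dormrod 👋 check "
                "https://twitter.com/metline for possible delays, "
                "[{}]".format(timestamp))
    if tweet_words & david_trigger:
        return ("@_dormrod 👋 Check https://twitter.com/metline "
                "for possible delays, [{}]".format(timestamp))
    if tweet_words & aurelia_trigger:
        return ("@AureliaSpecker 👋 Check https://twitter.com/metline "
                "for possible delays, [{}]".format(timestamp))
    return "There are no delays"
```

With this in place, message = build_message(tweet_words, utc_time, all_trigger, david_trigger, aurelia_trigger) replaces the chain above, and each branch can be exercised with hand-built word sets.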

Using the POST statuses/update endpoint to post on Twitter

The final step involves sending the notification, in the form of a new Tweet, when one of the trigger words appears in the Tweet text.

We use Twitter’s POST statuses/update endpoint and set the parameter "status" (i.e. the Tweet body) to our variable "message" defined above:

params = {"status": message}

oauth.post(
    "https://api.twitter.com/1.1/statuses/update.json", params=params
)
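Before posting, it can be worth checking that the message fits in a single Tweet, since Tweets are limited to 280 characters. The helper below is a hedged sketch of my own (a plain len() check, which only approximates Twitter's character-counting rules):

```python
TWEET_LIMIT = 280  # maximum Tweet length

def is_postable(status):
    """Return True if the status is non-empty and fits in one Tweet.
    Note: Twitter counts URLs as a fixed length and some Unicode
    characters as two, so len() is only an approximation."""
    return 0 < len(status) <= TWEET_LIMIT
```

In the script above, you could guard the oauth.post call with this check and print a warning instead of posting when it fails.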

As long as the receiving Twitter account(s) have notifications enabled, the commuter(s) will receive a notification every time their @username is included in the Tweet body.

Deploying your app with DigitalOcean

I chose to use DigitalOcean to deploy this app to a remote server.

  • Login to your DigitalOcean account (or sign up for one)

  • Create a new project and, within that project, create a new droplet by selecting "Create" in the top right-hand corner. Here is a good tutorial on how to create new droplets.

A "droplet" is a Linux-based virtual machine. In other words, each new droplet you create acts as a new server you can use to deploy projects.

  • Once you’ve created a droplet, you will have to connect to your droplet. This tutorial explains how you can do this.

Once you have connected to the droplet (i.e. you are connected to the server) you will have to upload your project to that server. You can think of your droplet (or server) as another local machine on which you can add your project by cloning the repository from GitHub.

  • If you haven’t already, add your project to GitHub by following this tutorial. Make sure to add your credentials.yaml file to your .gitignore file, as described above. Remember not to accidentally push any credentials to GitHub. Check out this guide on how to secure your keys and access tokens.

  • This article explains how to clone GitHub repositories.

  • Once you’ve added your project, you will have to create a new credentials.yaml file (like the one you created previously) and save it in the project folder. You can use vim or nano as a text editor to add your credentials to the file:

# touch credentials.yaml
# vim credentials.yaml

Press ESC, then type :wq and hit Enter to save your changes and quit vim.

  • Also make sure to add a .gitignore file and add credentials.yaml to that file, to make sure you don’t accidentally commit any access keys and tokens:
# touch .gitignore
# vim .gitignore

Once again, press ESC, then type :wq and hit Enter to save your changes and quit vim.

  • Going forward, if you want to pull any changes from GitHub, you can run the following in your project directory:
# git pull
  • Then, install all required dependencies (as described previously) just as you would if you were on another local machine.

  • You’re now ready to run your app:

# python3 alerts.py

Congratulations! 🥳 Your app is now installed on a remote server.

Scheduling your app to automatically run with Cron

Cron allows you to schedule your app to run automatically at certain times and on certain days. For example, I only want alerts.py to run on weekdays, at 6.30am and at 4.30pm.

Here are the steps I followed to set up the Cron job:

  • On your remote server, open alerts.py with vim and change the two mentions of ./credentials.yaml to /root/met-line-alerts/credentials.yaml
  • Install python-crontab: $ pip install python-crontab
  • Create a new file cron.txt and open it with vim:
# touch cron.txt
# vim cron.txt
  • In that file, add the following line (type the letter "i" to insert text in vim):
30 6,16 * * 1-5 python3 /root/met-line-alerts/alerts.py

Save and quit vim mode with ESC :wq

This line tells Cron to execute the command python3 /root/met-line-alerts/alerts.py at minute 30 past hours 6 and 16 (i.e. at 6.30am and 4.30pm), on every day of the week from Monday through Friday.
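Field by field, that schedule expression breaks down as:

```
30 6,16 * * 1-5
│  │    │ │ └─── day of week (1-5 = Monday to Friday)
│  │    │ └───── month (* = every month)
│  │    └─────── day of month (* = every day)
│  └──────────── hour (6 and 16)
└─────────────── minute (30 past the hour)
```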

You can check your Cron schedule expressions on this website: https://crontab.guru/

You should also check the time on your remote Linux server, to ensure that you’re scheduling jobs in the correct time zone. You can do this by typing the date command: # date

  • Run the following command to initiate the Cron job:
# crontab cron.txt
  • You can list your active Cron jobs with:
# crontab -l
  • You can remove existing Cron jobs with:
# crontab -r

Conclusion

Hopefully this tutorial inspires you to build with the Twitter API to solve problems in your daily life. I used several libraries and services beyond the Twitter API to make this tutorial, but you may have different needs and requirements and should evaluate whether those tools are right for you.

Let us know if this inspires you to build anything on the Twitter community forums or by Tweeting us at @TwitterDev. You can also give us feedback on our Feedback platform.

