DEV Community

Mirzokhid Mukhsidov

Web Scraper with Python (Beautiful Soup) & Deployment of it into Heroku [Part2]

After writing and testing the code portion of my project, I pushed it to a Heroku server. Since running the program manually on a regular basis gets tedious over time, I scheduled it (a.k.a. a cron job) so that it runs automatically at a given time (every day, in my case). It turns out that Heroku does not allow unverified users (here is how to verify your account) to use add-ons, so at first I scheduled it manually with the Python schedule module. Later on, after I verified my account with a credit card, I was able to use the Heroku Scheduler. In this post we will go through both ways. First, however, we have to connect Python to the PostgreSQL database.


Connect Python to PostgreSQL
Heroku's "Connecting in Python" documentation describes how to connect to the PostgreSQL database on the Heroku server. First, install the psycopg2 package:
pip install psycopg2-binary
Then connect to DATABASE_URL with this package:

import os
import psycopg2

DATABASE_URL = os.environ['DATABASE_URL']

conn = psycopg2.connect(DATABASE_URL, sslmode='require')
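
Once the connection exists, the scraped data can be written through it. The following is a minimal sketch, not the actual code of this project: the table name scraped_items, its title and url columns, and the save_rows helper are all hypothetical names chosen for illustration. It relies only on the standard DB-API calls that a psycopg2 connection provides (cursor(), execute(), executemany(), commit()).

```python
# Hypothetical schema: a table with two text columns for the scraped data.
CREATE_SQL = """
CREATE TABLE IF NOT EXISTS scraped_items (
    title TEXT NOT NULL,
    url   TEXT NOT NULL
)
"""
INSERT_SQL = "INSERT INTO scraped_items (title, url) VALUES (%s, %s)"


def save_rows(conn, rows):
    """Create the table if needed, insert (title, url) tuples, and commit.

    `conn` is an open DB-API connection, such as the one returned by
    psycopg2.connect(DATABASE_URL, sslmode='require').
    Returns the number of rows handed to the database.
    """
    rows = list(rows)
    with conn.cursor() as cur:  # the cursor is closed when the block exits
        cur.execute(CREATE_SQL)
        # %s placeholders let the driver escape the values safely.
        cur.executemany(INSERT_SQL, rows)
    conn.commit()  # nothing is persisted until the transaction is committed
    return len(rows)
```

With the conn object from above, a call could look like save_rows(conn, [("Example title", "https://example.com")]).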



Scheduling with Python Schedule
The Python schedule module, as the name suggests, runs Python functions (or any other callable) periodically using a friendly syntax.

We install it with the command:
$ pip install schedule

Import the schedule and time modules:

import schedule
import time

Define a function and schedule it:

def function_name():
    # TODO: put the work to run on schedule here (for example, the scraper)
    pass

schedule.every(10).minutes.do(function_name)
schedule.every().hour.do(function_name)
schedule.every().day.at("10:30").do(function_name)
schedule.every().monday.do(function_name)
schedule.every().wednesday.at("13:15").do(function_name)
schedule.every().minute.at(":17").do(function_name)

while True:
    schedule.run_pending()
    time.sleep(1)
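
One thing to keep in mind with a loop like this: schedule does not catch exceptions raised inside a job, so a single failed run (say, the scraped site being unreachable) would propagate out of run_pending() and stop the worker. A possible guard, sketched here with the standard library only, is a small decorator that logs the error and lets the loop carry on. The names catch_exceptions and scrape are hypothetical and used only for illustration.

```python
import functools
import logging


def catch_exceptions(job):
    """Wrap a job so that an exception is logged instead of propagating."""
    @functools.wraps(job)
    def wrapper(*args, **kwargs):
        try:
            return job(*args, **kwargs)
        except Exception:
            # Log the full traceback, then return normally so that the
            # scheduling loop keeps running and retries at the next slot.
            logging.exception("scheduled job %s failed", job.__name__)
            return None
    return wrapper


@catch_exceptions
def scrape():
    # Hypothetical job body; here it simply simulates a failure.
    raise RuntimeError("site unreachable")
```

The wrapped function is registered like any other, for example schedule.every().day.at("10:30").do(scrape).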

Source: https://schedule.readthedocs.io/en/stable/
https://www.youtube.com/watch?v=qquCAgwvL8Q


Pushing the code into the Heroku Server
Heroku is a popular cloud platform. The Getting Started on Heroku with Python tutorial shows in detail how to install the Heroku CLI on your machine and push your project to the server using Git.
Keep in mind that, unlike in the tutorial above, our Procfile must use the worker process type instead of web, because the scraper is a background job and not a web app!




Scheduling with Heroku Scheduler
For free dynos, Heroku gives you 550 hours per month (read more about dynos), plus 450 more hours if you verify your account.
Running your code on Heroku with Python Schedule can use up a lot of free dyno hours, because the worker process has to stay up between jobs. You can check how many free dyno hours remain with:
heroku ps
This is why we will take advantage of the Heroku Scheduler.
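
To see why, a rough back-of-the-envelope calculation helps. The sketch below assumes a worker dyno that stays up around the clock and, for comparison, a scheduled job that runs once a day for a hypothetical 6 minutes. It also assumes that only the time a scheduled job actually runs counts toward dyno hours. The 550 and 450 figures are the ones quoted above; the function names are made up for this example.

```python
FREE_HOURS = 550       # monthly free dyno hours quoted in this post
VERIFIED_BONUS = 450   # extra monthly hours for a verified account


def worker_hours(days_in_month):
    """Hours used by a worker dyno that never stops (the schedule loop)."""
    return 24 * days_in_month


def scheduler_hours(days_in_month, minutes_per_run, runs_per_day=1):
    """Hours used when a job starts, runs for a few minutes, and exits."""
    return days_in_month * runs_per_day * minutes_per_run / 60


always_on = worker_hours(31)
print(always_on)                                # 744
print(always_on > FREE_HOURS)                   # True: over the unverified quota
print(always_on <= FREE_HOURS + VERIFIED_BONUS) # True: fits only once verified
print(scheduler_hours(31, minutes_per_run=6))   # a few hours for the whole month
```

Under these assumptions, an always-on worker alone exceeds the 550 hours of an unverified account in a 31-day month, while a short daily job uses only a small fraction of the quota.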

Go to the "Resources" section of your app
Find Heroku Scheduler and add it
Click on the Heroku Scheduler add-on
Create a job for a suitable time period and save it

At the end, you can check your work with
heroku logs --tail



Disclaimer!
Starting November 28th, 2022, free Heroku Dynos, free Heroku Postgres, and free Heroku Data for Redis will no longer be available.
More information
https://blog.heroku.com/next-chapter
