Mahmoud Mabrouk for Agenta AI

Posted on Sep 6, 2023

🔥🤖 Build an AI-Powered Discord Bot to Recommend HackerNews Posts using OpenAI, Novu and Agenta 🚀

#ai #chatgpt #tutorial #python

TL;DR

In this tutorial, you'll learn how to create an AI agent that alerts you about relevant Hacker News posts tailored to your interests. The agent sends Discord notifications whenever a post matches your criteria.

We'll write the code using Python, use Beautifulsoup for web scraping, use OpenAI with Agenta for building the AI, and Novu for Slack notifications.

Goal? No more endless scrolling on Hacker News. Let the AI bring worthy posts to you!

Agenta: The Open-source LLM app builder 🤖

A bit about us: Agenta is an end-to-end open-source LLM app builder. It enables you to quickly build, experiment, evaluate, and deploy LLM apps as APIs. You can use it by writing code in Langchain or any other framework, or directly from the UI.

Here's the plan 📝

We will write a script that does the following:

Scrape the first five Hacker News pages for post titles using Python and Beautifulsoup.
Use Agenta and GPT3.5 to categorize posts based on your interests.
Send compelling posts to your Slack channel.

Setting things up:

To get started, let's create a project folder and run Poetry. If you aren't familiar with Poetry, you should check it out as it provides an alternative to virtual environments that is much easier to use.

mkdir hnbot; cd hnbot/
poetry init

This command will guide you through creating your pyproject.toml config.


Would you like to define your main dependencies interactively? (yes/no) [yes] yes

Package to add or search for (leave blank to skip): novu
Add a package (leave blank to skip): beautifulsoup4


Do you confirm generation? (yes/no) [yes] yes

Follow the prompts to set up your project. Don't forget to install novu and beautifulsoup4.

Now let's create the folder for our package and initialize the poetry environement

% mkdir hn_bot; cd hn_bot
% cd hn_bot 
% poetry shell
(hn-bot-py3.9) (base) % poetry install

Now, we have a local environment where:

All our requirements are installed.
We have a Python package called hn_bot that is in our Python lib.

This means if we have multiple files in our library, we can import them using import hn_bot.module_name.

Scraping Hacker News Posts

Scraping the Hacker News page is straightforward since it does not use any complicated JavaScript. The pages are located at https://news.ycombinator.com/?p=pagenumber.

To find the titles and links on the page, we just need to open the web browser and access the dev console. Once there, we can check if there are any elements we can use to locate the titles and links. Luckily, it seems that every post is a span with the class "titleline."

We can use this to extract information from a single page. Let's write a function that extracts titles and links from Hacker News.

# hn_scraper.py
from typing import Dict, List

import requests
from bs4 import BeautifulSoup

def scrape_page(page_number: str) -> List[Dict[str, str]]:
    response = requests.get(f"https://news.ycombinator.com/news?p={page_number}")
    yc_web_page = response.text
    soup = BeautifulSoup(yc_web_page, 'html.parser')

    articles = []

    for article_tag in soup.find_all(name="span", class_="titleline"):
        title = article_tag.getText()
        link = article_tag.find("a")["href"]
        articles.append({"title": title, "link": link})

    return articles

We can test it by adding a print(scrape_page(1)) at the end of the script and running it on shell:

 % python hn_scraper.py
(['Linux Network Performance Parameters Explained (github.com/leandromoreira)', 'Double Commander – Changes in version 1.1.0 (github.com/doublecmd)', 'If You’ve Got a New Car, It’s a Data Privacy Nightmare (gizmodo.com)', 'Ask HN: I’m an FCC Commissioner proposing regulation of IoT security updates', 'Gcsfuse: A user-space file system for interacting with Google Cloud S

Congratulations!🎉 Now we have a script that scrapes post titles from HackerNews

Creating the AI Agent 🤖

Now that we have a list of posts, we need to use OpenAI gpt models to classify whether they are relevant based on the user's interests. For this, we are going to use Agenta.

Agenta allows you to create LLM apps from code or from the UI. Since our LLM app today is quite simple, we will create it from the UI.

Agenta can be self-hosted, however to get started quickly we'll use demo.agenta.ai.

Since our LLM app today is quite simple, we will just go ahead and create it from the UI.

You can self host agenta (Check out docs for that here (https://docs.agenta.ai/installation/local-installation/local-installation) or use the cloud-hosted demo. To get started quickly we'll do the later.

Let's go to demo.agenta.ai and login.

First, let's create a new app by clicking on "Create New App".

Then we select start from template

And use a single prompt template

Doing some Prompt Engineering In Agenta 🪄 ✨

Now we have a playground for creating the app.

First, let's add the inputs for our application. In this case, we will be using "title" for the Hacker News title and "interests" for the user's interests.

Next, we need to do a little prompt engineering. Since we are using gpt3.5 (the cheapest variant in OpenAI). It takes two messages: the system message and the user message. We can use the system message to guide the language model to answer in a certain way, while the prompt prompts the human to give the parameters of the task.

In this case, I tried a simple prompt for the system that ensures the answer is either "True" or "False." For the human prompt, I just asked the system to classify. Note that we used the fstring usual format to inject the inputs that we have added into the prompt.

Now we can then test the application with some examples of Hacker News titles:

Agenta provides tools to systematically evaluate applications and optimize prompts, parameters, and workflows (in case we are using something more complex with embeddings and retrieval augmented generations). However, in this case, such evaluation is unnecessary. The app itself is very simple, and gpt3.5 is able to solve the classification problem with minimal effort.

Let's save our changes

Then deploy the application as an API.

For this we jump to the endpoints menu and copy paste the code snippet to our code.

Wrapping it up 🌯

Now, we can create a function based on this code snippet.

# llm_classifier.py
import requests
import json

def classify_post(title: str, interests: str) -> bool:

    url = "https://demo.agenta.ai/64f1d1aefeebd024bbdb1ea4/hn_bot/v1/generate"
    params = {
        "inputs": {
            "title": title,
            "interests": interests
        },
        "temperature": 0,
        "model": "gpt-3.5-turbo",
        "maximum_length": 100,
        "prompt_system": "You are an expert in classification. You answer only with True or False.",
        "prompt_human": "Classify whether this hackernews post is interesting for someone with the following interests:\nHacker news post title: {title}\nInterests: {interests}",
        "stop_sequence": "\n",
        "top_p": 1,
        "frequence_penalty": 0,
        "presence_penalty": 0
    }

    response = requests.post(url, json=params)

    data = response.json()

    return bool(data)

Sending a Discord message 🎮

First we need to create a new channel in Discord

Next we need to create a webhook and copy the url

Now we need to setup the integration in Novu. For this we have to go to the Integration Store, click on “Add a provider”, select Discord, and don't forget to activate it!

Last, we need to create a workflow that triggers the message to be sent to our Discord. We will add the {{content}} variable to the message which we will later inject using the code.

Write the messaging function

Now it's time to write the message that will trigger the workflow

# novu_bot
from novu.config import NovuConfig
from novu.api import EventApi
from novu.api.subscriber import SubscriberApi
from novu.dto.subscriber import SubscriberDto

NovuConfig().configure("https://api.novu.co", "YOUR_API_KEY")
webhook_url = "..." # the webhook url we got from Discord

def send_message(msg):
    your_subscriber_id = "123"  # Replace this with a unique user ID.

    # Define a subscriber instance
    subscriber = SubscriberDto(
        subscriber_id=your_subscriber_id,
        email="abc@gmail.com",
        first_name="John",
        last_name="Doe"
    )

    SubscriberApi().create(subscriber)
    SubscriberApi().credentials(subscriber_id=your_subscriber_id,
                                provider_id="discord",

    EventApi().trigger(
        name="slackbot",  # The trigger ID of the workflow. It can be found on the workflow page.
        recipients=your_subscriber_id,
        payload={},  # Your Novu payload goes here
    )

Putting everything together

Now we're ready to assemble all the elements to get our AI assistant running.

Let's create an app.py file in which we first call the scraper, then the LLM classifier, and finally send a message with the interesting posts.

from hn_bot import hn_scraper, llm_classifier, novu_bot
import schedule
import time

interests = "LLMs, LLMOps, Python, Infrastructure, Tennis, MLOps, Data science, AI, startups, Computational Biology"

def main():
    novu_bot.send_message("Interesting posts at HackerNews:\n")

    posts = hn_scraper.scrape_page("1")
    for title, url in posts:
        if llm_classifier.classify_post(title, interests) == "True":
            novu_bot.send_message(f"{title}\n{url}")

if __name__ == "__main__":
    main()

Et voila, we have all the interesting posts coming up in our Discord.

Finally let's schedule this to run each hour ⏰

We would like to run the script to check new posts each hour. For this we need to add the python library schedule

from hn_bot import hn_scraper, llm_classifier, novu_bot
from time import sleep
interests = "LLM, LLMOps, MLOps, Data science, AI, startups"

done_post_titles = []


def main():
    novu_bot.send_message("Interesting posts at HackerNews:\n")

    posts = []
    for i in range(1, 5):
        posts += hn_scraper.scrape_page(i)
    for post in posts:
        title = post["title"]
        url = post["link"]
        if llm_classifier.classify_post(title, interests) and title not in done_post_titles:
            done_post_titles.append(title)
            novu_bot.send_message(f"{title}\n{url}")


if __name__ == "__main__":
    while True:
        main()
        sleep(3600)

Congratulations on making it thus far!🎉 You've now got an automated AI assistant keeping an eye on Hacker News for you.

Summary 📜

In this tutorial, we've built an AI-powered assistant to keep you in the loop with relevant Hacker News posts. You should have learned:

How to use Beautifulsoup for scraping hackernews
How to create an LLM app based on one prompt using Agenta and OpenAI gpt3.5
How to send notifications on Discord using Novu

You can check the code at this https://github.com/Agenta-AI/blog/tree/main/hackernews-bot

Thanks for reading!

Top comments (5)

Nevo David • Sep 7 '23

This is so cool!

Mahmoud Mabrouk Agenta AI • Sep 7 '23

Thanks @nevodavid !

Akrem • Sep 7 '23

do we need an openai key? if yes where do we put this ? also in code?

Mahmoud Mabrouk Agenta AI • Sep 7 '23

For using the demo cloud version (demo.agenta.ai) you do not need an OpenAI API key (it's provided by us). If you are using the self-installed version, you'll need to provide yours when creating a new application in the UI.

lexa • Sep 7 '23

Great article! I appreciate your detailed explanation of how to build an AI-powered Discord bot for recommending HackerNews posts using OpenAI, Novu, and Agenta. It's fascinating to see how AI and NLP tools can be leveraged to enhance user experiences within chat platforms like Discord. The step-by-step guide you've provided makes it accessible for developers to create their own AI-driven bots. Looking forward to exploring the potential of these tools in more projects