In today's digital age, messaging platforms have become integral to our daily communication. WhatsApp, with its vast user base and powerful features, offers an excellent opportunity to build intelligent chatbots that can provide instant and personalized assistance. In this article, we will explore how to leverage the power of Flask, Twilio WhatsApp, LangChain, and ChatGPT models to create an intelligent WhatsApp bot that excels in PDF question-answering.
Twilio, a leading cloud communications platform, provides an array of tools and services to integrate messaging capabilities into your applications seamlessly. By combining Twilio's messaging API with Flask, a lightweight and flexible Python web framework, we can develop a robust backend for our WhatsApp bot.
To tackle the challenge of processing PDF files and extracting relevant information, we will dive into the realms of language models and natural language processing. OpenAI's advanced language models, such as GPT-3.5 Turbo, offer state-of-the-art capabilities for understanding and generating human-like text. By harnessing the power of OpenAI models through LangChain, we can transform our bot into a knowledgeable assistant that can answer questions based on the contents of uploaded PDF files.
Throughout this article, I will guide you step-by-step, drawing inspiration from Twilio's comprehensive publication guides. You will learn how to set up your development environment, handle incoming messages from WhatsApp, process PDF files, generate document embeddings, and perform question-answering tasks using Twilio, Flask, and powerful language models.
By the end of this tutorial, you will have a fully functional WhatsApp bot capable of providing accurate and insightful answers to questions posed by users. This opens up exciting possibilities for customer support, information retrieval, and automation, empowering businesses and individuals with an intelligent conversational agent at their fingertips.
So, let's dive into the world of Flask, Twilio, OpenAI, and LangChain and embark on a journey to build a remarkable WhatsApp bot that revolutionizes the way we interact with PDFs and obtain instant knowledge.
- Python: Make sure you have Python installed on your system.
- Twilio Account: To send and receive messages, you need a Twilio account. You will need your ACCOUNT_SID, AUTH_TOKEN, and a Twilio phone number.
- OpenAI API Key: To use OpenAI's GPT-3.5 model, you need an API key.
- Ngrok: Ngrok is a tool that allows you to expose your local Flask server to the internet.
To get started you'll need the following packages:
- dotenv Library: Allows you to load environment variables in your app. Install using
pip install python-dotenv
- Langchain: Allows you to use multiple tools for building AI powered apps. Install using
pip install langchain
- PyPDF2 Library: You will use the PyPDF2 library to read and extract text from PDF files. Install using
pip install PyPDF2
- Twilio Python Library: Allows you to interact with the Twilio API in Python. Install using
pip install twilio
- OpenAI: Enables you to have access to OpenAI's GPT models. Install using
pip install openai
You will also need Flask itself and the requests library, which the code below relies on; install them with pip install flask requests. Even after installing these packages, you might be prompted to install additional dependencies when you run the app.
To illustrate the process, you will configure your Twilio account to utilize WhatsApp by utilizing the Twilio Sandbox for WhatsApp. Access the WhatsApp Sandbox within your Twilio Console by navigating to the Messaging section on the left sidebar (if you don't see it, click on Explore Products to reveal the product list, where you can find Messaging). Next, expand the "Try it out" dropdown and select "Send a WhatsApp message" from the options. You will then see this:
We will then go into our code editor and create an app.py file and a .env file. In our .env file we will go ahead and include the following:
TWILIO_ACCOUNT_SID = xxxxxxxx
TWILIO_AUTH_TOKEN = xxxxxxxx
TWILIO_PHONE_NUMBER = xxxxxxxx
OPENAI_API_KEY = xxxxxxxx
Substitute the xxxxxxxx placeholders with the corresponding credentials you obtained in the prerequisites.
Now that you've added all your tokens to your .env file and created an app.py file, go ahead and import all your dependencies:
```python
from flask import Flask, request
import os
import requests
import tempfile
from dotenv import load_dotenv
from twilio.twiml.messaging_response import MessagingResponse
from twilio.rest import Client
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings.openai import OpenAIEmbeddings
from PyPDF2 import PdfReader
from langchain.vectorstores import FAISS
from langchain.llms import OpenAI
from langchain.chains.question_answering import load_qa_chain

app = Flask(__name__)

@app.route("/message", methods=["POST", "GET"])
def message():
    return "Hello, world"

if __name__ == "__main__":
    app.run(debug=True)
```
The above is the bare bones of the PDF Q&A bot. Here is what each import does:
- MessagingResponse to build message replies in Twilio,
- Client to access your Twilio account,
- dotenv to load environment variables,
- RecursiveCharacterTextSplitter to split the text extracted from the uploaded PDF,
- OpenAIEmbeddings to create embeddings for the text chunks,
- requests to download the PDF file from Twilio,
- tempfile to create a temporary file for the uploaded PDF,
- PdfReader to read the data from the uploaded PDF file,
- FAISS to create a vector store for finding text in your PDF similar to your questions,
- load_qa_chain to create a question-answering chain.
Next, append this code to the message() function:
```python
    load_dotenv()
    OPENAI_API_KEY = os.getenv('OPENAI_API_KEY')
    account_sid = os.getenv('TWILIO_ACCOUNT_SID')
    auth_token = os.getenv('TWILIO_AUTH_TOKEN')
    client = Client(account_sid, auth_token)
    twilio_phone_number = os.getenv('TWILIO_PHONE_NUMBER')
    sender_phone_number = request.values.get('From', '')
    pdf_url = request.values.get('MediaUrl0')
    media_content_type = request.values.get('MediaContentType0')
    response = None
```
This code establishes a connection to the Twilio client by retrieving your credentials from environment variables. It also extracts key information from the incoming message, in particular the URL and content type of any attached PDF; the file itself is stored in an S3 bucket and will be requested later.
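For context, Twilio delivers incoming WhatsApp messages to your webhook as form-encoded POST parameters. A minimal sketch of pulling out the fields this bot relies on (the helper name is ours; the parameter names are Twilio's) might look like:

```python
def extract_webhook_fields(form):
    # Twilio sends message metadata as form-encoded POST parameters.
    # 'MediaUrl0' and 'MediaContentType0' are only present when the
    # message has an attachment, so they may come back as None.
    return {
        "sender": form.get("From", ""),
        "body": form.get("Body", ""),
        "pdf_url": form.get("MediaUrl0"),
        "media_type": form.get("MediaContentType0"),
    }
```

In the Flask handler, `request.values` plays the role of `form` here, merging query-string and POST data.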
Now add these variables above the @app.route decorator; they will act as global variables to be accessed later.
```python
pdf_exists = False
VectorStore = None
```
```python
    if media_content_type == 'application/pdf':
        global pdf_exists, VectorStore
        pdf_exists = True
        response = requests.get(pdf_url)
        with tempfile.NamedTemporaryFile(suffix='.pdf', delete=False) as temp_file:
            temp_file.write(response.content)
            temp_file_path = temp_file.name
        pdf = PdfReader(temp_file_path)
        text = ""
        for page in pdf.pages:
            text += page.extract_text()
        text_splitter = RecursiveCharacterTextSplitter(
            chunk_size=1000,
            chunk_overlap=200,
            length_function=len
        )
        chunks = text_splitter.split_text(text=text)
        embeddings = OpenAIEmbeddings()
        VectorStore = FAISS.from_texts(chunks, embedding=embeddings)
        response = "Received. You can now ask your questions."
```
Firstly, the code verifies whether a PDF file has been received. If a PDF file is detected, the global variable pdf_exists is set to True. Next, the code sends a request to the PDF's URL and retrieves the file. The file is stored in a temporary file and its contents are read. Then, the code iterates through the pages of the PDF and splits the extracted text into chunks of 1,000 characters with an overlap of 200 characters.
Afterwards, the code uses OpenAIEmbeddings to generate embeddings for the text chunks, which are stored in a FAISS VectorStore. Finally, a confirmation message is sent to indicate that the bot is ready to answer questions about the PDF.
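To make the chunking step concrete, here is a simplified, pure-Python sketch of fixed-size character chunking with overlap. It is an illustration only: the real RecursiveCharacterTextSplitter additionally tries to break on paragraph and sentence boundaries rather than cutting at exact character offsets.

```python
def chunk_text(text, chunk_size=1000, chunk_overlap=200):
    # Slide a window of chunk_size characters across the text,
    # stepping forward by (chunk_size - chunk_overlap) each time so
    # consecutive chunks share chunk_overlap characters of context.
    # Assumes chunk_overlap < chunk_size (otherwise the loop stalls).
    chunks = []
    step = chunk_size - chunk_overlap
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += step
    return chunks
```

The overlap matters for question-answering: a sentence that straddles a chunk boundary still appears intact in at least one chunk, so its embedding can match a related question.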
```python
    elif request.values.get('Body'):
        question = request.values.get('Body')
        if pdf_exists:
            docs = VectorStore.similarity_search(query=question, k=3)
            llm = OpenAI(model_name="gpt-3.5-turbo", temperature=0.4)
            chain = load_qa_chain(llm, chain_type="stuff")
            answer = chain.run(input_documents=docs, question=question)
            message = client.messages.create(
                body=answer,
                from_=twilio_phone_number,
                to=sender_phone_number
            )
            return str(message.sid)
        else:
            response = "No PDF file uploaded."
```
The provided code begins by checking if a text was received. If a text was indeed received, it further checks if a PDF file was previously sent by examining the variable pdf_exists. Following this, the code utilizes the VectorStore to search for similar texts based on the question provided. It then employs the gpt-3.5-turbo model to generate an answer based on the retrieved information. The generated answer is subsequently sent as a message, and the message SID (unique identifier) is returned.
However, if a text was sent but no PDF file was uploaded beforehand, the code sends a response stating "No PDF file uploaded."
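Under the hood, similarity_search compares the question's embedding against the stored chunk embeddings and returns the k closest chunks. As a toy illustration of that idea (not LangChain's or FAISS's actual implementation), here is a ranking based on plain cosine similarity:

```python
import math

def cosine_similarity(a, b):
    # Cosine of the angle between two vectors: 1.0 means same direction.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k_similar(query_vec, chunk_vecs, k=3):
    # Rank stored chunk vectors by similarity to the query vector
    # and return the indices of the k best matches.
    ranked = sorted(range(len(chunk_vecs)),
                    key=lambda i: cosine_similarity(query_vec, chunk_vecs[i]),
                    reverse=True)
    return ranked[:k]
```

FAISS does the same conceptual job at scale, with indexing structures that avoid comparing the query against every stored vector.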
```python
    else:
        response = "The media content type is not application/pdf"
        print(media_content_type)

    message = client.messages.create(
        body=response,
        from_=twilio_phone_number,
        to=sender_phone_number
    )
    return str(message.sid)
```
If the conditions in the if and elif statements are not met, the code will respond with a message stating "The media content type is not application/pdf."
Now go to your terminal and run:

```
python app.py
```

Your app will be running on localhost:5000. Next, expose it with Ngrok so that you can send and receive WhatsApp messages:

```
ngrok http 5000
```

You should see something like this:
Now, copy the circled forwarding URL, go back to your Twilio Sandbox settings, and paste it into the "When a message comes in" field, appending the /message route to match the Flask endpoint.
There you have it: you can now upload PDFs to your WhatsApp bot and ask it questions.
In conclusion, you have gained insights into various aspects of the code: how it verifies PDF files, retrieves and processes their contents, generates embeddings, and uses language models to answer questions. You have also seen which response messages are triggered when certain requirements are not fulfilled. With these details, you now have a solid understanding of the code's overall behavior and its outcomes in different scenarios.