CyprianTinasheAarons

Building a Document QA with Streamlit & OpenAI

What is Streamlit? 🚀

Streamlit is an open-source Python framework for data scientists and AI/ML engineers to deliver dynamic data apps with only a few lines of code.

Streamlit is exciting for AI engineers who want to quickly demo an idea or build proof-of-concept projects.

Streamlit's documentation is clear and easy for any developer to pick up. 📈


Some Fundamentals before we dive into our project 🧩

Installation 🛠️

To install Streamlit, we can run the following command:

pip install streamlit

To test if we have installed it successfully, we run the following:

streamlit hello

Once we have built our application script, e.g. streamlit_script.py, we can run it using the following command:

streamlit run streamlit_script.py

Displaying Text or Diagrams 📝

Using st.write we can display information in our app:

st.write("hello world")

Text Elements ✍️

We can display strings in different formats, e.g., markdown, title, header, and subheader:

st.markdown("*Streamlit* is **really** ***cool***.")

Widgets 🎛️

Streamlit has many widgets that include buttons, select boxes, checkboxes, etc.:

st.button("Click me")

Layout 🖼️

We can work with sidebars, columns, and expanders. For example, st.sidebar will show a sidebar on our app interface:

st.sidebar.write("I am a sidebar")

👉 Going through the Streamlit docs and cheat sheet will quickly get you up to speed on the rest of the syntax.

Hosting a Streamlit App 🌐

Hosting a Streamlit app is very easy when working with Streamlit Cloud.


Prerequisites 📋

  1. You are a Python developer.
  2. You have a basic understanding of Gen AI and LLMs such as OpenAI's GPT models.
  3. You love learning and upskilling.
  4. You have your preferred IDE set up, e.g. VS Code.

A Breakdown of our Document Question & Answer Streamlit application

We start by importing Streamlit and the OpenAI client into our streamlit_app.py file:

import streamlit as st
from openai import OpenAI

Next, we make use of st.title and st.write to display the title and description:

st.title("📄 Document Question Answering")
st.write(
    "Upload a document below and ask a question about it – GPT will answer! "
    "To use this app, you need to provide an OpenAI API key, which you can get [here](https://platform.openai.com/account/api-keys). "
)


Next, we use Streamlit's st.text_input function to collect the user's OpenAI API key, which gives our application its AI capabilities. Setting type="password" masks the key as it is typed:

openai_api_key = st.text_input("OpenAI API Key", type="password")


Lastly, to implement the core logic of the app, we start with an if not condition that checks whether the key is missing. If it is, we use st.info to ask the user to add it:

if not openai_api_key:
    st.info("Please add your OpenAI API key to continue.", icon="🗝️")

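The check works because st.text_input returns an empty string until the user types something, and an empty string is falsy in Python. Here is a minimal sketch of the same test as a plain function. The name key_is_missing is hypothetical, and stripping whitespace is an extra assumption, not something the app above does:

```python
def key_is_missing(api_key: str) -> bool:
    """Return True when no usable key was entered.

    Hypothetical helper: the app above uses `not openai_api_key` directly.
    Stripping whitespace is an added assumption so that a key made only
    of spaces also counts as missing.
    """
    return not api_key.strip()


print(key_is_missing(""))         # empty input box
print(key_is_missing("   "))      # only spaces
print(key_is_missing("sk-test"))  # placeholder value, not a real key
```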

The else branch contains our fully functional Doc QA:

else:

    # Create an OpenAI client.
    client = OpenAI(api_key=openai_api_key)

    # Let the user upload a file via `st.file_uploader`.
    uploaded_file = st.file_uploader(
        "Upload a document (.txt or .md)", type=("txt", "md")
    )

    # Ask the user for a question via `st.text_area`.
    question = st.text_area(
        "Now ask a question about the document!",
        placeholder="Can you give me a short summary?",
        disabled=not uploaded_file,
    )

    if uploaded_file and question:

        # Process the uploaded file and question.
        document = uploaded_file.read().decode()
        messages = [
            {
                "role": "user",
                "content": f"Here's a document: {document} \n\n---\n\n {question}",
            }
        ]

        # Generate an answer using the OpenAI API.
        stream = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=messages,
            stream=True,
        )

        # Stream the response to the app using `st.write_stream`.
        st.write_stream(stream)

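st.write_stream consumes the stream for us. For anyone curious about what is inside it, here is a rough sketch of pulling the text out by hand. It assumes each chunk exposes choices[0].delta.content, which is the shape the OpenAI Python client uses for streamed chat completions, and that content can be None on some chunks. The names iter_text and fake_chunk are made up for illustration, and the stand-in chunks are built with SimpleNamespace so the example runs without an API key:

```python
from types import SimpleNamespace
from typing import Iterable, Iterator


def iter_text(stream: Iterable) -> Iterator[str]:
    """Yield the text fragments carried by streamed chat-completion chunks."""
    for chunk in stream:
        if not chunk.choices:
            continue  # some chunks may carry no choices at all
        content = chunk.choices[0].delta.content
        if content:  # skip None and empty fragments
            yield content


def fake_chunk(text):
    """Build a stand-in object with the same attribute layout as a chunk."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


# Stand-in stream; a real one would come from
# client.chat.completions.create(..., stream=True).
fake_stream = [fake_chunk("Hello"), fake_chunk(None), fake_chunk(", world")]
answer = "".join(iter_text(fake_stream))
print(answer)
```

In the app, a generator like this could be passed to st.write_stream, which also accepts generators of strings.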

A Breakdown of the Code 🧐

  1. We initialize our OpenAI client with the key the user entered:

    client = OpenAI(api_key=openai_api_key)
    
  2. Using file_uploader from Streamlit, we let the user upload a .txt or .md file:

    uploaded_file = st.file_uploader(
        "Upload a document (.txt or .md)", type=("txt", "md")
    )
    
  3. Using text_area, we take the input from the user:

    question = st.text_area(
        "Now ask a question about the document!",
        placeholder="Can you give me a short summary?",
        disabled=not uploaded_file,
    )
    
  4. We check that the user has both uploaded a file and entered a question:

    if uploaded_file and question:
    
  5. We read the uploaded file's bytes and decode them into a string:

    document = uploaded_file.read().decode()
    
  6. We build the messages list, combining the document and the question into a single user message, and pass it to the OpenAI chat completions endpoint:

    messages = [
        {
            "role": "user",
            "content": f"Here's a document: {document} \n\n---\n\n {question}",
        }
    ]
    
    # Generate an answer using the OpenAI API.
    stream = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=messages,
        stream=True,
    )
    
  7. Finally, using write_stream, we stream the output:

    st.write_stream(stream)
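
Steps 5 and 6 can also be sketched as two plain functions, which makes them easy to try without running the app. The names read_document and build_messages are hypothetical. The sketch assumes the uploaded file behaves like a binary file object (io.BytesIO stands in for it here) and that the text is UTF-8, which is what .decode() uses by default. The errors="replace" fallback is an added assumption so that a stray non-UTF-8 byte does not raise an exception:

```python
import io


def read_document(uploaded_file) -> str:
    """Read a binary file-like object and decode its bytes as UTF-8."""
    raw = uploaded_file.read()
    # errors="replace" swaps undecodable bytes for U+FFFD instead of raising.
    return raw.decode("utf-8", errors="replace")


def build_messages(document: str, question: str) -> list:
    """Build the single user message the app sends to the chat endpoint."""
    return [
        {
            "role": "user",
            "content": f"Here's a document: {document} \n\n---\n\n {question}",
        }
    ]


# io.BytesIO stands in for the object returned by st.file_uploader.
fake_upload = io.BytesIO("Streamlit turns scripts into apps.".encode("utf-8"))
document = read_document(fake_upload)
messages = build_messages(document, "Can you give me a short summary?")
print(messages[0]["content"])
```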
    

Setting Up the Project Locally 🏗️

Clone the repository:

git clone git@github.com:CyprianTinasheAarons/document-qa.git
cd document-qa/

Create a virtual environment:

python3 -m venv venv

Activate the environment:

source venv/bin/activate

Install the requirements found in requirements.txt:

pip install -r requirements.txt

Yay!! Now we can run our code:

streamlit run streamlit_app.py



We navigate to our local URL and add our OpenAI key:

Get your API key from the OpenAI API Keys page: https://platform.openai.com/account/api-keys



🎉 Conclusion

Congratulations on getting this far! Now you can go and launch great AI solutions that will make the world better! 🎊

Feel free to follow me on Twitter for more updates and projects. Also, check out my website here. 🌐✨


📚 Resources
