DEV Community

Ayaka Hara


How to Easily Build a PDF Chatbot with RAG (Retrieval-Augmented Generation) Using Azure AI Studio's Prompt Flow

Developing a chatbot that can answer questions about PDF documents might seem like a task requiring extensive time and effort. However, with Azure AI Studio's Prompt Flow, the implementation becomes surprisingly straightforward. In this blog, we will delve into the process of creating a chatbot with RAG (Retrieval-Augmented Generation) that can respond to queries related to sample PDF data. We’ll also guide you through testing this chatbot using the chat feature in Azure AI Studio.


This post is part of a series that serves as a step-by-step guide to developing a chatbot with RAG.



Prerequisites

1. Create a project on Azure AI Studio

1-1. Build your own copilot

Go to ai.azure.com and make sure that the right directory (subscription) is selected.

Select "Build your own copilot".

Then, click "Create a new project".

1-2. Configure project details

Enter a project name, select "Create a new resource", then click "Next".

1-3. Create an Azure AI resource for your projects

Enter a resource name (the Azure AI resource name must be different from your project name), select an appropriate Azure location, then click "Next".

1-4. Review and create a project

Click "Create a project" in the "Review and finish" pane.

It will take around 2 minutes to deploy all required resources to Azure.

2. Deploy Azure OpenAI models

Once all resources have been deployed to Azure, you are automatically taken to the Playground in Azure AI Studio.
There you will see the message "No deployment exists: You need a deployment to work in the playground. Navigate to the Deployment page to create a deployment." Click the "Deployment page" link.

You will need at least two models, one of which must be an embedding model.

2-1. Create a new deployment

Go to the "Deployment" pane and click "Create".

2-2. Select a model

Select the model text-embedding-ada-002, then click "Confirm".

2-3. Deploy model

Finally, deploy the selected model.

Repeat 2-1 to 2-3 for the other model (e.g. gpt-35-turbo).

Once the models are deployed, they are listed in the Deployment pane.
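
To make the two-model requirement concrete, here is a minimal sketch that checks whether a list of deployed models covers both roles. The `ROLE_BY_MODEL` mapping and the `missing_roles` helper are hypothetical names used for illustration and are not part of Azure AI Studio; only the two model names come from the steps above.

```python
# Hypothetical mapping from model name to the role it plays in a RAG chatbot.
ROLE_BY_MODEL = {
    "text-embedding-ada-002": "embedding",  # turns text into vectors for search
    "gpt-35-turbo": "chat",                 # generates the final answer
}

REQUIRED_ROLES = {"embedding", "chat"}


def missing_roles(deployed_models):
    """Return the roles that no deployed model covers yet, sorted by name."""
    covered = {ROLE_BY_MODEL[m] for m in deployed_models if m in ROLE_BY_MODEL}
    return sorted(REQUIRED_ROLES - covered)


if __name__ == "__main__":
    # With only the embedding model deployed, the chat role is still missing.
    print(missing_roles(["text-embedding-ada-002"]))
    print(missing_roles(["text-embedding-ada-002", "gpt-35-turbo"]))
```

An empty result means both an embedding deployment and a chat deployment are in place, which is what the following sections rely on.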

3. Create an index on Azure AI Search

3-1. Select your dataset

Select "Upload files/folders" as the data source, then click "Upload" > "Upload files".

3-2. Configure index storage

Select the following settings:

  • "Connect other Azure AI Search resource"
  • The Azure subscription in which you deployed Azure AI Search
  • The Azure AI Search service you have already deployed

3-3. Configure search settings

Confirm the acknowledgement.

3-4. Configure index settings

Enter an index name and select a virtual machine* (e.g. auto select).
*The selected virtual machine will be used to run the indexing jobs.

3-5. Review and create an index

Finally, click "Create".

It will take around 10 minutes for all jobs to finish.
If you want to see what is happening behind the scenes, click "job details", which takes you to Azure ML Studio.

Once all jobs are completed, the index is shown with a "ready" sign.
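
This post does not spell out what the indexing jobs do. As a rough mental model only, and assuming the usual RAG indexing recipe (split documents into chunks, embed each chunk, store the vectors), the following sketch builds a tiny in-memory index. `chunk_text`, `toy_embed`, and `build_index` are hypothetical helpers, the bag-of-words "embedding" is a stand-in for a real embedding model such as text-embedding-ada-002, and none of this is the actual Azure AI Search implementation.

```python
import re
from collections import Counter


def chunk_text(text, max_words=50, overlap=10):
    """Split text into overlapping windows of at most max_words words.

    Assumes max_words > overlap so that each window moves forward.
    """
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def toy_embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def build_index(documents):
    """Return one entry per chunk: its source file, its text, and its vector."""
    index = []
    for source, text in documents.items():
        for chunk in chunk_text(text):
            index.append({"source": source, "text": chunk, "vector": toy_embed(chunk)})
    return index


if __name__ == "__main__":
    # In the real pipeline the text would come from the uploaded PDFs.
    sample = {"sample.pdf": "Retrieval-augmented generation grounds answers in your documents."}
    for entry in build_index(sample):
        print(entry["source"], "->", entry["text"])
```

Overlapping chunks are a common choice because a sentence cut at a chunk boundary still appears whole in one of the two neighbouring chunks.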

4. Configure Prompt Flow

In the Prompt flow pane, you will find "(your-index-name)-sample-flow" in the flow list.

When you open the flow, you will see that a basic flow is already prepared. However, some manual configuration is still required.

4-1. Create runtime

Create a runtime by selecting "automatic runtime start".
It will take about 5 minutes.

4-2. Configure parameters in each node

  • modify_query_with_history : Select a deployment name and set max tokens (e.g. 1000), then click "Validate and parse input"

  • embed_the_question : Click "Validate and parse input"

Note : The deployment name may be cleared after you click the validate button. Make sure that the right deployment name for embedding is still set.

  • search_question_from_indexed_docs : Click "Validate and parse input"

  • generate_prompt_context : Click "Validate and parse input"

  • Prompt_variants : Click "Validate and parse input"

  • answer_the_question_with_context : Select a deployment name and set max tokens (e.g. 1000), then click "Validate and parse input"
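
To show how the six nodes fit together, here is a minimal, self-contained sketch of the same pipeline in plain Python. The function names mirror the node names above (lowercased where needed), but every body is a toy stand-in and an assumption on my part: the real flow calls the deployed Azure OpenAI models and Azure AI Search, whereas this sketch uses a bag-of-words vector and an in-memory list so that it runs anywhere.

```python
import math
import re
from collections import Counter


def toy_embed(text):
    """Toy stand-in for the embedding deployment: bag-of-words counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two Counter vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a)  # Counter returns 0 for missing keys
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def modify_query_with_history(question, chat_history):
    """Stand-in: prepend the previous question so follow-ups keep their context."""
    if not chat_history:
        return question
    return f"{chat_history[-1]['question']} {question}"


def embed_the_question(query):
    return toy_embed(query)


def search_question_from_indexed_docs(query_vector, index, top_k=2):
    """Return the top_k index entries most similar to the query vector."""
    ranked = sorted(index, key=lambda e: cosine(query_vector, e["vector"]), reverse=True)
    return ranked[:top_k]


def generate_prompt_context(hits):
    """Format the retrieved chunks as a text block for the prompt."""
    return "\n\n".join(f"Source: {h['source']}\nContent: {h['text']}" for h in hits)


def prompt_variants(context, question):
    """Build the final prompt (one variant only in this sketch)."""
    return f"Answer using only the context below.\n\n{context}\n\nQuestion: {question}"


def answer_the_question_with_context(prompt, max_tokens=1000):
    """Stand-in for the chat model: return the best-ranked passage,
    truncated to max_tokens words (words approximate tokens here)."""
    for line in prompt.splitlines():
        if line.startswith("Content: "):
            return " ".join(line[len("Content: "):].split()[:max_tokens])
    return "I don't know."


def run_flow(question, chat_history, index):
    query = modify_query_with_history(question, chat_history)
    hits = search_question_from_indexed_docs(embed_the_question(query), index)
    prompt = prompt_variants(generate_prompt_context(hits), question)
    return answer_the_question_with_context(prompt)


if __name__ == "__main__":
    docs = {"a.pdf": "The warranty lasts two years.", "b.pdf": "Shipping takes five days."}
    index = [{"source": s, "text": t, "vector": toy_embed(t)} for s, t in docs.items()]
    print(run_flow("How long is the warranty?", [], index))
```

The sketch also suggests why two of the nodes ask for a deployment name and a max token value: in this model, the first and last nodes are the ones that stand in for calls to a chat model, while the nodes in between only embed, search, and format text.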

4-3. Save all configuration changes


5. Try Chat!

Finally, it's time to chat with your PDFs.
Click the "Chat" button, then type your question in the chat!

The answer will be returned in the same language as your input.

Conclusion

In this post, we have shown how to implement a chatbot with RAG (Retrieval-Augmented Generation) that answers questions about sample PDF data using Azure AI Studio's Prompt Flow, and how to test it with Azure AI Studio's chat feature. Next in our series, "How to Evaluate a PDF Chatbot Response with Prompt Flow", we will delve into methods for evaluating the performance of the chatbot we've just implemented.
