DEV Community

Shubhank92

Creating a Chatbot using the data stored in my huge database

Hello Everyone,

I want to build a custom chatbot that can answer questions based on the data in my database.
Below are my attempts so far and the problems I am facing.
I am open to all suggestions, so please do help me.

Tried without using LangChain

The code connects to a PostgreSQL database and asks the user what information they want.
It then generates a draft SQL query from the user input plus the table names in the database, using the OpenAI GPT-3.5 language model.
Next, it extracts the table names from the draft query and fetches the column information for those tables from the database. It builds a second prompt containing the table names and column details, and uses GPT-3.5 again to generate the final SQL query.
The final SQL query is executed on the database and, for now, the results are printed.
Overall, the code turns a natural-language question into SQL in two model calls and runs the result against the database.
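
The two-step flow described above can be sketched roughly as follows. This is a minimal illustration, not the actual code: `sqlite3` stands in for PostgreSQL so the sketch runs anywhere, `fake_llm` is a hypothetical stand-in for the GPT-3.5 call, and the table and column names (`purchase_orders`, `po_value`, and so on) are made up.

```python
import re
import sqlite3
from typing import Callable, List

# Assumption: sqlite3 replaces PostgreSQL here; all table/column names are hypothetical.

def make_demo_db() -> sqlite3.Connection:
    """Build a tiny in-memory database to run the sketch against."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE purchase_orders (account TEXT, po_value REAL, created_at TEXT)")
    conn.execute("CREATE TABLE employees (name TEXT, salary REAL)")
    conn.executemany(
        "INSERT INTO purchase_orders VALUES (?, ?, ?)",
        [("testaccount", 100.0, "2023-01-01"),
         ("testaccount", 250.0, "2023-03-01"),
         ("other", 999.0, "2023-04-01")],
    )
    return conn

def list_tables(conn: sqlite3.Connection) -> List[str]:
    rows = conn.execute("SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")
    return [row[0] for row in rows]

def extract_table_names(sql: str) -> List[str]:
    """Collect identifiers that follow FROM or JOIN, in order, without duplicates."""
    names = re.findall(r"\b(?:FROM|JOIN)\s+([A-Za-z_][A-Za-z0-9_]*)", sql, flags=re.IGNORECASE)
    return list(dict.fromkeys(names))

def describe_columns(conn: sqlite3.Connection, tables: List[str]) -> str:
    """Return one 'table(col type, ...)' line per table that really exists."""
    known = set(list_tables(conn))
    lines = []
    for table in tables:
        if table not in known:
            continue  # skip names the model may have invented
        cols = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        lines.append(f"{table}(" + ", ".join(f"{c[1]} {c[2]}" for c in cols) + ")")
    return "\n".join(lines)

def answer(question: str, conn: sqlite3.Connection, llm: Callable[[str], str]) -> list:
    # Step 1: draft query from the question plus table names only.
    draft = llm(f"Tables: {', '.join(list_tables(conn))}\nQuestion: {question}\nSQL:")
    # Step 2: final query from the question plus columns of the tables the draft used.
    schema = describe_columns(conn, extract_table_names(draft))
    final_sql = llm(f"Schema:\n{schema}\nQuestion: {question}\nSQL:")
    return conn.execute(final_sql).fetchall()

def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for the model call; returns canned SQL."""
    if prompt.startswith("Schema:"):
        return ("SELECT po_value FROM purchase_orders "
                "WHERE account = 'testaccount' ORDER BY created_at DESC LIMIT 1")
    return "SELECT * FROM purchase_orders"

if __name__ == "__main__":
    print(answer("what is my last po value for testaccount", make_demo_db(), fake_llm))
```

Because only the columns of the tables named in the draft query go into the second prompt, the prompt size depends on the question rather than on the size of the whole database.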

Tried using LangChain
import os
from langchain import OpenAI, SQLDatabase
from langchain.chains import SQLDatabaseSequentialChain

os.environ['OPENAI_API_KEY'] = "****"

dburi = 'postgresql://postgres:*@:/*'
db = SQLDatabase.from_uri(dburi)

# llm = OpenAI(temperature=0, model='text-curie-001')  # overridden by the assignment below
llm = OpenAI(temperature=0)
db_chain = SQLDatabaseSequentialChain(llm=llm, database=db, verbose=True)

resp = db_chain.run('what is my last po value for testaccount')
print(resp)

The problem I have faced is that the prompt size somehow grows to 129,300+ tokens.
I am unable to figure out why this is happening.
I also tried custom prompt templates, but that did not decrease the prompt size.
My impression is that LangChain just adds my custom prompt data to its own prompt and sends that to the OpenAI API, instead of sending only my custom prompt.
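
A possible explanation (an assumption, not confirmed for this setup) is that the schema text of every table in the database ends up in the prompt, so a large database alone can account for a six-figure token count. A quick way to check is to estimate how many tokens each table's schema would cost. The sketch below uses a rough heuristic of about 4 characters per token rather than a real tokenizer, and the `schemas` dictionary of table name to schema text is hypothetical input you would fill from your own database.

```python
from typing import Dict, List, Tuple

def estimate_tokens(text: str) -> int:
    """Very rough estimate: about 4 characters per token (heuristic, not a tokenizer)."""
    return max(1, len(text) // 4)

def schema_token_report(schemas: Dict[str, str]) -> Tuple[int, List[Tuple[str, int]]]:
    """Return the estimated total and a per-table list, largest table first."""
    per_table = sorted(
        ((name, estimate_tokens(ddl)) for name, ddl in schemas.items()),
        key=lambda item: item[1],
        reverse=True,
    )
    total = sum(tokens for _, tokens in per_table)
    return total, per_table

def tables_within_budget(schemas: Dict[str, str], wanted: List[str], budget: int) -> List[str]:
    """Keep tables from `wanted`, in order, until the next one would exceed the budget."""
    kept: List[str] = []
    used = 0
    for name in wanted:
        cost = estimate_tokens(schemas[name])
        if used + cost > budget:
            break
        kept.append(name)
        used += cost
    return kept

if __name__ == "__main__":
    # Hypothetical schema texts of different sizes.
    demo = {"orders": "x" * 4000, "accounts": "y" * 400, "flags": "z" * 40}
    total, per_table = schema_token_report(demo)
    print(total, per_table)
```

If the schema does turn out to be the cause, LangChain's `SQLDatabase.from_uri` accepts an `include_tables` argument that limits which tables are reflected; the exact behavior is worth checking against the documentation for the installed version.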

So if anyone can help me in any way, please do.

Other than these two methods, I have seen that there are also fine-tuning and embeddings.
I know how fine-tuning is done with OpenAI, but I don't have much idea about embeddings.
I want to know which one is better to use.
From what I have researched, in both cases we have to give all the database information to OpenAI,
which I think is not secure for my organization, as my users' privacy would be at risk.
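
For context on the embeddings option: an embedding maps a piece of text to a vector so that similar texts end up close together. One way this could apply here is to embed a short description of each table once, embed the incoming question, and put only the closest tables' schema into the prompt. The sketch below illustrates that retrieval step only. `toy_embed` is a bag-of-words stand-in for a real embedding model, and the table descriptions are hypothetical.

```python
import math
import re
from collections import Counter
from typing import Dict, List

# Hypothetical one-line descriptions of tables; only these (not row data) are embedded.
DESCRIPTIONS: Dict[str, str] = {
    "purchase_orders": "purchase orders: account name, po value, created date",
    "invoices": "invoices: invoice value, due date, account",
    "employees": "employees: name, salary, department",
}

def toy_embed(text: str) -> Counter:
    """Stand-in for a real embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def top_tables(question: str, descriptions: Dict[str, str], k: int = 2) -> List[str]:
    """Return the k table names whose descriptions are most similar to the question."""
    q = toy_embed(question)
    ranked = sorted(
        descriptions,
        key=lambda name: cosine(q, toy_embed(descriptions[name])),
        reverse=True,
    )
    return ranked[:k]

if __name__ == "__main__":
    print(top_tables("what is my last po value for testaccount", DESCRIPTIONS))
```

In this arrangement only the table descriptions and the question are embedded, not the rows themselves, which may matter for the privacy concern. With a real embedding model the vectors would capture meaning rather than exact word overlap, but the ranking logic stays the same.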

So, finally, what would be a better way to build a bot that can answer my questions based on the information in my database?
