DEV Community

Cohorte
Cohorte

Posted on • Originally published at Medium

Text-to-SQL: Bridging the Gap Between Natural Language and Database Insights

Democratizing data access and empowering users to extract insights from databases effortlessly is a pivotal step in today's data-driven landscape. Introducing the "Text-to-SQL" project, a powerful application that bridges the gap between natural language and complex SQL queries. This project builds upon the foundation of the NA2SQL app, leveraging the capabilities of OpenAI's GPT-3.5 and LlamaIndex to provide an intuitive and accessible way to interact with databases using natural language.

Features

  • Natural Language Understanding: Translates user-provided natural language queries into accurate SQL queries, eliminating the need for SQL expertise. -Data Retrieval and Analysis: Retrieves results from the database and interprets them using an LLM (GPT 3.5) to offer meaningful insights, empowering users to make data-driven decisions.
  • User-Friendly Interface: With Streamlit, the application offers a clean and intuitive user interface that includes "Streamlit pills" for executing sample queries. This feature significantly enhances user experience by providing a straightforward mechanism for data interaction.

Image description

Tools and Technologies

  • LLM: OpenAI's GPT-3.5, known for its advanced language understanding and generation capabilities.
  • LLM Orchestration: LlamaIndex (version 0.10.26), a robust framework for managing and interacting with LLMs.
  • Data Management: SQL Database with SQLite, providing a reliable and efficient data storage solution.
  • UI Framework: Streamlit, enabling interactive and user-friendly web application creation.

Why LlamaIndex over LangChain?

While LlamaIndex and LangChain offer valuable tools for LLM-based applications, we opted for LlamaIndex for its unique strengths in this Text-to-SQL project. LlamaIndex provides a more streamlined and integrated approach to building and managing LLM-powered applications. Its modular design allows for a flexible composition of components like LLMs, vector stores, and data connectors. Additionally, LlamaIndex's focus on data indexing and retrieval aligns perfectly with the core functionality of our Text-to-SQL application.

Deep Dive into the Code

The app.py script is a crucial component of the Text-to-SQL application, serving as the primary gateway through which users interact with the tool. Let's explore some of the critical functions and how they contribute to the application's functionality.

1. Database Connection and Table Display

The application utilizes SQLAlchemy to manage connections to a SQLite database, allowing it to perform operations more securely and efficiently. It uses this connection to fetch and display data based on user queries.

def get_table_data(table_name, conn):
    query = f"SELECT * FROM {table_name}"
    df = pd.read_sql_query(query, conn)
    return df
Enter fullscreen mode Exit fullscreen mode

This function retrieves all data from a specified table and displays it in the Streamlit sidebar. It demonstrates how the application interacts with the database at a fundamental level.

Image description

2. Loading the Database and Language Model

The function load_db_llm() initializes the database connection and sets up the OpenAI language model for generating SQL queries. It exemplifies how various components, like the database and language model, are integrated.

def load_db_llm():
    engine = create_engine("sqlite:///ecommerce_platform1.db?mode=ro", connect_args={"uri": True})
    sql_database = SQLDatabase(engine)  # include all tables
    Settings.llm = OpenAI(temperature=0.1, model="gpt-3.5-turbo-1106")
    return sql_database, Settings, engine
Enter fullscreen mode Exit fullscreen mode

This setup ensures the application can handle SQL queries securely and performantly while leveraging an advanced AI model to translate natural language into SQL.

3. Handling User Queries and Responses

The core interaction mechanism of the app is handled by the chat functionality, which captures user input and processes it through OpenAI’s GPT 3.5 via Llamaindex’s NLSQLTableQueryEngine. The query engine first converts the natural language user query to SQL. It displays the SQL query and then also executes it against the database. The SQL results are then passed along to the LLM and the user query, generating the final response.

if prompt := st.chat_input("Enter your natural language query about the database"):
    with st.chat_message("user"):
        st.write(prompt)
    add_to_message_history("user", prompt)

    # If last message is not from assistant, generate a new response
    if st.session_state["messages"][-1]["role"] != "assistant":
        with st.spinner():
            with st.chat_message("assistant"):
                response = st.session_state["query_engine"].query("User Question:"+prompt+". ")
                sql_query = f"```

sql\n{response.metadata['sql_query']}\n

```\n**Response:**\n{response.response}\n"
                response_container = st.empty()
                response_container.write(sql_query)
                add_to_message_history("assistant", sql_query)
Enter fullscreen mode Exit fullscreen mode

Enhancing the User Experience with Streamlit Pills

A notable UI enhancement in the application is the introduction of Streamlit pills, which simplify the interaction by allowing users to select example queries quickly.

query_options = ["None", "Show data for '4K LED Smart TV'", "Show all books and their reviews", "Analyse reviews for Electronics"]
selected_query = pills("Select example queries or enter your own query in the chat input below", query_options, key="query_pills")
Enter fullscreen mode Exit fullscreen mode

This functionality makes the application more user-friendly and demonstrates how complex functionalities can be encapsulated in simple, intuitive UI elements.

Image description

Conclusion

The app.py script effectively integrates multiple advanced technologies to create a seamless and powerful user experience. By leveraging SQL databases, large language models, and an intuitive Streamlit interface, the Text-to-SQL application is a robust example of modern software engineering and AI capabilities. This deep dive into the codebase highlights how each component plays a vital role in the application's functionality, showcasing the intricate balance between backend logic and frontend design.

Top comments (0)