Femi-ige Muyiwa for Hackmamba

Posted on Feb 21, 2024

Building AI-powered mobile app with RAG, pgvector & OpenAI

#openai #tutorial #mobile #flutter

Retrieval augmented generation (RAG) is an AI technique for enhancing the response capabilities of large language models (LLMs) by providing them with an external source of knowledge. This framework ensures that the model can access current and reliable facts because the users can access the model sources.

In this tutorial, you’ll learn how to implement the RAG technique by building a mobile application that gives a detailed response with respect to an external data source (CSV file) when queried. We’ll also cover how to implement this functionality with Neon Postgres, Flutter framework, and OpenAI.

Check out the link to the GitHub repository containing all the code.

As you begin, this tutorial will explore the following:

Creating a Neon Postgres project, database, and table.
Component of RAG
Building a Flutter application
Connecting Neon to the Flutter application

Prerequisites

We’ll need a few things before getting started:

A good understanding of the Flutter framework and dart programming.
A Neon account.
An OpenAI API key; thus, you need an OpenAI account.

Breaking down the technologies

Let’s talk a bit about the tools we’ll be using. Neon is a fully managed serverless Postgres with a generous free tier that provides separate storage and computing to offer autoscaling, branching, and bottomless storage. Neon is fully open source under the Apache 2.0 licenses, and you can find the neondatabase on GitHub.

We’ll also be using the pgvector Postgres extension on Neon. pgVector is a Postgres extension that works with vector embeddings for storage, similarity search, and more.

Enabling the pgvector extension in your database simplifies storing vector embeddings as well as easy querying using the inner product (<#>) or cosine distance (<=>).

Note: vector embeddings are numerical representations of data that capture semantic relationships and similarities

The project further explores the use case of the RAG technique by employing the Flutter framework. Flutter is a UI development toolkit developed by Google that supports cross-platform app-building using the same codebase. The use of the Flutter application for this project is to give a programmatic and visual understanding of how the RAG technique works. Thus, we will build a simple chatbot that offers a response based on the external data source provided at a time.

Finally, the project comes to life using both technologies by pairing with a Large Language Model (LLM) via OpenAI to process the external data source and deliver a response in natural language.

Some paragraphs above mentioned a mystery phrase called vector embeddings, leading to more questions. The coming sections will demystify this concept by giving a better understanding of the things needed to implement the RAG technique in your application.

Let's begin without further ado.

Setting up a Neon Postgres project

Neon offers a simple method of setting up a project — streamlined to three things:

Project name
Postgres version
Database name

This option is made available after signing into your Neon account. On the project setup page, there are more options to explore; for example, you can change the branch name to any other name, but advisably, leave it as main for now and click Create project.

Now, on the homepage, you’ll find a popup containing all the connection details required to connect your application(s) to a Neon database. Copy these details to a safe file, such as a JSON file.

Note: For security reasons, you should use a .env file to secure your connection details. Check out this article to learn more about how to use flutter_dotenv.

With Neon, you can manage your database using a few options, such as a Command Line Interface (CLI), the Neon API, or a Structured Query Language (SQL) tool. Neon also provided an SQL Editor to write SQL commands directly on the console. You are in luck with SQL commands, as Flutter supports a Dart package Postgres that makes this possible.

The coming sections will shed more light on using Flutter and Neon Postgres pgvector extension and OpenAI to implement RAG.

Creating the Flutter application

Cloning the UI template

The project implements a Flutter template for faster development by making the UI code available using GitHub. To begin, clone the template from the repository using the command below:



git clone https://github.com/muyiwexy/neon_rag_with_openai.git

Note: The template contains the Flutter UI code and some defragmented codes, which will cause some debug errors.

After cloning the project, run the command below to obtain all the dependencies listed in the pubspec.yaml file in the current working directory and their transitive dependencies.:



flutter pub get

We employed the Model View Controller (MVC) architecture to handle specific development aspects of the application and give it a clear separation of concern. The architecture helps maintain readability by separating the business (core) logic from the UI or view (presentation layer). The cloned template won't compile at the moment as some things must be implemented as you progress.

Here’s a helpful ASCII representation of the main Flutter development folder (lib) using the MVC design pattern:



lib/
├─ home/
│  ├─ view/
│  │  ├─ home_page.dart
├─ core/
│  │  ├─ config/
│  │  │ ├─ service_config.dart
│  │  ├─ dependency_injection/
│  │  │ ├─ provider_locator.dart
├─ main.dart

Notes: The intended data source is a comma-separated values (CSV) file called mydata.csv in the assets folder.

The external dart plugins used during the project (they are in the template pubspec.yaml file) :



dependencies:
  csv
  flutter_dotenv
  http
  postgres
  provider

Implementing the RAG technique

Implementing a Flutter project as a RAG application involves the functionalities being classed into two components:

Indexing
Retrieval

Indexing

Indexing allows the system to receive information from an extensive data set by creating means of data collection, encoding, and storage. To successfully index the intended data (mydata.csv), you will need to approach the indexing component in three processes:

Load
Split
Store

Load
The load process is usually offline; thus, with the CSV package, you can convert a CSV file to a list of Strings with the convert method of the CsvToListConverter() class.

To load a CSV file, head to the lib/home directory and create a new controller directory. Within this newly created directory, create a file called openai_indexing_services.dart to handle the abstraction of the business logic from the UI.

The openai_indexing_services.dart will handle all the abstractions for the methods in this project, leading to some back and forth in the coming sections. For the first method, create a function loadCSV() to handle loading the CSV file in this project.



abstract class OpenAIIndexingServices {
  Future<List<List<dynamic>>> loadCSV();
}

Still, within the lib/home/controller directory, create another file called openai_indexing_services_impl.dart to handle the implementation of the abstraction OpenAIIndexingServices above. It does this using inheritance — one of the fundamentals of object-oriented programming (OOP).

The implementation of the loadCSV() function uses the loadString method in the rootbundle property to get the CSV file from its root directory. Then, the convert method in the CsvToListConverter() class (called from the CSV package) converts the CSV document to a list of lists (with each inner list representing a row of the CSV documents) and then returns the value.



class OpenAIIndexingServicesImpl extends OpenAIIndexingServices {

  @override
  Future<List<List<dynamic>>> loadCSV() async {
    debugPrint("Loading File... ");
    final rawcsvFile = await rootBundle.loadString("assets/mydata.csv");
    final csvData =
        const CsvToListConverter().convert(rawcsvFile.splitMapJoin("/n"));
    return csvData;
  }
}
void debugPrint(String message) {
  if (kDebugMode) {
    print(message);
  }
}

Split
Next, you’ll need to split this data into smaller chunks and embed them as vectors to aid the indexing and passing of large data into models, as large data takes too long to search over.

To split and embed the loaded data into chunks, update the abstraction in the lib/home/controller/openai_indexing_services.dart with the code below:



abstract class OpenAIIndexingServices {
  Future<List<List<dynamic>>> loadCSV();
  Future<List<List<String>>> splitToChunks(List<List<dynamic>> csvDoc);
  Future<List<List<double>>> getEmbeddings(List<List<String>> chunks);
}

The splitToChunks method takes in the value returned from the loadCSV (csvDoc) as an argument and returns a list of lists of Strings.

The implementation of the splitToChunks method in the lib/home/controller/openai_indexing_services_impl.dart takes a list of lists (csvDoc) as a parameter, where each inner list in the csvDoc represents a row in a CSV document.

Then, it initializes an empty list chunkList to store chunks of split data ( for later) and iterates over each row in csvDoc using a for loop, starting from the second row (var i = 1), treating each row as a document.

Within the iteration or for loop, specify the index of the content intended to be split, which then gets the content string for every row. After, obtain the ideal size of a chunk through the floor division of a set max idealTokenSize (512) by 1.3 and explicitly convert the subsequent result to an integer.

Next, split the content string gotten earlier into words using a space delimiter (""), filter a list of words, remove occurrences of the delimiter, and assign the resulting list back to the variable. Then, it gets the length of the new list using the length method, which is now the total number of words in the content.

After, it determines the number of chunks needed based on a floor division of the total words by the ideal size of a chunk. The document is split into ideal-sized chunks, and each chunk is added to the chunkList with an ID that combines the original document's ID and the chunk number.

Finally, the splitToChunks method returns the chunkList. The splitToChunks method ensures that each chunk is approximately of the ideal size, but the actual size can vary depending on the word boundaries.



class OpenAIIndexingServicesImpl extends OpenAIIndexingServices {

  // loadCSV

  @override
  Future<List<List<String>>> splitToChunks(List<List<dynamic>> csvDoc) async {
    List<List<String>> chunkList = [];
    for (var i = 1; i < csvDoc.length; i++) {
      String content = csvDoc\[i\][1];
      int start = 0;
      int idealTokenSize = 512;
      int idealSize = (idealTokenSize ~/ (4 / 3)).toInt();
      int end = idealSize;
      List<String> words = content.split(" ");
      words = words.where((word) => word != " ").toList();
      int totalWords = words.length;
      int chunks = totalWords ~/ idealSize;
      if (totalWords % idealSize != 0) {
        chunks += 1;
      }
      List<String> newContent = [];
      for (int j = 0; j < chunks; j++) {
        if (end > totalWords) {
          end = totalWords;
        }
        newContent = words.sublist(start, end);
        String newContentString = newContent.join(" ");
        String id = csvDoc\[i\][0];
        chunkList.add(["${id}_$j", newContentString]);
        start += idealSize;
        end += idealSize;
      }
    }
    return chunkList;
  }
}
void debugPrint(String message) {
  if (kDebugMode) {
    print(message);
  }
}

To embed the chunkList, implement the getEmbeddings method and pass the chunkList as an input. Here is the code below:



class OpenAIIndexingServicesImpl extends OpenAIIndexingServices {
  late final http.Client client;
  OpenAIIndexingServicesImpl({
    required this.client,
  });

  // loadCSV

  // splitToChunks

  @override
  Future<List<List<double>>> getEmbeddings(List<List<String>> chunks) async {
    debugPrint("Embedding ....");
    List<List<double>> embeddedDoc = [];
    for (var chunk in chunks) {
      final response = await client.post(
        Uri.parse("https://api.openai.com/v1/embeddings"),
        headers: {
          'Content-Type': 'application/json',
          'Authorization': 'Bearer ${dotenv.env['OPENAI_API_KEY']!}',
        },
        body: jsonEncode({
          'model': 'text-embedding-ada-002',
          'input': chunk[0],
        }),
      );
      if (response.statusCode == 200) {
        final Map<String, dynamic> responseData = jsonDecode(response.body);
        List<dynamic> embedding = responseData\['data'\][0]['embedding'];
        List<double> embeddingdouble = embedding.map((item) {
          if (item is double) {
            return item;
          } else {
            throw const FormatException('Invalid format');
          }
        }).toList();
        embeddedDoc.add(embeddingdouble);
      } else {
        throw response.body;
      }
    }
    debugPrint("Embedding complete....");
    return embeddedDoc;
  }
}
void debugPrint(String message) {
  if (kDebugMode) {
    print(message);
  }
}

The getEmbeddings method takes the chunkList list — a list of strings as an input. It will iterate over every list item within the chunkList list and perform a specific action to create embeddings for them.

To achieve this, start by initializing an empty list of lists of doubles embeddedDoc to store the embeddings. Then, use a for loop to iterate over each list item in the chunkList list while performing the following actions on each iteration:

Using the http package, make a POST request to the OpenAI API to get the embedding of a list item from the chunkList list.
If the response status is 200, decode the response body to get the embedding, convert it to a list of doubles, and add it to embeddedDoc.
If the response status is not 200, throw an exception with the response body as the message.

Finally, let the getEmbeddings method return embeddedDoc, a list of embeddings where each embedding is a list of doubles.

Store
Now, you have a couple of data (chunkList and embeddedDoc) that need storing in someplace, and that is where your Neon database comes in. To store your embedding (embeddedDoc) and chunks (chunkList), you must create a database table and enable the pgvector's vector extension in the database. The vector extension is handy as it lets you search using a vector cosine's similarity index when querying the database later.

Thus, start by updating the OpenAIIndexingServices abstraction in the lib/home/controller/openai_indexing_services.dart file with the code below:



abstract classOpenAIIndexingServices {
  // the abstraction above
  Future<bool> checkExtExist();
  Future<bool> checkTableExist(String tableName);
  Future<String> createNeonVecorExt();
  Future<String> createNeonTable(String tableName);
  Future<String> deleteNeonTableRows(String tableName);
  Future<void> storeDoument(List<List<String>> chunks,
      List<List<double>> embeddedVectors, String tableName);
}

Remember, Neon allows database management using SQL commands — equally, you can execute these SQL commands programmatically in Flutter using the Postgres package.

Thus, for the implementation in the lib/home/controller/openai_indexing_services_impl.dart, the checkExtExist, and checkTableExist methods are condition checkers that use the SELECT statement to confirm if the vector extension and a particular table exist within the database. They return boolean values dependent on the result obtained from the request.

Next, use the createNeonVecorExt method to create a 'vector' extension in the Neon Postgres database. Also, the createNeonTable method creates a specified table in the Neon Postgres database with id, metadata, and embedding fields. It also creates an index on the embedding field using the ivfflat algorithm and vector_cosine_ops operator class. The deleteNeonTableRows method truncates all rows from a specified table in the Neon Postgre database.

Finally, the storeDoument method stores the data in the Neon database table created earlier by iterating over the chunkList and embeddedDoc, inserting the id, metadata, and embedding into the table.

The metadata field is a map containing pageContent and txtPath keys, which are the IDs in the chunkList, while the embedding field is a string representation of an array of doubles.



class OpenAIIndexingServicesImpl extends OpenAIIndexingServices {
  late final http.Client client;
  late final Connection connection;
  OpenAIIndexingServicesImpl({
    required this.client,
    required this.connection,
  });

  // loadCSV

  // splittoChunks

  // getEmbeddings

  @override
  Future<bool> checkExtExist() async {
    final checkExtExist = await connection.execute(
      "SELECT EXISTS (SELECT FROM pg_extension WHERE extname = 'vector');",
    );
    return checkExtExist.first[0] as bool;
  }

  @override
  Future<bool> checkTableExist(String tableName) async {
    final checkTableExist = await connection.execute(
      "SELECT EXISTS (SELECT FROM information_schema.tables WHERE  table_schema = 'public' AND table_name = '$tableName');",
    );
    return checkTableExist.first[0] as bool;
  }

  @override
  Future<String> createNeonVecorExt() async {
    debugPrint("Creating pgvector extension ...");
    await connection.execute("CREATE EXTENSION vector;");
    return "Vector extension created Successfully";
  }

  @override
  Future<String> createNeonTable(String tableName) async {
    debugPrint("Creating the $tableName table ... ");
    await connection.execute(
      "CREATE TABLE $tableName (id text, metadata text, embedding vector(1536));",
    );
    debugPrint("Indexing the $tableName using the ivfflat vector cosine");
    await connection.execute(
        'CREATE INDEX ON $tableName USING ivfflat (embedding vector_cosine_ops) WITH (lists = 24);');
    return "Table created successfully";
  }
  @override
  Future<String> deleteNeonTableRows(String tableName) async {
    debugPrint("Deleting tableRows");
    await connection.execute("TRUNCATE $tableName;");
    return "Table rows deleted successfuly";
  }

  @override
  Future<void> storeDoument(
    List<List<String>> chunks,
    List<List<double>> embeddedVectors,
    String tableName,
  ) async {
    debugPrint("Storing data... ");
    await connection.runTx((s) async {
      for (int i = 0; i < chunks.length; i++) {
        final chunk = chunks[i];
        final embeddingArray = embeddedVectors[i];
        await s.execute(
            Sql.named(
              'INSERT INTO $tableName (id, metadata, embedding) VALUES (@id, @metadata, @embedding)',
            ),
            parameters: {
              'id': chunk[0],
              'metadata': {
                'pageContent': chunk[1],
                'txtPath': chunk[0],
              },
              'embedding': '$embeddingArray',
            });
      }
    });
    debugPrint("Data stored!!!");
  }
}
void debugPrint(String message) {
  if (kDebugMode) {
    print(message);
  }
}

With this, you have successfully indexed the data; the next step is to retrieve it!

Retrieval

Retrieval is a streamlined component commonly divided into two processes:

Retrieve: This is done by comparing the vector embedding of a user query with the closest available result present in the database. We perform this comparison using the cosine similarity search to compare a vector with another. Thus, when you get the closest results, you can use it for the second process.
Generate: After getting the closest result, you can use it as an LLM assistance to generate responses based on that information.

To do this, head to the OpenAIIndexingServices abstract class in the lib/home/controller/openai_indexing_services.dart and update it with the code below:



abstract class OpenAIIndexingServices {
  // do something
  Future<String> queryNeonTable(String tableName, String query);
}

The method above returns a string response by following the retrieval process above.

Here is the code for the implementation of the queryNeonTable method below:

openai_indexing_service_impl.dart

For the OpenAIIndexingServicesImpl class in the code above, you must define a method getCompletionFromMessages that:

Makes a POST request to the OpenAI API to get a completion from a list of messages.
If the response status is 200, it decodes the response body to get the completed content.
If the response status is not 200, it throws an exception with the response body as the message.

Next, define a method getQueryEmbeddings that:

Makes a POST request to the OpenAI API to get the embedding of a query.
If the response status is 200, it decodes the response body to get the embedding, converts it to a list of doubles, and returns it.
If the response status is not 200, it throws an exception with the response body as the message.

Finally, use the queryNeonTable method to get the embedding of a query by calling the getQueryEmbeddings method. Afterward, execute an SQL query on the connection using Postgres to get similar items from a specified table. After getting a list of similar items from the table, convert the result into a list of Metadata objects.

Add a condition to check if Metadata is not empty, and if valid, concatenate the page content, make a list of messages, get a completion from the messages, and return it. If Metadata is empty, it returns a default message: "Couldn't find anything on that topic."

Here is the Metadata objects model class below:



class Metadata {
  String pageContent;
  String txtPath;
  Metadata({
    required this.pageContent,
    required this.txtPath,
  });
  factory Metadata.fromJson(Map<String, dynamic> json) => Metadata(
        pageContent: json["pageContent"],
        txtPath: json["txtPath"],
      );
  Map<String, dynamic> toJson() => {
        "pageContent": pageContent,
        "txtPath": txtPath,
      };
}

Note: Save it in the lib/home/model/ directory

Handling the abstraction, state management, and dependency injection

Though you have created and implemented functionalities that encapsulate the idea of this project, it is still all business logic. This means there is no connection between the business logic and the presentation layer (UI). To implement this connection, there will need to be a "handshake" between these layers, and you can do this by creating a class that will have direct contact with the presentation layer.

To begin, create a new directory called view_model in the lib/home/ folder, this folder will have two files to handle indexing and retrieval, respectively. Create a file in the lib/home/view_model/ directory called index_notifier.dart. This file will cater to the indexing processes established earlier and the application's state involved in the process.

The project will explore a simple management solution called ValueNotifiersfor state management. This is mainly down to the fact that you aren't dealing with a complex.
UI and might be listening to single values throughout the project.

So, start by creating an enum class within the lib/home/view_model/index_notifier.dart file with initial, loading, loaded, and error values to track the application’s state. Then, create an IndexNotifier class and instantiate the OpenAIIndexingServices object from the lib/home/controller/openai_indexing_services.dart. Make sure to pass in the OpenAIIndexingServices object to the constructor.

Next, define a ValueNotifier, for indexState to track the state of the indexing process, followed by creating a method indexingNeonTable. The indexingNeonTable does the following:

Sets indexState to loading.
Loads a CSV document using openAIIndexingServices.
Splits the document into chunks and gets their embeddings.
Check if the 'vector' extension exists in the database and create it if not.
Check if a specified table exists in the database and create it if not. If the table already exists, it deletes all rows from the table.
Stores the chunks and their embeddings in the table.
Sets indexState to loaded.
Handles errors by setting indexState to error and rethrowing the exception.
After a delay, set indexState to initial.

Here is the code in the index_notifier.dart below:



enum IndexState { initial, loading, loaded, error }

class IndexNotifier {
  late final OpenAIIndexingServices openAIIndexingServices;
  IndexNotifier({required this.openAIIndexingServices});

  final _indexState = ValueNotifier<IndexState>(IndexState.initial);
  ValueNotifier<IndexState> get indexState => _indexState;

  indexingNeonTable() async {
    try {
      _indexState.value = IndexState.loading;
      final csvDoc = await openAIIndexingServices.loadCSV();
      final chunks = await openAIIndexingServices.splitToChunks(csvDoc);
      final embeddedDocs = await openAIIndexingServices.getEmbeddings(chunks);

      if (!(await openAIIndexingServices.checkExtExist())) {
        await openAIIndexingServices.createNeonVecorExt();
      }
      if (!(await openAIIndexingServices
          .checkTableExist(ServiceConfigurations.openAITable))) {
        await openAIIndexingServices
            .createNeonTable(ServiceConfigurations.openAITable);
      } else {
        await openAIIndexingServices
            .deleteNeonTableRows(ServiceConfigurations.openAITable);
      }

      await openAIIndexingServices.storeDoument(
        chunks,
        embeddedDocs,
        ServiceConfigurations.openAITable,
      );
      _indexState.value = IndexState.loaded;
    } catch (e) {
      _indexState.value = IndexState.error;
      rethrow;
    } finally {
      await Future.delayed(const Duration(milliseconds: 2000));
      _indexState.value = IndexState.initial;
    }
  }
}

Create another file within the lib/home/view_model called query_notifier.dartfor the second file. This file does similarly to the index_notifier.dart file.

Begin by creating a Message model class with String values query and response and pass them into the constructor. Also, create an enum class within the file with values initial, loading, loaded, error to track the state of the application when a user queries the LLM.

Create a QueryNotifier class that initializes the OpenAIIndexingServices object from the lib/home/controller/openai_indexing_services.dart, followed by defining a ValueNotifier for _queryandResponseState to track the state of the query and response.

After, define a ValueNotifier for queryState to track the state of the query. Follow this by creating a method queryIngeonTable that does the following:

Adds a new Message to _messages.
Sets queryState to loading.
Calls queryNeonTable method of openAIIndexingServices to get a response.
Updates the last message's response and sets queryState to loaded.
Handles errors by setting queryState to error, then back to initial after a delay.



class QueryandResponse {
  String? query;
  String? response;
  QueryandResponse({required this.query, this.response = ""});
}
enum QueryState {
  initial,
  loading,
  loaded,
  error,
}
class QueryNotifier {
  late final OpenAIIndexingServices openAIIndexingServices;
  QueryNotifier({required this.openAIIndexingServices});
  final List<QueryandResponse> _queryandResponse = [];
  final _queryandResponseState = ValueNotifier<List<QueryandResponse>>([]);
  ValueNotifier<List<QueryandResponse>> get queryandResponseState =>
      _queryandResponseState;

  final _queryState = ValueNotifier<QueryState>(QueryState.initial);
  ValueNotifier<QueryState> get queryState => _queryState;

  queryIngeonTable(String query) async {
    try {
      _queryandResponse.add(QueryandResponse(query: query, response: ""));

      _queryandResponseState.value = List.from(_queryandResponse);
      _queryState.value = QueryState.loading;

      String response = await openAIIndexingServices.queryNeonTable(
          ServiceConfigurations.openAITable, query);

      final List<QueryandResponse> updatedMessages =
          List.from(_queryandResponse);
      updatedMessages.last.response = response;

      _queryandResponseState.value = updatedMessages;
    } catch (e) {
      // Handle errors if necessary
      print(e);
      _queryState.value = QueryState.error;
      await Future.delayed(const Duration(milliseconds: 2000));
      _queryState.value = QueryState.initial;
    }
  }
}

With that, a "handshake" is possible but not yet established, as the Flutter application is unaware of a few things, such as where/what the Postgres connection is established. This is because the Postgres connection is a required field of the OpenAIIndexingServicesImpl class in the lib/home/controller/openai_indexing_services_impl.dart file.

Thus, the next section will clarify how to mount dependencies on the widget tree and connect Flutter to the neon database.

Note: Check out the lib/core/dependency_injection/provider_locator.dart file to see how to inject the objects as a dependency using the provider package.

Connecting Neon database to Flutter application and dependency injection
You used the Postgres package to execute SQL commands during this project. The Postgres package also connects the Flutter app to the Neon database to facilitate the execution of those commands declared in your code. The code containing the connection process is in the provider_locator.dart file from the template.

Here is the code below:



class ProviderLocator {
  // provider tree
  static Future<MultiProvider> getProvider(Widget child) async {
    final openAIIndexingService = await _createOpenAIIndexingServices();
    return MultiProvider(
      providers: [
        Provider<OpenAIIndexingServices>.value(value: openAIIndexingService),
        Provider(
          create: (_) => IndexNotifier(
            openAIIndexingServices: openAIIndexingService,
          ),
        ),
        Provider(
            create: (_) =>
                QueryNotifier(openAIIndexingServices: openAIIndexingService))
      ],
      child: child,
    );
  }
  static Future<OpenAIIndexingServices> _createOpenAIIndexingServices() async {
    final connection = await createPostgresConnection();
    final client = await _createHtttpClient();
    return OpenAIIndexingServicesImpl(connection: connection, client: client);
  }
  // postgres connection
  static Future<Connection> createPostgresConnection() async {
    const maxRetries = 3;
    for (var retry = 0; retry < maxRetries; retry++) {
      try {
        final endpoint = Endpoint(
          host: dotenv.env['PGHOST']!,
          database: dotenv.env['PGDATABASE']!,
          port: 5432,
          username: dotenv.env['PGUSER']!,
          password: dotenv.env['PGPASSWORD']!,
        );
        final connection = await Connection.open(
          endpoint,
          settings: ConnectionSettings(
            sslMode: SslMode.verifyFull,
            connectTimeout: const Duration(milliseconds: 20000),
          ),
        );
        if (connection.isOpen) {
          if (kDebugMode) {
            print("Connection Established!");
          }
          return connection;
        }
      } catch (e) {
        if (kDebugMode) {
          print('Error creating PostgreSQL connection: $e');
        }
      }
      await Future.delayed(const Duration(seconds: 2));
    }
    // If maxRetries is reached and the connection is still not open, throw an exception
    throw Exception(
        'Failed to establish a PostgreSQL connection after $maxRetries retries');
  }
  static Future<http.Client> _createHtttpClient() async {
    try {
      return http.Client();
    } catch (e) {
      rethrow;
    }
  }
}

In the code above, the ProviderLocator class in the code does the following:

Defines a method getProvider that:
- Creates an OpenAIIndexingServices instance.
- Returns a MultiProvider with an OpenAIIndexingServices provider and two other providers for IndexNotifier and QueryNotifier.
Defines a method _createOpenAIIndexingServices that:
- Creates a PostgreSQL connection using the connection details from the Neon database.
- Creates an HTTP client.
- Returns an OpenAIIndexingServicesImpl instance with the created connection and client.
Defines a method createPostgresConnection that:
- Tries to establish a PostgreSQL connection with specified settings.
- If the connection fails, it retries up to a maximum number of times.
- If the connection is not established after maximum retries, it throws an exception.
Defines a method _createHtttpClient that returns an HTTP client.

With that, you have successfully mounted the database on the widget tree and established a connection between the Neon Postgres database and Flutter. When you run the application, you should have your results as shown in the video below:

rag with openai | Opentape

Muyiwa Femi-Ige - Feb 21st, 10:32am

app.opentape.io

Conclusion

This tutorial journeyed into the world of RAG and its implementation in mobile applications using the Flutter framework, Neon Postgres, and OpenAI. It explored how RAG enhances the response capabilities of LLMs by providing them with an external knowledge source.

You also learned about Neon and its capabilities with the pgvector extension. If you’ve gotten this far, you’re now equipped with the knowledge to build more intelligent and responsive applications using RAG, opening up new possibilities in AI.

If this piqued your curiosity, let's contribute to advancing AI, making it more accessible, efficient, and beneficial for all. Happy coding!

Resources

Here are some resources that will guide you more in this journey:

Top comments (1)

Pro Seo • Nov 27 '25 • Edited

This tutorial clearly explains how RAG enhances LLM responses by combining external data sources with technologies like Neon Postgres, pgvector, and Flutter—an approach similar to retrieval-based AI methods described on Wikipedia for improving factual accuracy. It shows how vector embeddings, serverless Postgres, and a Flutter-built chatbot work together to deliver precise, data-grounded outputs. Users exploring advanced mobile customization tools, such as magisk, may find this structured integration helpful in understanding how modular systems expand app capabilities.