DEV Community

Tech at Golem.ai

RAG implementation test

Why are we doing this?
Our system enables users to aggregate textual information to develop a data model for each project (the context), which works with our artificial intelligence technology. We believe that an approach based on artificial intelligence, specifically large language models (LLMs), could expedite and enhance this process. Consequently, our tool could suggest model enhancements based on its test sets and our documentation.

Objective
• Test a RAG (retrieval-augmented generation) implementation on top of an LLM to learn more about this trending approach.
• Verify the potential uses of this technology for Golem.ai.
• Test Mixtral.

I won't go into the details of how the approach can be improved, what the risks are, and so on. In short, RAG is a response to the problem of hallucinations and to the need to access local data without retraining the model. It also avoids exposing private data through the LLM itself: because the data lives in a separate database, access can be restricted with user-specific rights without changing the model. The main challenges are the embedding phase, the selection of the document pieces passed to the final prompt, and consequently the quality of the data you put into your vector database.
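To make the retrieval step more concrete, here is a minimal sketch in Python. It is not our implementation: the bag-of-words `embed` function stands in for a real embedding model, the `CHUNKS` list stands in for a vector database, and the user names and `allowed` field are hypothetical. It only illustrates the idea of filtering chunks by user rights, ranking them by similarity to the question, and injecting the best ones into the final prompt.

```python
import math
import re
from collections import Counter

# Toy stand-in for a vector database. Each chunk lists the (hypothetical)
# users allowed to read it; a real setup would store this as metadata.
CHUNKS = [
    {"text": "Henry Cavill will be executive producer of the Warhammer 40k series.",
     "allowed": {"alice", "bob"}},
    {"text": "Farmers plan to block the A13 highway this afternoon.",
     "allowed": {"alice"}},
    {"text": "The cafeteria menu changes every Monday.",
     "allowed": {"alice", "bob"}},
]


def embed(text):
    """Toy embedding (assumption): word counts instead of a real model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, user, k=1):
    """Return the k most similar chunks that `user` is allowed to read."""
    query = embed(question)
    visible = [c for c in CHUNKS if user in c["allowed"]]
    ranked = sorted(visible, key=lambda c: cosine(query, embed(c["text"])),
                    reverse=True)
    return [c["text"] for c in ranked[:k]]


def build_prompt(question, user, k=1):
    """Assemble the final prompt from the retrieved context and the question."""
    context = "\n".join(retrieve(question, user, k))
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
```

The rights filter runs before the ranking, so a chunk the user cannot read never reaches the prompt, whatever its similarity score. The quality of `embed` and of the stored chunks decides what the LLM sees, which is why those steps are the main challenge.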

Presentation
To run this test, we followed this tutorial, as we wanted to try Ollama, an open-source tool for running LLMs on-premises, with the Mixtral model.

The Qdrant database (among others) was recommended in the tutorial.

We conducted our tests on a basic Scaleway machine (no GPU, but 64 GB of RAM).

We added the content of 2 news articles to test the implementation.
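As a rough illustration of how the "without assistance" and "with RAG" answers below can be obtained, the following sketch sends a prompt to a local Ollama server through its REST API. It assumes Ollama is listening on its default address (`http://localhost:11434`) and that the `mixtral` model has already been pulled. The helper names (`make_prompt`, `build_request`, `ask`) and the prompt wording are ours for this example, not taken from the tutorial.

```python
import json
import urllib.request

# Default local Ollama endpoint (assumption about the setup).
OLLAMA_URL = "http://localhost:11434/api/generate"


def make_prompt(question, context_chunks=None):
    """Return the bare question, or the question preceded by retrieved context."""
    if not context_chunks:
        return question
    context = "\n\n".join(context_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


def build_request(prompt, model="mixtral"):
    """Build the HTTP request for a single, non-streamed completion."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, model="mixtral", timeout=600):
    """Send the prompt to Ollama and return the generated text.

    The generous timeout reflects slow CPU-only inference.
    """
    with urllib.request.urlopen(build_request(prompt, model), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]


# Usage (requires a running Ollama server, so it is left commented out):
# question = "Who will produce the 40k series"
# print(ask(make_prompt(question)))                        # without assistance
# print(ask(make_prompt(question, ["<article text>"])))    # with retrieved context
```

The only difference between the two calls is the prompt: the model and its weights are the same, and the retrieved article text is simply placed in front of the question.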

Question 1: "Who will produce the 40k series"

Response without assistance: "I'm sorry, I'm not up to date with the latest developments in the Warhammer 40,000 universe. Amazon announced in 2021 that they would produce a TV series based on Warhammer 40,000, but I don't have more recent detailed information on who will be responsible for the production. It would be best to check the latest news and updates from official sources or entertainment industry-focused media."

Response with our RAG: “Henry Cavill will serve as an executive producer for the Warhammer 40k series. The context does not provide information about other producers being involved in the project.”

Question 2: “Give me the list of highways blocked by farmers”

Response without assistance: “I don't have real-time access to information. However, according to news on March 18, 2023, farmers in France blocked several roadways, including some highways, to protest against the government's pension reform project. The affected highways are mainly in the West and North of France. Here is a non-exhaustive list of blocked or disrupted highways:

• A10 near Orléans (Loiret)
• A13 near Rouen (Seine-Maritime)
• A11 between Angers and Le Mans (Maine-et-Loire and Sarthe)
• A84 between Caen and Rennes (Calvados and Ille-et-Vilaine)
• A630/A63 in Bordeaux (Gironde)

These blockages can change rapidly, so it is recommended to check real-time traffic conditions on specialized websites or with local authorities.”

Response with our RAG: “The following highways are planned to be blocked by farmers between 14:00 and midnight:

• A6 at Villabé (Essonne)
• Toll booth Buchelay on the A13 (Yvelines)
• RN14 at La Villeuneuve-Saint-Martin (Val-d’Oise)
• A16 at Champagne-sur-Oise (Val-d’Oise)
• Toll booth de Saint-Arnoult on the A10-A11 (Yvelines), starting from 15:00.

Additionally, a group of farmers has been blocking the A13 in the Eure since 10:00, planning to keep it blocked in both directions between the exits 16 (Douains) and 17 (Gaillon) until 17:30-22:00. They have not ruled out the possibility of blocking access to Paris depending on the government's announcements.”

Conclusion
The processing times are too long for a production deployment. We should repeat this exercise on much more powerful machines. The slowness clearly comes from the LLM itself, not from the RAG implementation.
