raphiki for Technology at Worldline


From Local AI to Enterprise-grade deployment with BionicGPT

Welcome to the second article of my Chat with Your Content series. Today, we delve into BionicGPT, a project that could just as easily feature in the Bringing AI Home series, since it can be deployed locally on a desktop system or hosted in the cloud.

Introducing BionicGPT

BionicGPT is a brand-new open-source project, available on GitHub and licensed under Apache 2 and MIT. It is backed by a company that offers support and consulting services.

BionicGPT Logo

Developed in Rust, BionicGPT promises safety and performance. Despite being relatively new, its robust architecture and industrial-grade features stand out. It builds upon the Rust on Nails framework, initially created at Airbus, which offers scalability via Kubernetes and embraces an infrastructure-as-code approach.

Below is a glimpse of BionicGPT's architecture, adapted from Rust on Nails, as seen on their official website:

BionicGPT Architecture

The system integrates multiple open source components within Docker containers, forming a Large Language Model pipeline, complete with a custom user interface. It utilizes LocalAI for local LLM deployments and PgVector for vector storage.

Setting Up on My Laptop

BionicGPT's documentation indicates compatibility with 16GB RAM laptops, matching my setup. The installation process, streamlined through Docker, involves downloading a compose file and running it.



curl -O https://raw.githubusercontent.com/purton-tech/bionicgpt/main/docker-compose.yml
docker compose up



This step might take some time as it downloads necessary Docker images, including a quantized LLaMA 2 7B default model.



➜  bionic-gpt docker compose up
[+] Running 9/0
 ✔ Container bionic-gpt-db-1              Created
 ✔ Container bionic-gpt-embeddings-api-1  Created
 ✔ Container bionic-gpt-unstructured-1    Created
 ✔ Container bionic-gpt-envoy-1           Created
 ✔ Container bionic-gpt-llm-api-1         Created
 ✔ Container bionic-gpt-migrations-1      Created
 ✔ Container bionic-gpt-barricade-1       Created
 ✔ Container bionic-gpt-embeddings-job-1  Created
 ✔ Container bionic-gpt-app-1             Created



Once the containers are up, open the Web console at http://localhost:7800 and create an admin user.
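
Because the first start can take a while, it can help to wait for the console before opening a browser. Here is a minimal sketch that polls the URL until it answers. The function names, timeout and polling interval are my own choices for illustration and are not part of BionicGPT.

```python
import time
import urllib.error
import urllib.request


def http_probe(url):
    """Return True if the URL answers with any HTTP response."""
    try:
        with urllib.request.urlopen(url, timeout=5):
            return True
    except urllib.error.HTTPError:
        # The server answered, even if with an error status
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout: not ready yet
        return False


def wait_until_ready(url, probe=http_probe, timeout=600.0, interval=5.0,
                     sleep=time.sleep):
    """Poll `url` until `probe` succeeds or `timeout` seconds have passed."""
    deadline = time.monotonic() + timeout
    while True:
        if probe(url):
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval)
```

Calling wait_until_ready("http://localhost:7800") returns True as soon as the console responds, and False if nothing answers within the timeout.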

Navigating the UI

The chat is usable right away, though response times depend on your computer's specifications.

First Chat

With that first test passed, let's step back and understand some key elements of BionicGPT.

The UI allows users to be organized into Teams with varying permissions - System Administrator, Team Administrator, and Team Collaborator.

Team

Within Teams, users can craft prompts, link them to models with custom settings, and associate them with datasets. This enables retrieval-augmented generation (RAG): uploaded documents are indexed, converted into vectors with the default embedding model, BGE Small EN v1.5, and stored in PostgreSQL/PgVector.
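
To make the retrieval step more concrete, here is a toy sketch of the idea in plain Python. It is not BionicGPT's implementation: the real system computes embeddings with a model and runs the similarity search in the database, whereas the 3-dimensional vectors, the chunk texts and the prompt template below are invented for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, chunks, k=2):
    """Return the texts of the k chunks most similar to the query vector."""
    ranked = sorted(
        chunks,
        key=lambda c: cosine_similarity(query_vec, c["vector"]),
        reverse=True,
    )
    return [c["text"] for c in ranked[:k]]


def build_prompt(question, context_chunks):
    """Prepend the retrieved chunks to the user's question."""
    context = "\n".join(f"- {c}" for c in context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Hypothetical chunks with hand-made vectors standing in for real embeddings
CHUNKS = [
    {"text": "The TechSquad is a Worldline initiative.", "vector": [0.9, 0.1, 0.0]},
    {"text": "The cafeteria opens at noon.", "vector": [0.0, 0.2, 0.9]},
    {"text": "The TechSquad is organized into squads.", "vector": [0.8, 0.3, 0.1]},
]
```

With a query vector such as [1.0, 0.0, 0.0], top_k keeps the two TechSquad chunks and drops the unrelated one, and build_prompt turns them into the context sent to the LLM.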

Dataset

I tested this by uploading documents to the TechSquad dataset and creating a prompt named Chat with docs. The resulting contextualized answers in the chat console were impressive.

RAG in action

Leveraging API Endpoints

BionicGPT also enables the creation of API endpoints, which can be used in applications like chatbots. These endpoints are linked to specific prompts and require API keys for access.

API Keys

Using curl, I tested the Chat with docs prompt:



curl http://localhost:7800/v1/chat/completions  \
  -H "Content-Type: application/json"  \
  -H "Authorization: Bearer <API-key-here>" \
  -d '{ "model": "llama-2", "messages": [{"role": "user", "content": "What is the TechSquad?"}] }' 




{
    "id": "cmpl-25c0d2f4-ab74-47f8-a76c-7d9319658e1a",
    "object": "chat.completion",
    "model": "Llama-2",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": " The TechSquad is an initiative at Worldline that aims to empower tech experts within the company to voice their expertise and collaborate with other teams. It is composed of a core team and seven working groups called squads, each focused on a specific area of technology. The TechSquad initiative provides various channels and clubs for employees to share their expertise, contribute to the company's knowledge base, and build relationships within the organization. Its goal is to promote coherence, communication, and alignment within Worldline, while fostering innovation and supporting business functions."
            },
            "finish_reason": "Length"
        }
    ],
    "usage": {
        "prompt_tokens": 0,
        "completion_tokens": 0,
        "total_tokens": 0
    }
}
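
An application consuming this response mostly needs the answer text and the finish reason. Here is a small sketch of how that could look. The helper name and the shortened sample are mine. In the OpenAI format a finish reason of "length" signals that the token limit was reached, and since the response above spells it "Length", the comparison ignores case. The usage counters are all zero in this response, so the sketch does not rely on them.

```python
import json


def extract_answer(raw_json):
    """Return (answer_text, truncated) from an OpenAI-style chat completion."""
    payload = json.loads(raw_json)
    choice = payload["choices"][0]
    # The answer above starts with a space, so trim surrounding whitespace
    answer = choice["message"]["content"].strip()
    # Compare case-insensitively: the response above reports "Length"
    truncated = str(choice.get("finish_reason", "")).lower() == "length"
    return answer, truncated


# Shortened, hypothetical version of the response shown above
SAMPLE = json.dumps({
    "choices": [{
        "index": 0,
        "message": {"role": "assistant",
                    "content": " The TechSquad is an initiative at Worldline."},
        "finish_reason": "Length",
    }]
})
```

Calling extract_answer(SAMPLE) returns the trimmed text together with a flag telling the caller that the model stopped because of the length limit.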



Requiring a model parameter in addition to the API key follows the OpenAI API specification. This allows seamless integration with tools designed for or compatible with OpenAI LLMs, such as Flowise or LibreChat.
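
Since the endpoint follows the OpenAI request format, the curl call above translates directly into application code. Here is a minimal sketch using only the Python standard library. The function names are mine, and the default URL and model name are simply the values from my local test, so adjust them to your own deployment.

```python
import json
import urllib.request


def build_chat_request(api_key, question,
                       base_url="http://localhost:7800", model="llama-2"):
    """Build the same POST request as the curl command above."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": question}],
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def ask(api_key, question, **kwargs):
    """Send the request and return the assistant's reply as text."""
    request = build_chat_request(api_key, question, **kwargs)
    # Local models can be slow, hence the generous timeout
    with urllib.request.urlopen(request, timeout=300) as response:
        payload = json.load(response)
    return payload["choices"][0]["message"]["content"]
```

With the stack running, ask("<API-key-here>", "What is the TechSquad?") should return the same kind of answer as the curl test.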

I further experimented with the lightweight ChatGPT Lite frontend, configuring it to interact with BionicGPT.



git clone https://github.com/blrchen/chatgpt-lite.git
cd chatgpt-lite
npm install
cp .env.example .env.local



Content of the .env.local file:



OPENAI_API_KEY="<API-key-here>"
OPENAI_API_BASE_URL="http://localhost:7800"
OPENAI_MODEL="llama-2"



Chat on http://localhost:3000:
ChatGPT Lite

Other models can be installed, provided they are supported by LocalAI.

Conclusion

BionicGPT, still evolving, shows promise with upcoming features like model fine-tuning and S3 storage for documents. My initial tests on a laptop were successful, and the next steps before production should involve deploying on a Kubernetes infrastructure with added observability tools.
