raphiki for Technology at Worldline


Open Source AI: Developing a Hugging Face ChatBot from scratch

After explaining in my previous article how to create a ChatBot with LibreChat and VertexAI, and wrapping up my series on How open is Generative AI?, I feel compelled to share this concise tutorial on setting up a ChatBot using only open source components, including the model.

Hugging Face is an ideal starting point when considering open source models.

Introducing Hugging Face

Hugging Face logo

Hugging Face is an open-source AI startup that focuses on developing and providing state-of-the-art natural language processing (NLP) models and APIs for various applications. Its primary goal is to make NLP more accessible and user-friendly for developers, researchers, and businesses by offering pre-trained models, libraries, and easy-to-use interfaces.

The organization shares several open source models and libraries, in addition to offering cloud-based API services through their Model Hub. This allows users to deploy and use pre-trained models without worrying about infrastructure or deployment issues.

It functions as a collaborative platform where the AI community can share and reuse models, datasets, and code, resembling the "GitHub for AI."

Let's start by creating a free account and obtaining an associated access token to use Hugging Face APIs.

Access token creation
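Rather than pasting the token directly into source code, it is safer to read it from an environment variable. A minimal Node.js sketch (the variable name `HF_ACCESS_TOKEN` matches what Chat UI uses later; the helper functions are my own):

```javascript
// Read the Hugging Face access token from an environment variable instead of
// hard-coding it in source files.
function getAccessToken() {
  const token = process.env.HF_ACCESS_TOKEN;
  if (!token) {
    throw new Error("HF_ACCESS_TOKEN is not set");
  }
  return token;
}

// Build the Authorization header expected by the Hugging Face APIs.
function authHeader() {
  return { Authorization: `Bearer ${getAccessToken()}` };
}
```

Export the variable once (`export HF_ACCESS_TOKEN=hf_...`) and every script can reuse it without embedding the secret.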

Selecting the model

So let's select a model to use for our ChatBot by consulting the Model Hub...

Model Hub

Well, there are almost 400,000 models to choose from! This is a testament to the dynamism of the AI community.

I chose Open-Assistant SFT-4 12B because it's a well-known fine-tuned model based on a foundation model by EleutherAI, it's licensed under Apache 2.0, and I appreciate the crowdsourced nature of the Open Assistant project behind it.

Open-Assistant SFT-4 12B

Accessing the Model with a simple API Call

Before integrating the Chatbot UI, we should ensure we can access the model with a simple Node.js API call. Hugging Face conveniently provides the code after selecting the Inference API in the Deploy menu.

Deploy menu

We choose the JavaScript API, enable the Show API Token option, and copy the provided code.


async function query(data) {
    const response = await fetch("https://api-inference.huggingface.co/models/OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5",
        {
            headers: { Authorization: "Bearer <your-token-here>", "Content-Type": "application/json" },
            method: "POST",
            body: JSON.stringify(data),
        }
    );
    return await response.json();
}

const input = { "inputs": "What is Kriya Yoga?" };

query(input).then((response) => {
    console.log(JSON.stringify(response));
});

I added the "Content-Type": "application/json" header to properly manage the response. Here's the result when executed by Node.js:

[{"generated_text":"What is Kriya Yoga?\n\nKriya Yoga is a spiritual practice that involves a set of techniques designed to help individuals"}]

The API and the model responded to my query by completing the text I provided. Note that the response is truncated and may require multiple calls to complete, but it suffices for our test. Now, let's focus on the UI.
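As an aside on the truncation: besides `inputs`, the text-generation Inference API accepts a `parameters` object, and raising `max_new_tokens` requests a longer completion. The small helper below is my own addition, not part of the generated snippet:

```javascript
// Build a request body for the Inference API: "inputs" carries the prompt,
// while "parameters" tunes generation. Raising max_new_tokens asks the model
// for a longer completion, so the answer is less likely to be cut off.
function buildPayload(prompt, maxNewTokens = 250) {
  return {
    inputs: prompt,
    parameters: { max_new_tokens: maxNewTokens },
  };
}

// Used with the query() function from the previous snippet:
// query(buildPayload("What is Kriya Yoga?")).then(console.log);
```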

Chat UI

Hugging Face also offers the HuggingChat application, allowing anyone to interact with some of the community's models.

The source code is available under the Apache 2.0 license on GitHub. It's a Svelte application that also uses a MongoDB database to store chat history.

Let's install and configure it:

git clone https://github.com/huggingface/chat-ui.git
cd chat-ui

We then start a MongoDB database in a Docker container:

docker run -d -p 27017:27017 --name mongo-chatui mongo:latest

Next, we configure our API key and the MongoDB URL in a .env file:

vi .env
MONGODB_URL=mongodb://localhost:27017
HF_ACCESS_TOKEN=<your-token-here>

After installing Node.js dependencies, we start the application:

npm install
npm run dev

Application start

The Chatbot is now up and running, accessible at http://localhost:5173/.

Chat UI home page

We notice the default model in Chat UI is OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5 (that might be the real reason why I selected it earlier...). The conversation history is stored, and responses are not truncated. A dark theme is available as well.

Chat UI dark mode

Further customization is possible by modifying the .env.local file to change the model and other settings. For instance:

MONGODB_URL=mongodb://localhost:27017
HF_ACCESS_TOKEN=<your-token-here>
PUBLIC_ANNOUNCEMENT_BANNERS=
PUBLIC_APP_NAME=My ChatBot
PUBLIC_APP_COLOR=emerald
MODELS=`[
  {
    "name": "HuggingFaceH4/zephyr-7b-beta",
    "datasetName": "HuggingFaceH4/ultrachat",
    "description": "A 7B parameter GPT-like model fine-tuned on a mix of publicly available, synthetic datasets",
    "websiteUrl": "https://huggingface.co/HuggingFaceH4/zephyr-7b-beta",
    "userMessageToken": "<|prompter|>",
    "assistantMessageToken": "<|assistant|>",
    "messageEndToken": "</s>",
    "preprompt": "Below are a series of dialogues between various people and an AI assistant. The AI tries to be helpful, polite, honest, sophisticated, emotionally aware, and humble-but-knowledgeable. The assistant is happy to help with almost anything, and will do its best to understand exactly what is needed. It also tries to avoid giving false or misleading information, and it caveats when it isn't entirely sure about the right answer. That said, the assistant is practical and really does its best, and doesn't let caution get too much in the way of being useful.\n-----\n",
    "promptExamples": [
      {
        "title": "Write an email from bullet list",
        "prompt": "As a restaurant owner, write a professional email to the supplier to get these products every week: \n\n- Wine (x10)\n- Eggs (x24)\n- Bread (x12)"
      }, {
        "title": "Code a snake game",
        "prompt": "Code a basic snake game in python, give explanations for each step."
      }, {
        "title": "Assist in a task",
        "prompt": "How do I make a delicious lemon cheesecake?"
      }
    ],
    "parameters": {
      "temperature": 0.9,
      "top_p": 0.95,
      "repetition_penalty": 1.2,
      "top_k": 50,
      "truncate": 1000,
      "max_new_tokens": 1024
    }
  }
]`

With these changes, we have now personalized our ChatBot's model and behavior. This time I chose the brand-new zephyr-7b-beta model released by Hugging Face.
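The `userMessageToken`, `assistantMessageToken`, and `messageEndToken` entries in the configuration above control how Chat UI flattens a conversation into a single prompt string for the model. A simplified reconstruction of that idea (my own sketch, not Chat UI's actual code):

```javascript
// Special tokens taken from the MODELS configuration above.
const template = {
  userMessageToken: "<|prompter|>",
  assistantMessageToken: "<|assistant|>",
  messageEndToken: "</s>",
  preprompt: "Below are a series of dialogues...\n-----\n",
};

// Each turn is wrapped in its role token and closed with the end token;
// the prompt ends with the assistant token so the model continues from there.
function buildPrompt(messages, t = template) {
  const body = messages
    .map((m) =>
      (m.from === "user" ? t.userMessageToken : t.assistantMessageToken) +
      m.content +
      t.messageEndToken
    )
    .join("");
  return t.preprompt + body + t.assistantMessageToken;
}
```

For example, a single user turn "Hi" becomes the preprompt followed by `<|prompter|>Hi</s><|assistant|>`, which is why these tokens must match what the model saw during fine-tuning.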

Let's stop and restart our Web application.

Chat UI using Zephyr

We are now using the Zephyr model, and the previous chat history is still available. As a bonus, the model itself provides an explanation of the DPO method used to train Zephyr-7B-β.

For production, I would recommend deploying the models on dedicated, paid Inference Endpoints, hosted on infrastructure fully managed by Hugging Face.

Final Thoughts

Building a fully open source ChatBot with Hugging Face components is not only feasible but also a testament to the vibrant open source community and accessible AI technology. By leveraging models from the Model Hub and using the Chat UI, anyone can create a sophisticated and customizable ChatBot.

The future is indeed open, and it's fascinating to see how these tools democratize access to advanced NLP capabilities. The open source DNA of sharing and collaboration is what drives innovation forward, and Hugging Face is at the forefront of this movement.

Feel free to experiment with different models and configurations to suit your specific needs. The possibilities are vast, and the only limit is your imagination.

Stay tuned, and keep exploring the exciting world of Generative AI!

Top comments (1)

Kevin

Developing a ChatBot with Hugging Face is amazing! Thanks for the examples - they were clear and concise and I would love to add a few insights.

The Transformers library is a treasure trove for anyone delving into tasks like text classification, question-answering, summarization, translation, and beyond. Hugging Face doesn’t just supply the tools; it offers the means to innovate and push the boundaries of what’s possible in NLP.

There are three key features Hugging Face offers that simplify the process of working with ML data: Datasets, Models, and Spaces.

Pre-trained models can be used to perform many different tasks, such as:

  • Text: text classification, Q&A, summarization, text generation, sentiment analysis, and translation in many different languages
  • Image: image classification, image-to-text, image segmentation, and image detection
  • Audio: audio classification, text-to-speech, and speech recognition

So for anyone learning how to use Hugging Face, I recommend this article from my partner Nicolas Azevedo, which provides some good examples: scalablepath.com/machine-learning/... It also includes a nice example of Hugging Face used in the e-commerce industry.