Very Simple Local AI Chatbot With Transformers

CROUCHING TIGER, HIDDEN DRAGON

  • It all began with my curiosity for PrivateGPT.
  • As it turns out, the real MVPs are 2 libraries - LangChain and Transformers.
  • Here is how to build a super simple local chatbot using Transformers only.

PART 1) REQUIREMENTS

  • Python installed - this tutorial uses pip and virtualenv.
  • An NVIDIA GPU with CUDA support is recommended, since GPTQ models are built to run on the GPU.

PART 2) PROJECT SETUP

  • Create a project folder. E.G. C:\CHATBOT
  • Open terminal, navigate to project folder. cd C:\CHATBOT
  • Create a virtual environment and activate it.
    • virtualenv venv
    • Windows - venv\Scripts\activate
    • Linux/Mac - source venv/bin/activate
  • Install transformers - pip install transformers optimum auto-gptq
  • Install PyTorch.
    • Head over to PyTorch "Get Started Locally" and get the correct pip install command for your platform.
    • Yes, the CPU and GPU builds of PyTorch are different.
    • E.G. For Windows with CUDA support - pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu117
    • A quick sanity check is shown right below this list.
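
To verify that the GPU build is working, run this quick sanity check (check_torch.py is just an example file name):

check_torch.py

# (OPTIONAL) SANITY CHECK - CAN PYTORCH SEE THE GPU?
import torch
print("PyTorch version:", torch.__version__)
print("CUDA available:", torch.cuda.is_available()) # True = GPU build installed
if torch.cuda.is_available():
  print("GPU:", torch.cuda.get_device_name(0))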

PART 3) SCRIPT

simple.py

# (A) LOAD MODULES
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

# (B) MODEL + TOKENIZER
model_name = "TheBloke/Wizard-Vicuna-7B-Uncensored-GPTQ"
model = AutoModelForCausalLM.from_pretrained(
  model_name,
  torch_dtype = torch.float16,
  device_map = "auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# (C) PIPE
pipe = pipeline(
  task = "text-generation",
  model = model,
  tokenizer = tokenizer,
  do_sample = True,
  max_new_tokens = 1000
)

# (D) RUN QUERY
while True:
  query = input("\nEnter a query: ")
  if query == "exit":
    break
  if query.strip() == "":
    continue
  print(pipe(query)[0]["generated_text"]) # pipe returns a list of dicts
  • (A) Load PyTorch and Transformers.
  • (B) Load the model and tokenizer - we will use a simple Wizard-Vicuna model.
  • (C) Put the model and tokenizer into a pipe.
  • (D) Endless loop - pass a query into the pipe and print the AI's response. Enter exit to stop.
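
For slightly better answers, it may help to wrap the query in the prompt format the model was trained on. Vicuna-family models commonly use a USER/ASSISTANT template, but the exact template below is an assumption - check the model card to be sure. return_full_text = False stops the pipeline from echoing the prompt back.

# (E) OPTIONAL - PROMPT TEMPLATE + CLEANER OUTPUT
# NOTE: The exact template is an assumption, verify it against the model card.
def ask(query):
  prompt = f"USER: {query}\nASSISTANT:"
  result = pipe(prompt, return_full_text = False) # pipe returns a list of dicts
  return result[0]["generated_text"].strip()

print(ask("What is a credit card?"))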

PART 4) RUN!

[Screenshot: AI Answers A Human]

  • python simple.py
  • Transformers will automatically download your selected model on the first run... So be warned, that will be a few gigabytes and will take some time.
  • If you want to change where the model is downloaded to, add these right at the very top, before importing transformers:
    • import os
    • os.environ["TRANSFORMERS_CACHE"] = r"PATH\TO\MODELS"
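
For example, the top of simple.py would then look like this (the cache path is a placeholder - point it at your own folder):

# SET MODEL CACHE - MUST COME BEFORE IMPORTING TRANSFORMERS
import os
os.environ["TRANSFORMERS_CACHE"] = r"PATH\TO\MODELS"

# (A) LOAD MODULES
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline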

USE A DIFFERENT AI MODEL

  • Head over to Hugging Face and choose a model.
    • GGML - Models optimized for CPU.
    • GPTQ - Models optimized for GPU.
    • GGUF - Newer version/replacement of GGML.
    • CHAT - Models with "chat" in the name are tuned for chat.
    • CODE - Models that provide coding assistance.
    • MATH - Models tuned to do calculations.
    • 7B 13B 34B 70B - Number of parameters. The more, the "smarter"... technically speaking. But more parameters also means more system resources - see the rough estimate after this list.
  • In any case, the transformers library seems to only support GPTQ models and some specific ones like meta-llama (at the time of writing).
  • A few popular models/devs: TheBloke, meta-llama.
  • Once you have chosen a model, just replace model_name with the model's path on Hugging Face. E.G. TheBloke/vicuna-7B-v1.5-GPTQ.
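
On the "more parameters = more system resources" point, here is a rough back-of-envelope estimate of the memory needed just to hold the weights (this assumes GPTQ's usual 4-bit quantization; actual usage is higher once activations and overhead are counted):

# ROUGH MEMORY ESTIMATE - WEIGHTS ONLY, OVERHEAD NOT INCLUDED
def weight_gb(params_billion, bits_per_weight):
  return params_billion * bits_per_weight / 8

print(weight_gb(7, 16)) # 7B at float16 - roughly 14 GB
print(weight_gb(7, 4))  # 7B at 4-bit GPTQ - roughly 3.5 GB
print(weight_gb(13, 4)) # 13B at 4-bit GPTQ - roughly 6.5 GB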

THE END

Congrats! You have created a LOCAL AI chatbot in about 30 lines of code. But AI is capable of a lot more than that - if you want to learn more, here is the detailed tutorial on my blog and the GIST.
