0xkoji

Run GPT4All on Google Colab

In this article, I'll show how to run GPT4All on Google Colab.

GitHub: nomic-ai / gpt4all — open-source LLM chatbots that you can run anywhere

GPT4All: an ecosystem of open-source, on-edge large language models that run locally on your CPU and nearly any GPU.

Important

GPT4All v2.5.0 and newer only supports models in GGUF format (.gguf). Models used with a previous version of GPT4All (.bin extension) will no longer work.

GPT4All is an ecosystem for running powerful, customized large language models locally on consumer-grade CPUs and any GPU. Note that your CPU needs to support AVX or AVX2 instructions.

Learn more in the documentation.

A GPT4All model is a 3GB - 8GB file that you can download and plug into the GPT4All open-source ecosystem software. Nomic AI supports and maintains this software ecosystem to…
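Since GPT4All requires AVX or AVX2, it's worth confirming that the Colab VM's CPU actually exposes those instructions before downloading a multi-gigabyte model. A minimal sketch (the `cpu_flags` helper is my own) that parses `/proc/cpuinfo`, which exists on Linux VMs like Colab's:

```python
import os

def cpu_flags(cpuinfo_text):
    """Extract the set of CPU feature flags from /proc/cpuinfo contents."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            return set(line.split(":", 1)[1].split())
    return set()

# On a Colab VM this prints whether the AVX/AVX2 requirement is met.
if os.path.exists("/proc/cpuinfo"):
    with open("/proc/cpuinfo") as f:
        flags = cpu_flags(f.read())
    print("AVX :", "avx" in flags)
    print("AVX2:", "avx2" in flags)
```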

  1. Clone the repo
  2. Download gpt4all-lora-quantized.bin
  3. Run gpt4all-lora-quantized-linux-x86
main: seed = 1681043249
llama_model_load: loading model from 'gpt4all-lora-quantized.bin' - please wait ...
llama_model_load: ggml ctx size = 6065.35 MB
llama_model_load: memory_size =  2048.00 MB, n_mem = 65536
llama_model_load: loading model part 1/1 from 'gpt4all-lora-quantized.bin'
llama_model_load: .................................... done
llama_model_load: model size =  4017.27 MB / num tensors = 291

system_info: n_threads = 2 / 2 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | VSX = 0 | 
main: interactive mode on.
sampling parameters: temp = 0.100000, top_k = 40, top_p = 0.950000, repeat_last_n = 64, repeat_penalty = 1.300000


== Running in chat mode. ==
 - Press Ctrl+C to interject at any time.
 - Press Return to return control to LLaMA.
 - If you want to submit another line, end your input in '\'.
hello!

> \n
Thank you for your message, I'll be sure to respond as soon as possible
> You are welcome :) 
I hope my response was helpful. Let me know if there is anything else that we can do! Thank you again for reaching out and allowing us the opportunity to assist with your needs.\n \nThanks,
> What is the capital of the United States of America? \n
\n
The current capital city of the USA is Washington D.C., which was established as a result of an act passed by Congress in 1790 to establish a permanent seat for government functions.\n
> I am sorry, but I do not have enough information about your question or request. Please provide more details and context so that we can better assist you with our services!
> 

Clone the repo

!git clone https://github.com/nomic-ai/gpt4all.git

Download the model

This step takes a while, so mounting Google Drive and keeping a cached copy of the model there can save time in future sessions.

%cd gpt4all/chat
!wget https://the-eye.eu/public/AI/models/nomic-ai/gpt4all/gpt4all-lora-quantized.bin
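One way to implement the Drive-caching idea is a small download-or-reuse helper. This is a sketch: the `ensure_model` name and the Drive paths are my own, and it assumes you have already mounted Drive with `from google.colab import drive; drive.mount('/content/drive')`.

```python
import os
import shutil
import subprocess

def ensure_model(url, cache_path, local_path):
    """Reuse a cached model copy (e.g. on Google Drive) if present;
    otherwise download it with wget and cache it for later sessions."""
    if os.path.exists(cache_path):
        shutil.copy(cache_path, local_path)            # fast: reuse cached copy
    else:
        subprocess.run(["wget", "-O", local_path, url], check=True)
        os.makedirs(os.path.dirname(cache_path), exist_ok=True)
        shutil.copy(local_path, cache_path)            # cache for next time
    return local_path

# Hypothetical Drive layout; adjust the paths to your own:
# ensure_model(
#     "https://the-eye.eu/public/AI/models/nomic-ai/gpt4all/gpt4all-lora-quantized.bin",
#     "/content/drive/MyDrive/models/gpt4all-lora-quantized.bin",
#     "gpt4all-lora-quantized.bin",
# )
```

The first run downloads and caches; every later Colab session just copies the file back from Drive instead of re-downloading ~4GB.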

Run the program

!./gpt4all-lora-quantized-linux-x86 

The program runs interactively, so to stop it you need to click the stop button on the Colab cell.
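If you'd rather stop it from code than from the Colab UI, one option is to pipe a single prompt into the binary and kill the process on a timeout. A sketch under those assumptions (the `ask` helper is my own; the binary path comes from the step above):

```python
import subprocess

def ask(prompt, binary="./gpt4all-lora-quantized-linux-x86", timeout=120):
    """Send one prompt to an interactive binary on stdin, collect its
    stdout, and make sure the process is terminated afterwards."""
    proc = subprocess.Popen(
        [binary],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )
    try:
        # communicate() writes the prompt, closes stdin, and waits.
        out, _ = proc.communicate(prompt + "\n", timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.kill()  # programmatic equivalent of clicking the stop button
        out, _ = proc.communicate()
    return out
```

For a quick sanity check you can point `binary` at a stand-in like `cat`, which just echoes the prompt back.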
