DEV Community

hayerhans
How to get an LLM up & running locally - in 5 minutes

Video Version:
https://youtube.com/shorts/y0NWVUsfLiU?si=x16bKEoHLfk87nC2

What is Ollama?

It's a lightweight framework designed for those who wish to experiment with, customize, and deploy large language models without the hassle of cloud platforms. With Ollama, the power of AI is distilled into a simple, local package, allowing developers and hobbyists alike to explore the vast capabilities of machine learning models.

Setting Up Ollama: A Step-by-Step Approach

First, download Ollama for your OS here:
https://ollama.com/download

Second, run the model you want with:

ollama run llama2
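Besides the interactive CLI, Ollama also serves a REST API on `localhost:11434` once it's running. A minimal Python sketch of calling it (the helper names are my own; it assumes the Ollama server is up and the model has been pulled):

```python
import json
import urllib.request

# Ollama listens on localhost:11434 by default.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_payload(model: str, prompt: str) -> bytes:
    """Build the JSON body for Ollama's /api/generate endpoint.
    stream=False requests one complete JSON response instead of chunks."""
    body = {"model": model, "prompt": prompt, "stream": False}
    return json.dumps(body).encode("utf-8")

def generate(model: str, prompt: str) -> str:
    """Send a prompt to a locally running model and return its reply."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=build_payload(model, prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Example (requires a running Ollama server):
#   print(generate("llama2", "Why is the sky blue?"))
```

This is handy when you want to script against a local model instead of typing into the REPL.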

Model library

Ollama supports a growing list of models, available at ollama.com/library

Here are some example models that can be downloaded:

| Model | Parameters | Size | Download Command |
| --- | --- | --- | --- |
| Llama 2 | 7B | 3.8GB | `ollama run llama2` |
| Mistral | 7B | 4.1GB | `ollama run mistral` |
| Dolphin Phi | 2.7B | 1.6GB | `ollama run dolphin-phi` |
| Phi-2 | 2.7B | 1.7GB | `ollama run phi` |
| Neural Chat | 7B | 4.1GB | `ollama run neural-chat` |
| Starling | 7B | 4.1GB | `ollama run starling-lm` |
| Code Llama | 7B | 3.8GB | `ollama run codellama` |
| Llama 2 Uncensored | 7B | 3.8GB | `ollama run llama2-uncensored` |
| Llama 2 13B | 13B | 7.3GB | `ollama run llama2:13b` |
| Llama 2 70B | 70B | 39GB | `ollama run llama2:70b` |
| Orca Mini | 3B | 1.9GB | `ollama run orca-mini` |
| Vicuna | 7B | 3.8GB | `ollama run vicuna` |
| LLaVA | 7B | 4.5GB | `ollama run llava` |
| Gemma | 2B | 1.4GB | `ollama run gemma:2b` |
| Gemma | 7B | 4.8GB | `ollama run gemma:7b` |

Memory Requirements:
Keep in mind, running these models isn't light on resources. As a rule of thumb, you should have at least 8 GB of RAM for the 7B models, 16 GB for the 13B models, and 32 GB or more for the largest ones, to keep your AI running smoothly.
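The rule of thumb above can be captured in a tiny helper (the function name is my own; the tiers follow Ollama's documented guidance of 8 GB for 7B, 16 GB for 13B, 32 GB for 33B):

```python
def min_ram_gb(parameters_in_billions: float) -> int:
    """Rough minimum RAM (GB) to run a model locally, per Ollama's guidance:
    8 GB for 7B models, 16 GB for 13B, 32 GB for 33B.
    Treat 32 GB as a floor for anything larger (e.g. 70B needs substantially more)."""
    if parameters_in_billions <= 7:
        return 8
    if parameters_in_billions <= 13:
        return 16
    return 32

# e.g. min_ram_gb(7) -> 8, min_ram_gb(13) -> 16
```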

Customization

With Ollama, you're not just running models; you're tailoring them. Import models with ease and customize prompts to fit your specific needs. Fancy a model that responds as Mario? Ollama makes it possible with a few simple commands:

Customize a prompt

Models from the Ollama library can be customized with a prompt. For example, to customize the llama2 model:

ollama pull llama2

Create a Modelfile:

FROM llama2

# set the temperature to 1 [higher is more creative, lower is more coherent]
PARAMETER temperature 1

# set the system message
SYSTEM """
You are Mario from Super Mario Bros. Answer as Mario, the assistant, only.
"""
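The Modelfile format supports more than just the temperature. A slightly fuller sketch (`top_p` and `num_ctx` are real Modelfile parameters; the values here are illustrative, not recommendations):

```
FROM llama2

# sampling controls
PARAMETER temperature 1
PARAMETER top_p 0.9

# context window size, in tokens
PARAMETER num_ctx 4096

SYSTEM """
You are Mario from Super Mario Bros. Answer as Mario, the assistant, only.
"""
```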

Next, create and run the model:

ollama create mario -f ./Modelfile
ollama run mario

>>> hi
Hello! It's your friend Mario.


If you liked this content, also have a look at my YouTube channel.
