raphiki for Technology at Worldline

Introducing LM Studio

This brief article presents LM Studio, a handy tool for installing and testing open-source LLMs on your desktop. LM Studio itself is not open source, although it builds on the popular llama.cpp library. Be aware that its terms of use prohibit modification, redistribution, and commercial use. The terms also give users 30 days' notice before any changes take effect.

LM Studio Logo

LM Studio is user-friendly and available in binary format for Windows and Mac, with a Linux version in the works. It supports various models compatible with the ggml tensor library from the llama.cpp project and requires 16 GB of RAM.

Let's explore how LM Studio is both easy to use and convenient.

Using Models from the Chat panel

After installation, LM Studio lets you download models, together with preset options, from the Hugging Face Hub.

For example, we can download the Zephyr 7B β model, adapted by TheBloke for llama.cpp's GGUF format.

Downloading a Model

Activating and loading the model into LM Studio is straightforward.

Loading a Model

You can then immediately start using the model from the Chat panel, no Internet connection required.

Chatting with Zephyr

The right panel displays and allows modification of default presets for the model. Memory usage and useful inference metrics are shown in the window's title and below the Chat panel, respectively.

Other models, like Code Llama Instruct 7B, are also available for download and use.

Using Codellama

LM Studio also highlights new models and versions from Hugging Face, making it an invaluable tool for discovering and testing the latest releases.

Accessing Models with APIs

A notable feature of LM Studio is the ability to create Local Inference Servers with just a click.

Local Inference Server

The Automatic Prompt Formatting option simplifies prompt construction to match the model's expected format. The exposed API aligns with the OpenAI format.
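To give an idea of what such formatting saves you from writing by hand, here is a minimal Python sketch that flattens OpenAI-style messages into the chat template described on the Zephyr 7B β model card. The function name `format_zephyr_prompt` is hypothetical, and whether LM Studio builds its prompts exactly this way is an assumption; other models expect different templates.

```python
def format_zephyr_prompt(messages):
    """Flatten OpenAI-style chat messages into a Zephyr-style prompt string.

    Assumption: the template follows the Zephyr 7B beta model card
    (role tag, newline, content, then the </s> end-of-sequence token).
    """
    parts = []
    for message in messages:
        # One turn: "<|role|>" on its own line, then the content closed by </s>.
        parts.append(f"<|{message['role']}|>\n{message['content']}</s>\n")
    # A trailing assistant tag cues the model to write its reply.
    parts.append("<|assistant|>\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are an AI assistant answering Tech questions"},
    {"role": "user", "content": "What is Java?"},
]
print(format_zephyr_prompt(messages))
```

With the option enabled, you send only the `messages` list and leave this step to the server.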

Here's an example of calling the endpoint with curl:

curl http://localhost:1234/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{ 
  "messages": [ 
    { "role": "system", "content": "You are an AI assistant answering Tech questions" },
    { "role": "user", "content": "What is Java?" }
  ],
  "temperature": 0.7,
  "max_tokens": -1,
  "stream": false
}'

The response provides the requested information:

{
    "id": "chatcmpl-iyvpdtqs1qzlv6jqkmdt9",
    "object": "chat.completion",
    "created": 1699806651,
    "model": "~/.cache/lm-studio/models/TheBloke/zephyr-7B-beta-GGUF/zephyr-7b-beta.Q4_K_S.gguf",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "Java is a high-level, object-oriented programming language that was first released by Sun Microsystems in 1995. It is now owned by Oracle Corporation. Java is designed to be platform independent, meaning that it can run on any operating system that has a Java Virtual Machine (JVM) installed. Java's primary applications are in the development of desktop applications, web applications, and mobile apps using frameworks such as Android Studio, Spring Boot, and Apache Struts. Its syntax is similar to C++, but with added features for object-oriented programming and memory management that make it easier to learn and use than C++. Java's popularity is due in part to its extensive library of pre-written code (known as the Java Class Library) which makes development faster and more efficient."
            },
            "finish_reason": "stop"
        }
    ],
    "usage": {
        "prompt_tokens": 0,
        "completion_tokens": 166,
        "total_tokens": 166
    }
}
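When consuming a response like the one above from code, only a few fields usually matter. The following Python sketch pulls them out, assuming the body follows the OpenAI-style layout shown; `summarize_completion` is a hypothetical helper name and `SAMPLE` is an abbreviated stand-in for a real response.

```python
import json

# Abbreviated stand-in for a real response body (illustrative only).
SAMPLE = json.dumps({
    "object": "chat.completion",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Java is a high-level language."},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 0, "completion_tokens": 166, "total_tokens": 166},
})


def summarize_completion(raw_json):
    """Extract the answer text, stop reason and token count from a response body."""
    body = json.loads(raw_json)
    choice = body["choices"][0]  # a single completion is assumed
    return {
        "text": choice["message"]["content"],
        "finish_reason": choice["finish_reason"],
        "completion_tokens": body["usage"]["completion_tokens"],
    }


print(summarize_completion(SAMPLE))
```

A `finish_reason` of `"stop"` indicates the model ended its answer on its own rather than being cut off by a token limit.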

This feature greatly aids in testing integrations with frontends like chatbots or workflow solutions like Flowise.
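For integration tests driven from code rather than the shell, the same request can be sketched in Python using only the standard library. This assumes the server is listening on `localhost:1234` as in the curl example; `build_chat_request` and `ask` are hypothetical helper names, not part of LM Studio.

```python
import json
import urllib.request

# Assumption: same host and port as in the curl example.
BASE_URL = "http://localhost:1234/v1"


def build_chat_request(user_prompt,
                       system_prompt="You are an AI assistant answering Tech questions",
                       temperature=0.7):
    """Build (but do not send) a POST request mirroring the curl call."""
    payload = {
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
        "temperature": temperature,
        "max_tokens": -1,
        "stream": False,
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(user_prompt, timeout=120):
    """Send the request to the local server and return the assistant's text."""
    request = build_chat_request(user_prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

With the Local Inference Server running and a model loaded, calling `ask("What is Java?")` should return the assistant's answer as a string. Keeping request construction separate from sending makes the payload easy to check without a running server.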


Although not open source, LM Studio is a robust addition to your local toolkit, allowing you to easily experiment with and adopt models from Hugging Face. Its user-friendly interface and versatile features make it an essential resource for anyone looking to delve into the world of large language models.

Top comments (1)

eerk

This is a pretty cool tool, I was just wondering if you can also use it to create embeddings of your own text documents?