Kyosuke Takayama

Posted on Aug 24, 2022

Running Stable Diffusion on M1 MacBook Pro

#stablediffusion

Original article: https://zenn.dev/ktakayama/articles/6c627e0956f32c

The AI image generator Stable Diffusion is now open source. I wanted to run it on my local machine, but I only have a MacBook Pro, so it was not easy.

https://github.com/CompVis/stable-diffusion

The following thread is very helpful!

https://github.com/CompVis/stable-diffusion/issues/25

Speed

Here are the specs of my 14-inch MacBook Pro.

Apple M1 Pro chip
8-core CPU with 6 performance cores and 2 efficiency cores
14-core GPU
16-core Neural Engine
32GB memory

It needs about 15–20 GB of memory while generating images. 6 images can be generated in about 5 minutes.
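
For a rough per-image figure, the numbers above can be converted to seconds per image. This is only back-of-the-envelope arithmetic: `seconds_per_image` is a hypothetical helper, and the result is an average that says nothing about resolution or step count.

```python
def seconds_per_image(total_minutes: float, image_count: int) -> float:
    """Average wall-clock seconds spent per generated image."""
    if image_count <= 0:
        raise ValueError("image_count must be positive")
    return total_minutes * 60 / image_count


# 6 images in about 5 minutes, as measured above
print(seconds_per_image(5, 6))  # 50.0
```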

Get model

https://huggingface.co/CompVis/stable-diffusion-v-1-4-original

Get source code

Get the source code in the apple-silicon-mps-support branch of this repository.

https://github.com/magnusviri/stable-diffusion/tree/apple-silicon-mps-support

Setup

Install conda and Rust with Homebrew.

brew install miniconda rust

Set up the shell environment for conda. I use zsh.

conda init zsh
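
Before going further, it may help to confirm that the tools are actually visible from a new shell. The following is a minimal sketch using only the Python standard library; `missing_tools` is a hypothetical helper name, and the list of commands to check is an assumption based on what the two Homebrew packages normally install.

```python
import shutil


def missing_tools(names):
    """Return the commands from `names` that are not found on PATH."""
    return [name for name in names if shutil.which(name) is None]


# conda comes from miniconda; cargo and rustc come from the rust formula
missing = missing_tools(["conda", "cargo", "rustc"])
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("conda and Rust are available")
```

If conda is reported as missing right after `conda init zsh`, opening a new terminal window is usually enough for the change to take effect.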

When I run conda env create, I get an error.

$ conda env create -f environment-mac.yaml
Collecting package metadata (repodata.json): done
Solving environment: failed

ResolvePackageNotFound:
  - python=3.8.5

Edit environment-mac.yaml so that the pinned versions match what is available in your environment. For example:

diff --git a/environment-mac.yaml b/environment-mac.yaml
index d923d56..c8a0a8e 100644
--- a/environment-mac.yaml
+++ b/environment-mac.yaml
@@ -3,14 +3,14 @@ channels:
   - pytorch
   - defaults
 dependencies:
-  - python=3.8.5
-  - pip=20.3
+  - python=3.9.12
+  - pip=21.2.4
   - pytorch=1.12.1
   - torchvision=0.13.1
   - numpy=1.19.2
   - pip:
     - albumentations==0.4.3
-    - opencv-python==4.1.2.30
+    - opencv-python>=4.1.2.30
     - pudb==2019.2
     - imageio==2.9.0
     - imageio-ffmpeg==0.4.2
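
The same edit can be scripted instead of done by hand. The sketch below is an illustration under assumptions, not part of the repository: `repin` is a hypothetical helper that rewrites the version spec of selected packages in the text of an environment file with a regular expression. The replacement versions are the ones from the diff above; which versions actually resolve depends on your conda channels.

```python
import re

# A dependency line such as "  - python=3.8.5" or "    - opencv-python==4.1.2.30".
# Lines without a version spec (for example "  - pip:") do not match.
PIN = re.compile(r"^([ \t]*-[ \t]*)([A-Za-z0-9_.\-]+)[=<>!~]=?.*$", re.MULTILINE)


def repin(yaml_text: str, new_specs: dict) -> str:
    """Replace the version spec of every package listed in `new_specs`."""
    def swap(match):
        prefix, name = match.group(1), match.group(2)
        if name in new_specs:
            return f"{prefix}{name}{new_specs[name]}"
        return match.group(0)  # leave other packages untouched

    return PIN.sub(swap, yaml_text)


example = """dependencies:
  - python=3.8.5
  - pip=20.3
  - pytorch=1.12.1
  - pip:
    - opencv-python==4.1.2.30
"""

print(repin(example, {
    "python": "=3.9.12",
    "pip": "=21.2.4",
    "opencv-python": ">=4.1.2.30",
}))
```

To apply it to the real file, read environment-mac.yaml, pass its contents through `repin`, and write the result back.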

Activate the environment and link the model.

conda activate ldm
mkdir -p models/ldm/stable-diffusion-v1
ln -s /path/to/stable-diffusion-v-1-4-original/sd-v1-4.ckpt models/ldm/stable-diffusion-v1/model.ckpt

Do image generation!

I get a PyTorch-related error when I run txt2img.py.

$ python scripts/txt2img.py --prompt "a photograph of an astronaut riding a horse" --plms 
...
NotImplementedError: The operator 'aten::index.Tensor' is not current implemented for the MPS device. If you want this op to be added in priority during the prototype phase of this feature, please comment on https://github.com/pytorch/pytorch/issues/77764. As a temporary fix, you can set the environment variable `PYTORCH_ENABLE_MPS_FALLBACK=1` to use the CPU as a fallback for this op. WARNING: this will be slower than running natively on MPS.
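
The error message itself names a temporary workaround: setting `PYTORCH_ENABLE_MPS_FALLBACK=1` so that unsupported operators run on the CPU. This article takes the nightly-build route described next, so the following is only a sketch of the alternative. `with_mps_fallback` is a hypothetical helper, and whether the fallback alone is enough for the whole pipeline is not verified here.

```python
import os


def with_mps_fallback(env=None):
    """Return a copy of the environment with the CPU fallback switched on."""
    new_env = dict(os.environ if env is None else env)
    new_env["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"
    return new_env


# Hypothetical usage: launch txt2img in a child process with the variable set.
# import subprocess, sys
# subprocess.run(
#     [sys.executable, "scripts/txt2img.py",
#      "--prompt", "a photograph of an astronaut riding a horse", "--plms"],
#     env=with_mps_fallback(),
#     check=True,
# )
```

Passing the variable through the child process environment means it is already set when the script starts. As the message warns, operators that fall back to the CPU will be slower than running natively on MPS.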

Install the nightly version of PyTorch.

conda install pytorch torchvision torchaudio -c pytorch-nightly

This still produces an error.

$ python scripts/txt2img.py --prompt "a photograph of an astronaut riding a horse" --plms 
    return torch.layer_norm(input, normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)
RuntimeError: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead.
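
The message refers to the tensor's size and stride. A tensor is contiguous when its strides describe a dense row-major layout, and an operation such as a transpose changes the shape and strides without moving the data. The sketch below is a simplified pure-Python model of that idea and not PyTorch's implementation; `contiguous_strides` and `is_contiguous` are hypothetical names. In practice a view can also succeed on some non-contiguous layouts, so this only illustrates why making the input contiguous removes the problem.

```python
def contiguous_strides(shape):
    """Strides (in elements) of a dense row-major tensor with the given shape."""
    strides = []
    step = 1
    for size in reversed(shape):
        strides.append(step)
        step *= size
    return tuple(reversed(strides))


def is_contiguous(shape, strides):
    """True if `strides` describe a dense row-major layout for `shape`."""
    expected = contiguous_strides(shape)
    # Dimensions of size 1 never affect addressing, so their stride is ignored.
    return all(
        size == 1 or actual == want
        for size, actual, want in zip(shape, strides, expected)
    )


# A 2x3 tensor laid out row by row has strides (3, 1)
print(is_contiguous((2, 3), (3, 1)))  # True
# Its transpose has shape 3x2 but still walks the same memory: strides (1, 3)
print(is_contiguous((3, 2), (1, 3)))  # False
```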

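To fix this error, patch layer_norm in PyTorch's functional.py so that it calls .contiguous() on the input, as suggested in this comment: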

https://github.com/CompVis/stable-diffusion/issues/25#issuecomment-1221667017

vi /opt/homebrew/Caskroom/miniconda/base/envs/ldm/lib/python3.9/site-packages/torch/nn/functional.py

--- functional.py_      2022-08-23 17:07:29.000000000 +0900
+++ functional.py       2022-08-23 17:07:31.000000000 +0900
@@ -2506,9 +2506,9 @@ def layer_norm(
     """
     if has_torch_function_variadic(input, weight, bias):
         return handle_torch_function(
-            layer_norm, (input, weight, bias), input, normalized_shape, weight=weight, bias=bias, eps=eps
+            layer_norm, (input.contiguous(), weight, bias), input, normalized_shape, weight=weight, bias=bias, eps=eps
         )
-    return torch.layer_norm(input, normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)
+    return torch.layer_norm(input.contiguous(), normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)
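
Because this patch edits a file inside the conda environment, it is lost whenever PyTorch is reinstalled or upgraded. One possible alternative is to wrap the function at runtime instead. The sketch below is an assumption and was not tested against this setup; `contiguous_first_arg` is a hypothetical helper, and the torch lines are left commented out so the snippet runs without PyTorch installed.

```python
import functools


def contiguous_first_arg(fn):
    """Wrap `fn` so that its first argument is made contiguous before the call."""
    @functools.wraps(fn)
    def wrapper(input, *args, **kwargs):
        return fn(input.contiguous(), *args, **kwargs)
    return wrapper


# Hypothetical usage, executed before the model is built
# (for example near the top of scripts/txt2img.py):
# import torch.nn.functional as F
# F.layer_norm = contiguous_first_arg(F.layer_norm)
```

The wrapper only affects code that looks up `layer_norm` on the module at call time; anything that imported the function by name beforehand keeps the original.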

Everything OK! Great!!!

$ python scripts/txt2img.py --prompt "a photograph of an astronaut riding a horse" --plms 
...
Your samples are ready and waiting for you here:
outputs/txt2img-samples

Enjoy.

Top comments (1)

Tobias Nickel • Oct 24 '22

Can I ask you for a little more benchmarking? You say 6 images in 5 minutes, but what are the settings? 512x512 with 50 steps? This is about the same speed as my 4-year-old MSI notebook with a GTX 1060 12gb.
