DEV Community

Kyosuke Takayama

Running Stable Diffusion on M1 MacBook Pro

Original article: https://zenn.dev/ktakayama/articles/6c627e0956f32c


Stable Diffusion, an AI image generator, is now open source. I wanted to run it on my local machine, but I only have a MacBook Pro, so it was not easy.

https://github.com/CompVis/stable-diffusion

The following thread is very helpful!

https://github.com/CompVis/stable-diffusion/issues/25

Speed

Here are the specs of my 14-inch MacBook Pro.

  • Apple M1 Pro chip
  • 8-core CPU with 6 performance cores and 2 efficiency cores
  • 14-core GPU
  • 16-core Neural Engine
  • 32GB memory

Image generation needs about 15–20 GB of memory. Generating 6 images takes about 5 minutes.
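
As a rough sanity check on the speed figure, the arithmetic can be written out. This is only a sketch based on the two approximate numbers quoted above (6 images, about 5 minutes); the image size and step count are not stated in this post, so they are not modeled here.

```python
# Figures quoted in the post: about 6 images in about 5 minutes.
images = 6
total_seconds = 5 * 60

seconds_per_image = total_seconds / images
images_per_hour = 3600 / seconds_per_image

print(f"{seconds_per_image:.0f} s per image, about {images_per_hour:.0f} images per hour")
```

So one image takes roughly 50 seconds on this machine.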

Get model

Register on Hugging Face and clone this repository.

https://huggingface.co/CompVis/stable-diffusion-v-1-4-original

Get source code

Get the source code from the apple-silicon-mps-support branch of this repository.

https://github.com/magnusviri/stable-diffusion/tree/apple-silicon-mps-support

Setup

Install conda (Miniconda) and Rust with Homebrew.

brew install miniconda rust

Set up the shell environment for conda. I use zsh.

conda init zsh

When I ran conda env create, I got an error.

$ conda env create -f environment-mac.yaml
Collecting package metadata (repodata.json): done
Solving environment: failed

ResolvePackageNotFound:
  - python=3.8.5

Edit environment-mac.yaml so that the pinned version numbers match versions that are available in your environment. For example:

diff --git a/environment-mac.yaml b/environment-mac.yaml
index d923d56..c8a0a8e 100644
--- a/environment-mac.yaml
+++ b/environment-mac.yaml
@@ -3,14 +3,14 @@ channels:
   - pytorch
   - defaults
 dependencies:
-  - python=3.8.5
-  - pip=20.3
+  - python=3.9.12
+  - pip=21.2.4
   - pytorch=1.12.1
   - torchvision=0.13.1
   - numpy=1.19.2
   - pip:
     - albumentations==0.4.3
-    - opencv-python==4.1.2.30
+    - opencv-python>=4.1.2.30
     - pudb==2019.2
     - imageio==2.9.0
     - imageio-ffmpeg==0.4.2
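
To see which versions your own conda installation already provides, a small helper like the one below can print them in the `name=version` form used in environment-mac.yaml. This is a minimal sketch: `conda_pins` is a hypothetical helper name, and it assumes you run it with the Python interpreter of your conda base environment. Whether these exact versions are the right pins for your machine is something to verify yourself.

```python
import platform
from importlib import metadata


def conda_pins():
    """Return python/pip pins in the 'name=version' form used by environment-mac.yaml."""
    pins = [f"python={platform.python_version()}"]
    try:
        pins.append(f"pip={metadata.version('pip')}")
    except metadata.PackageNotFoundError:
        # pip is not installed for this interpreter; report only the Python version.
        pass
    return pins


for pin in conda_pins():
    print(f"  - {pin}")
```

After updating the file, run conda env create -f environment-mac.yaml again.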

Once conda env create succeeds, activate the environment and link to the model.

conda activate ldm
mkdir -p models/ldm/stable-diffusion-v1
ln -s /path/to/stable-diffusion-v-1-4-original/sd-v1-4.ckpt models/ldm/stable-diffusion-v1/model.ckpt
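
Because /path/to/ in the ln -s command is a placeholder, it is easy to end up with a symlink that points nowhere. The following sketch checks the link before you start generating; `check_checkpoint` is a hypothetical helper, and the default path is the one used in the commands above, relative to the repository root.

```python
from pathlib import Path


def check_checkpoint(path="models/ldm/stable-diffusion-v1/model.ckpt"):
    """Return 'ok', 'broken symlink', or 'missing' for the checkpoint path."""
    p = Path(path)
    # Path.exists() follows symlinks, so a dangling link reports False here.
    if p.is_symlink() and not p.exists():
        return "broken symlink"
    if not p.exists():
        return "missing"
    return "ok"


print(check_checkpoint())
```

Run it from the repository root. A result of "broken symlink" usually means the target path in the ln -s command was wrong.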

Do image generation!

I got a PyTorch-related error when I ran txt2img.

$ python scripts/txt2img.py --prompt "a photograph of an astronaut riding a horse" --plms 
...
NotImplementedError: The operator 'aten::index.Tensor' is not current implemented for the MPS device. If you want this op to be added in priority during the prototype phase of this feature, please comment on https://github.com/pytorch/pytorch/issues/77764. As a temporary fix, you can set the environment variable `PYTORCH_ENABLE_MPS_FALLBACK=1` to use the CPU as a fallback for this op. WARNING: this will be slower than running natively on MPS.
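
The error message itself suggests a temporary workaround: setting PYTORCH_ENABLE_MPS_FALLBACK=1 so that unsupported operators run on the CPU. This post takes a different route, but for reference, here is a minimal sketch of that workaround in Python. `pick_device` is a hypothetical helper that is not part of the repository, and setting the variable before torch is imported is a precaution I am assuming is needed rather than something confirmed in this post.

```python
import os

# Set the variable named in the error message before torch is imported,
# so it is already in place when PyTorch initializes.
os.environ.setdefault("PYTORCH_ENABLE_MPS_FALLBACK", "1")


def pick_device():
    """Return 'mps', 'cuda', or 'cpu' depending on what this PyTorch build supports."""
    try:
        import torch
    except ImportError:
        return "cpu"
    # torch.backends.mps only exists in builds with MPS support.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    if torch.cuda.is_available():
        return "cuda"
    return "cpu"


print(pick_device())
```

As the warning says, operators that fall back to the CPU will be slower than running natively on MPS.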

Install the nightly version of PyTorch.

conda install pytorch torchvision torchaudio -c pytorch-nightly

This still produced an error, although a different one.

$ python scripts/txt2img.py --prompt "a photograph of an astronaut riding a horse" --plms 
    return torch.layer_norm(input, normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)
RuntimeError: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead.

The fix for this error is described in the following comment.

https://github.com/CompVis/stable-diffusion/issues/25#issuecomment-1221667017

vi /opt/homebrew/Caskroom/miniconda/base/envs/ldm/lib/python3.9/site-packages/torch/nn/functional.py

Change layer_norm so that it calls input.contiguous() before passing the input on:

--- functional.py_      2022-08-23 17:07:29.000000000 +0900
+++ functional.py       2022-08-23 17:07:31.000000000 +0900
@@ -2506,9 +2506,9 @@ def layer_norm(
     """
     if has_torch_function_variadic(input, weight, bias):
         return handle_torch_function(
-            layer_norm, (input, weight, bias), input, normalized_shape, weight=weight, bias=bias, eps=eps
+            layer_norm, (input.contiguous(), weight, bias), input, normalized_shape, weight=weight, bias=bias, eps=eps
         )
-    return torch.layer_norm(input, normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)
+    return torch.layer_norm(input.contiguous(), normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)
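
Editing a file inside site-packages is lost whenever PyTorch is reinstalled. As a possible alternative (not the method used in this post, and not tested on this setup), the same idea can be sketched as a runtime wrapper that makes the input contiguous before layer_norm runs. `with_contiguous_input` and `patch_layer_norm` are hypothetical helper names; the sketch assumes it is executed once at the top of the script, before the model is built.

```python
import functools


def with_contiguous_input(fn):
    """Wrap fn so its first argument is made contiguous before the call."""
    @functools.wraps(fn)
    def wrapper(input, *args, **kwargs):
        return fn(input.contiguous(), *args, **kwargs)
    return wrapper


def patch_layer_norm():
    """Wrap torch.nn.functional.layer_norm in place. Returns False if torch is absent."""
    try:
        import torch.nn.functional as F
    except ImportError:
        return False
    # Guard against wrapping twice if this function is called more than once.
    if not getattr(F.layer_norm, "_contiguous_patched", False):
        F.layer_norm = with_contiguous_input(F.layer_norm)
        F.layer_norm._contiguous_patched = True
    return True


print("layer_norm patched:", patch_layer_norm())
```

The wrapper mirrors the second changed line of the diff above: the tensor is passed through .contiguous() before it reaches torch.layer_norm.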

Everything OK! Great!!!

$ python scripts/txt2img.py --prompt "a photograph of an astronaut riding a horse" --plms 
...
Your samples are ready and waiting for you here:
outputs/txt2img-samples

Enjoy.
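
To find the newest results quickly, the output directory printed by the script can be scanned for PNG files. This is a small sketch: `list_samples` is a hypothetical helper, and it only assumes that images are written as .png files somewhere under outputs/txt2img-samples.

```python
from pathlib import Path


def list_samples(outdir="outputs/txt2img-samples"):
    """Return all PNG files under outdir, newest first (empty list if outdir is missing)."""
    root = Path(outdir)
    if not root.is_dir():
        return []
    return sorted(root.rglob("*.png"), key=lambda p: p.stat().st_mtime, reverse=True)


for path in list_samples()[:6]:
    print(path)
```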

Top comments (1)

Tobias Nickel

Can I ask you for a little more benchmarking? You say 6 images in 5 minutes, but what are the settings? 512x512 with 50 steps? This is about the same speed as my 4-year-old MSI notebook with a GTX 1060 12GB.