Photo by Andrea De Santis on Unsplash
If you use social media, you have probably seen images generated by machine learning models recently.
DALLE 2
https://openai.com/dall-e-2/
You can use DALLE 2 for free, but you may need to wait a month or more for access.
Recently, another model was released: Stable Diffusion. It is pretty similar to DALLE 2. If you give it a text prompt and some parameters, it generates a pretty nice image. You can use Stable Diffusion without waiting for a month, which is super nice, right? However, it requires a GPU. If you don't have a GPU or cannot get access to one, you are probably 😭 (What am I supposed to do?)
About Stable Diffusion
https://stability.ai/blog/stable-diffusion-public-release
CompVis / stable-diffusion
A latent text-to-image diffusion model
Stable Diffusion
Stable Diffusion was made possible thanks to a collaboration with Stability AI and Runway and builds upon our previous work:
High-Resolution Image Synthesis with Latent Diffusion Models
Robin Rombach*, Andreas Blattmann*, Dominik Lorenz, Patrick Esser, Björn Ommer
CVPR '22 Oral | GitHub | arXiv | Project page
Stable Diffusion is a latent text-to-image diffusion model. Thanks to a generous compute donation from Stability AI and support from LAION, we were able to train a Latent Diffusion Model on 512x512 images from a subset of the LAION-5B database. Similar to Google's Imagen, this model uses a frozen CLIP ViT-L/14 text encoder to condition the model on text prompts. With its 860M UNet and 123M text encoder, the model is relatively lightweight and runs on a GPU with at least 10GB VRAM. See this section below and the model card.
Requirements
A suitable…
In that case, you can try stable_diffusion.openvino. You don't need a GPU to run it!
stable_diffusion.openvino
Implementation of Text-To-Image generation using Stable Diffusion on Intel CPU or GPU.
Requirements
- Linux, Windows, MacOS
- Python <= 3.9.0
- CPU or GPU compatible with OpenVINO.
Install requirements
- Set up and update PIP to the highest version
- Install OpenVINO™ Development Tools 2022.3.0 release with PyPI
- Download requirements
python -m pip install --upgrade pip
pip install openvino-dev[onnx,pytorch]==2022.3.0
pip install -r requirements.txt
Generate image from text description
usage: demo.py [-h] [--model MODEL] [--device DEVICE] [--seed SEED] [--beta-start BETA_START] [--beta-end BETA_END] [--beta-schedule BETA_SCHEDULE]
               [--num-inference-steps NUM_INFERENCE_STEPS] [--guidance-scale GUIDANCE_SCALE] [--eta ETA] [--tokenizer TOKENIZER] [--prompt PROMPT] [--params-from PARAMS_FROM]
               [--init-image INIT_IMAGE] [--strength STRENGTH] [--mask MASK] [--output OUTPUT]

optional arguments:
  -h, --help            show this help message and exit
  --model MODEL         model name
  --device DEVICE       inference device [CPU, GPU]
  --seed SEED           random seed for generating consistent images per prompt
  --beta-start BETA_START
                        LMSDiscreteScheduler::beta_start
  --beta-end BETA_END   LMSDiscreteScheduler::beta_end
  --beta-schedule BETA_SCHEDULE
                        LMSDiscreteScheduler::beta_schedule
  --num-inference-steps NUM_INFERENCE_STEPS
                        num inference steps
  --guidance-scale GUIDANCE_SCALE
                        guidance scale
  --eta ETA
…
The README is very straightforward, so you probably won't have any issues running demo.py or trying the Python script for Streamlit.
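If you are curious what --beta-start, --beta-end, and --beta-schedule in the help text control: they configure the noise schedule of the scheduler. Here is a minimal pure-Python sketch of two common schedule shapes. This is not code from the repo. The function names, the "linear" and "scaled_linear" schedule names, the 1000 training steps, and the example values are all assumptions I picked for illustration.

```python
import math
from typing import List


def linspace(start: float, end: float, num: int) -> List[float]:
    """Return `num` evenly spaced values from start to end (inclusive)."""
    if num < 2:
        raise ValueError("num must be at least 2")
    step = (end - start) / (num - 1)
    return [start + i * step for i in range(num)]


def make_betas(
    beta_start: float,
    beta_end: float,
    schedule: str = "linear",
    num_train_timesteps: int = 1000,  # assumption: illustrative step count
) -> List[float]:
    """Build a list of per-step noise variances (betas)."""
    if schedule == "linear":
        # Betas grow linearly from beta_start to beta_end.
        return linspace(beta_start, beta_end, num_train_timesteps)
    if schedule == "scaled_linear":
        # Interpolate linearly in sqrt space, then square each value.
        roots = linspace(math.sqrt(beta_start), math.sqrt(beta_end), num_train_timesteps)
        return [r * r for r in roots]
    raise ValueError(f"unknown schedule: {schedule}")


# Example values are assumptions, not defaults read from demo.py.
betas = make_betas(0.00085, 0.012, schedule="scaled_linear")
print(f"{len(betas)} betas, first={betas[0]:.6f}, last={betas[-1]:.6f}")
```

Both schedules start and end at the same values. They only differ in how quickly the noise level grows in between.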
However, you might run into issues if you already manage Python with a version manager, Anaconda, or something similar.
In that case, you can use Poetry to avoid messing things up and keep your Python dev environment clean.
Install Poetry
There are two ways to install Poetry:
- using pip
- using curl
Installation
https://python-poetry.org/docs/#installation
Create a project folder
$ poetry new poetry-stable-diffusion
Install packages
$ poetry add package_name@package_version
However, you don't need to do this. You can use the following pyproject.toml, which I have already tested.
I used Python 3.8.12 in this case.
If you don't have Python 3.8, I highly recommend installing it with [pyenv](https://github.com/pyenv/pyenv).
[tool.poetry]
name = "stablediffusion"
version = "0.1.0"
description = "test Stable Diffusion"
authors = ["koji"]
[tool.poetry.dependencies]
python = "^3.8"
numpy = "1.19.5"
transformers = "4.16.2"
diffusers = "0.2.4"
tqdm = "4.64.0"
openvino = "2022.1.0"
huggingface-hub = "0.9.0"
streamlit = "1.12.0"
watchdog = "2.1.9"
opencv-python = "4.5.2.54"
scipy = "1.6.1"
[tool.poetry.dev-dependencies]
[build-system]
requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"
All you need to do to set up the environment is run one command!
$ poetry install
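If you want to double-check the environment after installing, here is a small sketch that compares the running Python version and the installed package versions against the pins in the pyproject.toml above. The helper names and the idea of saving it as a script (for example a hypothetical check_env.py run with poetry run python check_env.py) are my own assumptions, not part of the repo.

```python
import sys
from importlib import metadata  # available in Python 3.8+
from typing import Callable, Dict, Optional, Tuple

# Pins copied from the pyproject.toml above.
PINS: Dict[str, str] = {
    "numpy": "1.19.5",
    "transformers": "4.16.2",
    "diffusers": "0.2.4",
    "tqdm": "4.64.0",
    "openvino": "2022.1.0",
    "huggingface-hub": "0.9.0",
    "streamlit": "1.12.0",
    "watchdog": "2.1.9",
    "opencv-python": "4.5.2.54",
    "scipy": "1.6.1",
}


def python_matches(version_info=sys.version_info, wanted: Tuple[int, int] = (3, 8)) -> bool:
    """True if the interpreter's major.minor equals the tested version (3.8)."""
    return tuple(version_info[:2]) == wanted


def installed_version(name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(
    pins: Dict[str, str],
    lookup: Callable[[str], Optional[str]] = installed_version,
) -> Dict[str, Tuple[str, Optional[str]]]:
    """Map each package whose installed version differs to (wanted, found)."""
    mismatches = {}
    for name, wanted in pins.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    if not python_matches():
        print(f"Python {sys.version.split()[0]} is not 3.8.x")
    for name, (wanted, found) in find_mismatches(PINS).items():
        print(f"{name}: wanted {wanted}, found {found}")
```

If the script prints nothing, the environment matches the pins.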
Clone repo
$ git clone https://github.com/bes-dev/stable_diffusion.openvino.git
$ cd stable_diffusion.openvino
Run demo.py
$ poetry run python demo.py --prompt "cyberpunk New York City"
generated image
Generating an image takes a few minutes (around 3 minutes in my case).
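Since each image takes a while, you may want to queue up several runs with different seeds and walk away. Below is a rough sketch that builds the same command as above, using the --seed and --output flags from the help text. The helper names and the output_<seed>.png file naming are assumptions of mine, and I am assuming --output takes a file path. By default it only prints the commands (dry run).

```python
import shlex
import subprocess
from typing import Iterable, List, Optional


def build_command(
    prompt: str,
    seed: Optional[int] = None,
    output: Optional[str] = None,
) -> List[str]:
    """Build the argument list for one demo.py run."""
    cmd = ["poetry", "run", "python", "demo.py", "--prompt", prompt]
    if seed is not None:
        cmd += ["--seed", str(seed)]
    if output is not None:
        cmd += ["--output", output]
    return cmd


def run_batch(prompt: str, seeds: Iterable[int], dry_run: bool = True) -> List[List[str]]:
    """Run (or just print) one demo.py command per seed, one after another."""
    commands = []
    for seed in seeds:
        # Hypothetical naming scheme so runs do not overwrite each other.
        cmd = build_command(prompt, seed=seed, output=f"output_{seed}.png")
        commands.append(cmd)
        if dry_run:
            print(shlex.join(cmd))
        else:
            # check=True stops the batch if one run fails.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    run_batch("cyberpunk New York City", seeds=[1, 2, 3])
```

Pass dry_run=False from inside the stable_diffusion.openvino folder to actually generate the images.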
My Mac specs
$ system_profiler SPHardwareDataType
Hardware:
Hardware Overview:
Model Name: MacBook Pro
Model Identifier: MacBookPro16,1
Processor Name: 8-Core Intel Core i9
Processor Speed: 2.3 GHz
Number of Processors: 1
Total Number of Cores: 8
L2 Cache (per Core): 256 KB
L3 Cache: 16 MB
Hyper-Threading Technology: Enabled
Memory: 16 GB
System Firmware Version: 1916.0.28.0.0 (iBridge: 20.16.365.5.4,0)
OS Loader Version: 564.40.2.0.1~4
Serial Number (system): C02CP2ESMD6Q
Hardware UUID: FFCE331E-4543-5DBE-8F98-E329E0A69F91
Provisioning UDID: FFCE331E-4543-5DBE-8F98-E329E0A69F91
Activation Lock Status: Disabled
Top comments (2)
Just set this up yesterday in fact!
On my Mac it takes about 3 minutes to generate an image. Brand new feature released now allows you to supply a starter image, and it now comes with a web interface to make things easier. Only 512px x 512px at the moment, but arguments for height and width are on his roadmap.
Considering how new this project is, he’s done an amazing job!
Yeah agree!