This article was co-authored by @makkukuma
Introduction
Artificial Intelligence (AI) is emerging as the next frontier of our technological society, and it is likely here to stay. One notable capability it offers is image generation. In this article, we will cover what AI image generation is and how it works, then walk through creating your own AI-generated images using Stable Diffusion on your local machine.
What AI-Generated Images Are and How AI Image Generation Works
AI-generated images are produced by artificial intelligence (AI) algorithms. These algorithms can generate new, realistic-looking images by learning patterns, styles, and characteristics from large datasets of existing images. Stable Diffusion works by progressively adding noise to an image over a set number of steps until the image becomes completely unrecognizable, then learning to reverse that process to reconstruct an image from noise. Text prompts are used to influence the image output.
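As a rough illustration, here is a toy sketch of the forward (noising) half of that process. The real model trains a neural network to run this in reverse, guided by the text prompt; the values here (the 4-"pixel" image, the `beta` noise rate, the step count) are purely illustrative.

```python
import math
import random

def add_noise(pixels, steps, beta=0.05, seed=0):
    """Forward diffusion: repeatedly mix a signal with Gaussian noise.

    After enough steps, the original 'image' is unrecognizable noise.
    """
    rng = random.Random(seed)
    x = list(pixels)
    for _ in range(steps):
        # Each step shrinks the signal slightly and blends in fresh noise.
        x = [math.sqrt(1 - beta) * p + math.sqrt(beta) * rng.gauss(0, 1)
             for p in x]
    return x

image = [0.9, 0.1, 0.5, 0.7]          # a tiny 4-"pixel" image
noisy = add_noise(image, steps=1000)  # essentially pure noise by now
```

Generation runs this idea backwards: start from pure noise and denoise step by step, nudged toward whatever the text prompt describes.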
The Prerequisites
Before we begin creating our first image, the following must be installed on your computer:
- GitHub Desktop / Git
- Python 3.10
Once you have downloaded these applications, you can now proceed with the next steps below.
Step 1 - Clone Stable Diffusion
Method 1: GitHub Desktop
Open GitHub Desktop on your computer and clone the following repository:
https://github.com/AUTOMATIC1111/stable-diffusion-webui
Choose a clone location on any drive that has at least 10 GB of free storage.
Method 2: Git
For the Git alternative, right-click the folder where you want to place Stable Diffusion, select “Git Bash Here”, then paste this into the CLI:
git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui
Step 2 - Download your preferred models
Models are the collected data, or ‘weights’, representing what the AI has learned from being trained on a particular set of images. For example, a realistic.ckpt (checkpoint file) trained on images of real life would create realistic images if you asked it to. However, if you asked it to create a 2D image or a Disney character, it could not do so properly, as it was never trained on that. Models are basically that one scene from The Matrix where Neo tells Morpheus that he now knows kung fu after simply downloading the knowledge. The same goes for our AI.
There are various models on the internet that can be downloaded free of charge. In our example, let’s use the default model of Stable Diffusion, called SD 2.1:
https://huggingface.co/webui/stable-diffusion-2-1/resolve/main/v2-1_768-ema-pruned-fp16.safetensors
You might need to wait a bit if you have slow internet speeds, as the model can be around 4.0 GB.
Step 3 - Renaming and moving your models
The downloaded model will have a long default filename, similar to the one in the download link above. Rename it to “model.ckpt” or “model.safetensors” (matching its extension). This lets the AI know that this is the default model. If you have more than one model, there is no need to rename them all; only the default one needs this name.
Copy or cut the renamed model and move it into the cloned Stable Diffusion repository, under:
…/models/Stable-diffusion
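If you prefer to script this step, the rename-and-move can be sketched as below. The function name and example paths are my own; adjust them to wherever you downloaded the model and cloned the webui.

```python
from pathlib import Path
import shutil

def install_model(downloaded: Path, models_dir: Path) -> Path:
    """Rename a checkpoint to the default 'model.*' name and move it
    into the webui's models/Stable-diffusion folder."""
    target = models_dir / ("model" + downloaded.suffix)  # keeps .ckpt or .safetensors
    models_dir.mkdir(parents=True, exist_ok=True)
    shutil.move(str(downloaded), str(target))
    return target

# Example (illustrative paths -- adjust to your machine):
# install_model(
#     Path.home() / "Downloads" / "v2-1_768-ema-pruned-fp16.safetensors",
#     Path("stable-diffusion-webui") / "models" / "Stable-diffusion",
# )
```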
Step 4 - Launching webui-user.bat
Launch webui-user.bat and wait for the dependencies to install themselves. Installation takes around 15 to 25 minutes at most. It may sometimes appear stuck, but it is not. Once finished, your local AI is ready!
Image Prompting and Creation
UI basics
Once the installation is finished, the webui will automatically open using your default browser. The UI should appear as shown:
The UI for Stable Diffusion contains the following:
- Stable Diffusion Checkpoint Selector (1)
This dropdown menu allows you to select the models you want to use for the image generation.
- Mode Option Nav bar (2)
txt2img - allows generation from text to image prompts
img2img - allows you to insert an image and alter or generate a new image based on the contents of that image.
Extras - contains multiple options for AI resizing, recolor, fixing, etc.
PNG Info - extracts the prompts from a generated image
Checkpoint Merger - merges multiple models into a single new model
Train - allows you to train your own models with your own image data inputs
Settings - contains extensive settings for different types of AI image generators
Extensions - lets you install community extensions that add extra functionality to the webui
- Positive and Negative Prompt Box (3)
Here is where you will put your text prompts.
- Generation Settings (4)
This area contains the settings for the generated image. (dimensions, sampling method, etc.)
- Output Box (5)
This area contains the generated image and some options on what to do with it.
- Start Generate / Generated Options Box (6)
This area lets you start generating. You can also insert “styles”: premade text prompts and settings that you have saved for easy reuse.
Prompt basics
Let us start by knowing what positive and negative prompts are.
As seen in the UI, the first prompt box is for “Positive” prompts, which describe what you want the generated image to contain. The “Negative” prompt box is for text prompts describing what you want to see less of in the generated image.
For example:
Positive prompts : HD, masterpiece, best quality, high quality, (sunrise at sea), (waves), birds flying in the distance, morning
Negative prompts : lowres, low quality, incorrect landscape, [[deformed birds]]
Parentheses indicate that you want a prompt to carry more weight than usual: in this webui, each pair multiplies the prompt’s attention weight by roughly 1.1, so three pairs give about 1.1^3 ≈ 1.33 times the standard emphasis. Square brackets do the opposite, dividing the weight by roughly 1.1 per pair, so they make a prompt appear less. (Curly braces are NovelAI syntax and are not interpreted as emphasis by this webui.)
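To see how nested emphasis compounds, here is a small sketch assuming the webui’s roughly-1.1 factor per nesting level (the function name is mine, and this ignores the webui’s explicit `(prompt:1.5)` weight syntax):

```python
def emphasis_weight(token: str) -> float:
    """Approximate attention weight for a nested-emphasis token:
    each '(...)' pair multiplies by ~1.1, each '[...]' pair divides by ~1.1."""
    up = down = 0
    s = token
    while s.startswith("(") and s.endswith(")"):
        s = s[1:-1]
        up += 1
    while s.startswith("[") and s.endswith("]"):
        s = s[1:-1]
        down += 1
    return round(1.1 ** up / 1.1 ** down, 3)

emphasis_weight("(((waves)))")         # three pairs: 1.1^3 ≈ 1.331
emphasis_weight("[[deformed birds]]")  # two bracket pairs: ≈ 0.826
```

So emphasis grows (or shrinks) multiplicatively, not by the flat “3x” you might expect from three parentheses.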
Example
Finally, on to image generation. We will use the example prompts above to generate our first “hello world” image.
- Go to the checkpoint selector dropdown menu and select the default model (model.safetensors/model.ckpt)
- Insert our positive and negative prompts into their respective input boxes.
- Select a sampling method. There are many methods in the menu, but we will simply use Euler a, or whichever default method is currently selected in the dropdown.
- Increase the sampling steps to around 100 to 150. Sampling steps are the number of denoising iterations the AI performs on the image, refining it further toward the desired prompts. We will be using 150.
- Set your desired resolution. The higher the resolution, the more time and resources generation takes. The gold standard in AI generation is 512x512, as seen on online image generators. We will be using a 1032x784 resolution.
- Set CFG scale to 7-15. The CFG (classifier-free guidance) scale is a parameter controlling how closely the AI follows the prompt. The lower it is, the more the image deviates from the prompt; the higher it is, the more literally the prompt is followed.
- Set Seed to -1. This is similar to the seed system used in procedurally generated game maps. Setting the seed to -1 picks a random seed each time, while setting a specific seed gives you a similar, sometimes exact, image for the same prompt and settings.
- Go to Settings > Stable Diffusion, check the tick-box for “Upcast cross attention layer to float32”, and set the Clip Skip slider to 2.
- Start the generation process!
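Incidentally, the same settings can also be submitted programmatically. The webui exposes a JSON API when webui-user.bat is launched with the --api commandline flag; the sketch below only builds and sends the request, and assumes the webui is running at its default local address.

```python
import json
import urllib.request

# The settings from the steps above, as a payload for the webui's
# /sdapi/v1/txt2img endpoint (requires launching with --api).
payload = {
    "prompt": ("HD, masterpiece, best quality, high quality, "
               "(sunrise at sea), (waves), birds flying in the distance, morning"),
    # square brackets de-emphasize a prompt in this webui
    "negative_prompt": "lowres, low quality, incorrect landscape, [[deformed birds]]",
    "sampler_name": "Euler a",
    "steps": 150,
    "width": 1032,
    "height": 784,
    "cfg_scale": 7,
    "seed": -1,
}

def generate(url="http://127.0.0.1:7860/sdapi/v1/txt2img"):
    """POST the payload to the running webui; the JSON response
    contains the generated images as base64 strings."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# generate()  # uncomment with the webui running in --api mode
```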
The generated output should look something like this:
Below the image are the settings used for that generation. You can copy the exact seed and settings and generate the image on your machine to create the exact same thing shown above!
You now have your very own AI-generated image on your machine. Unlike its online counterparts, you can take the output image you just prompted and refine it further in the img2img section of the webui. Feel free to experiment with the settings, different sampling methods and prompts, or download different models to generate your own images!