Furkan Gözükara

Generate Photos Of Yourself In Different Cities & Different Fancy Suits With SDXL DreamBooth Training For Free

To train this model, I used the Stable Diffusion XL (SDXL) 1.0 base model. I still find the base model best for realism.

I used the configuration I currently share here. It was discovered after 120 different full-training experiments: https://www.patreon.com/posts/very-best-for-of-89213064

Hopefully I will make a full, comprehensive tutorial on DreamBooth training and show all the settings.

Kohya SS GUI is used for training. If you don't know how to install and use Kohya SS GUI, here is the tutorial you need > https://youtu.be/sBFGitIvD2A

And here is the tutorial that shows how to use the shared configuration on both Windows and RunPod > https://www.youtube.com/watch?v=EEV8RPohsbw


If you are not currently my Patreon supporter, I have a full public tutorial for SDXL LoRA training that requires 12 GB VRAM > https://youtu.be/sBFGitIvD2A

If you don't have a strong GPU and PC, you can use a free Kaggle notebook for SDXL DreamBooth training > https://youtu.be/16-b1AjvyBE

So I used the below images as the training dataset.

This training dataset is at best medium quality because it has repeating backgrounds, repeating clothing, and no full-body poses. All of these cause overtraining and reduced generalization.

Then, for the training configuration, I did the following.

When training with regularization images, I did 1 epoch with 150 repeats, so more regularization images are used, and I saved a checkpoint every 30 repeats. To set that up, calculate the number of steps after which a checkpoint should be saved. It is easy: 15 * 30 * 2 + 1, that is, 15 training images × a 30-repeat save interval × 2 (because training with regularization images doubles the step count), plus 1.
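As a sanity check, here is a minimal Python sketch of that arithmetic. The numbers (15 training images, a 30-repeat save interval, the ×2 from regularization images) come from this post; batch size 1 is my assumption:

```python
# Sketch: after how many steps Kohya should save a checkpoint.
# Assumes batch size 1; with regularization images enabled, every
# training-image step is paired with a regularization-image step,
# which doubles the total step count.
train_images = 15        # number of training images in this post
save_every_repeats = 30  # save a checkpoint every 30 repeats
reg_multiplier = 2       # regularization images double the steps

save_every_n_steps = train_images * save_every_repeats * reg_multiplier + 1
print(save_every_n_steps)  # 901
```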

The images below were generated from epoch 150, so the model is a little bit overtrained.

As the regularization dataset, I used ultra-high-quality, manually picked real Unsplash images of men. This means they were ground-truth regularization images, like the data Stable Diffusion was originally trained on.

You can download the dataset from > https://www.patreon.com/posts/massive-4k-woman-87700469

There are 5200 images for both man and woman, perfect quality, cropped and resized to many common resolutions.

All training and regularization images were 1024x1024 pixels.
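For reference, Kohya SS GUI reads the repeat count from the folder name, using the convention `{repeats}_{token class}`. Here is a minimal sketch of a layout matching the numbers in this post (the base path and folder structure are illustrative, not taken from the original):

```python
from pathlib import Path

# Kohya SS folder convention: "{repeats}_{token class}".
# 150 repeats of the training images, 1 repeat of the regularization set.
base = Path("training")
(base / "img" / "150_ohwx man").mkdir(parents=True, exist_ok=True)  # training images
(base / "reg" / "1_man").mkdir(parents=True, exist_ok=True)         # regularization images
(base / "model").mkdir(parents=True, exist_ok=True)                 # checkpoint output
```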

I used dynamic prompting so suits in multiple colors are generated.

The prompts used are below.

Positive

ohwx:0.98 man slightly smiling wearing an expensive {red|blue|white|black|brown|gold} suit , photoshoot in a sunny day in a city , hd, hdr, uhd, 2k, 4k, 8k

Negative

sunglasses, illustration, 3d, 2d, painting, cartoons, sketch, squinting eyes, blurry
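The `{red|blue|...}` part of the positive prompt is wildcard syntax from the Dynamic Prompts extension: each generation picks one of the listed options. Here is a minimal Python sketch of that behavior (an illustration only, not the extension's actual code):

```python
import random
import re

prompt = ("ohwx:0.98 man slightly smiling wearing an expensive "
          "{red|blue|white|black|brown|gold} suit , photoshoot in a sunny day "
          "in a city , hd, hdr, uhd, 2k, 4k, 8k")

def expand(template: str) -> str:
    # Replace every {a|b|c} group with one randomly chosen option.
    return re.sub(r"\{([^{}]+)\}",
                  lambda m: random.choice(m.group(1).split("|")),
                  template)

print(expand(prompt))  # e.g. "... wearing an expensive gold suit , ..."
```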

In After Detailer (ADetailer), the following settings were used.

Prompt

photo of ohwx man slightly smiling

0.7 denoising strength and 29 steps. The rest are default.

Below is the full PNG info.

parameters

ohwx:0.98 man slightly smiling wearing and wearing an expensive gold suit , photoshoot in a sunny day in a city , hd, hdr, uhd, 2k, 4k, 8k,
Negative prompt: sunglasses, illustration, 3d, 2d, painting, cartoons, sketch, squinting eyes, blurry
Steps: 20, Sampler: DPM++ 2M SDE Karras, CFG scale: 7, Seed: 744692608, Size: 1024x1024, Model hash: eef545047f, Model: me15img_150repeat, ADetailer model: face_yolov8n.pt, ADetailer prompt: photo of ohwx man slightly smiling, ADetailer confidence: 0.3, ADetailer mask only top k largest: 1, ADetailer dilate erode: 4, ADetailer mask blur: 4, ADetailer denoising strength: 0.7, ADetailer inpaint only masked: True, ADetailer inpaint padding: 32, ADetailer use separate steps: True, ADetailer steps: 29, ADetailer version: 23.11.1, Version: v1.7.0-133-gde03882d
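If you prefer scripting over the web UI, the same generation can be reproduced through the Automatic1111 API. The sketch below posts to the `/sdapi/v1/txt2img` endpoint with the parameters from the PNG info above; the ADetailer argument names (`ad_model`, `ad_prompt`, ...) follow the extension's API as I understand it and may differ between versions:

```python
import requests

payload = {
    "prompt": ("ohwx:0.98 man slightly smiling wearing an expensive gold suit , "
               "photoshoot in a sunny day in a city , hd, hdr, uhd, 2k, 4k, 8k"),
    "negative_prompt": ("sunglasses, illustration, 3d, 2d, painting, cartoons, "
                        "sketch, squinting eyes, blurry"),
    "steps": 20,
    "sampler_name": "DPM++ 2M SDE Karras",
    "cfg_scale": 7,
    "seed": 744692608,
    "width": 1024,
    "height": 1024,
    # ADetailer runs as an "alwayson" script; these argument names are
    # assumptions based on the extension's API and may vary by version.
    "alwayson_scripts": {
        "ADetailer": {
            "args": [{
                "ad_model": "face_yolov8n.pt",
                "ad_prompt": "photo of ohwx man slightly smiling",
                "ad_confidence": 0.3,
                "ad_denoising_strength": 0.7,
                "ad_use_steps": True,
                "ad_steps": 29,
            }]
        }
    },
}

r = requests.post("http://127.0.0.1:7860/sdapi/v1/txt2img", json=payload)
r.raise_for_status()
images_base64 = r.json()["images"]  # base64-encoded PNGs
```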

And here are the results.
