Pushpendra Singh || IIT BHU

Convert Stable Diffusion to a TensorFlow Lite model.

Introduction:

In the near future, the metaverse is going to create a billion-dollar market, and almost every giant MNC and visionary startup is working to leverage it. NFTs will be an inevitable part of the metaverse. Have you heard of Stable Diffusion, one of the most famous machine learning algorithms, which is used to create digital artworks and NFTs?

In this article, I am going to tell you about Stable Diffusion and how to convert it to a TensorFlow Lite model.

Let’s get started.

Table of contents:

  • What is Stable Diffusion?
  • Why convert Stable Diffusion to TensorFlow Lite?
  • Converting Stable Diffusion to TensorFlow Lite.
  • Inferencing the Stable Diffusion TensorFlow Lite model.
  • Further optimizations.

What is Stable Diffusion?

Ever wondered what “an astronaut riding a horse on Mars” may look like? Me neither…

Let’s find out.

(Image: an astronaut riding a horse on the planet Mars.)

Oops… how will the horse breathe on Mars without an astronaut suit? It’s a Martian horse and it can survive on Mars 😜😜.

Stable Diffusion is a diffusion-based machine learning algorithm that generates realistic images from text. It works both as text-to-image and image-to-image. Most of the recent AI art on the internet has been generated with the Stable Diffusion model. Since it is an open-source tool, anyone can easily create fantastic art illustrations from just a text prompt. You can play around with Stable Diffusion by going to my repository, here.
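If you want a quick taste of text-to-image generation in TensorFlow, here is a minimal sketch using the KerasCV implementation of Stable Diffusion. This is an illustrative assumption, not necessarily how my repository wires things up:

```python
import keras_cv
from PIL import Image

# Minimal sketch using KerasCV's Stable Diffusion implementation
# (assumed here for illustration; the repository may differ).
model = keras_cv.models.StableDiffusion(img_width=512, img_height=512)

# Generate an image from a text prompt.
images = model.text_to_image(
    "an astronaut riding a horse on mars",
    batch_size=1,
)
Image.fromarray(images[0]).save("astronaut_on_mars.png")
```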

It is a very heavy model, and running it on a CPU takes a lot of time, which is why we prefer running it on a GPU.

The latest Stable Diffusion 2.0 offers a lot of new functionality. You can explore it here.

Why convert Stable Diffusion to TensorFlow Lite?

Suppose we want to generate images using Stable Diffusion at a remote location where no internet is available. Or suppose you want to reduce the runtime of Stable Diffusion. How will you do that?

The answer: convert the model to TensorFlow Lite. TF Lite optimizes existing models so they consume less memory and compute. TensorFlow Lite models enable edge inferencing that does not rely on an internet connection, and mobile-friendly deployment becomes much easier.
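To make this concrete, here is a minimal, generic sketch of the TFLite conversion API with dynamic-range quantization. It uses a toy Keras model, not the Stable Diffusion pipeline yet:

```python
import tensorflow as tf

# Toy Keras model, only to illustrate the conversion API.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(32,)),
    tf.keras.layers.Dense(10),
])

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # dynamic-range quantization
tflite_model = converter.convert()

# The resulting flatbuffer is what gets deployed to the edge device.
with open("model.tflite", "wb") as f:
    f.write(tflite_model)
```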

Our main reason at Qolaba.io is to convert Stable Diffusion to TensorFlow Lite and then use the WasmEdge runtime to run it as a WebAssembly program on a variety of platforms, including desktop and mobile devices, servers, and browsers.

WasmEdge is a lightweight, high-performance, and extensible WebAssembly runtime for cloud-native, edge, and decentralized applications. It powers serverless apps, embedded functions, microservices, smart contracts, and IoT devices. WasmEdge is currently a CNCF (Cloud Native Computing Foundation) Sandbox project.

WasmEdge supports a variety of programming languages, including Rust, C, C++, and Python, and can be used as a standalone runtime or embedded into other programs.

Overall, the WasmEdge runtime provides a secure, portable, and efficient platform for running WebAssembly code, with the flexibility to use the programming language of your choice.

Okay… so the main question arises: how do we convert a Stable Diffusion model to TensorFlow Lite?

Converting Stable Diffusion to TensorFlow Lite.

Just follow the steps given below…

1. Clone the repository.

git clone https://github.com/pushpendra910/Stable-Diffusion.git

2. Setting up the GPU.

Make sure you have set up your GPU. To set up the GPU you need to install TensorFlow, CUDA, and cuDNN. I recommend you check the versions of the libraries and Python that you are using so that there is no version mismatch; otherwise your code will not run on the GPU. To check the tested version combinations, go to this link.

2.1 GPU already set up.
If you have already set up your GPU, run the following command in the terminal and move to step 3.

pip install -r requirements_without_tf.txt

2.2 Set up TensorFlow and the GPU.

To install TensorFlow and the other libraries, create a new environment with Python 3.7 or above and run the command below. We will be working with TensorFlow==2.10.0.

pip install -r requirements.txt

To complete your GPU setup, run the commands.

pip install tensorflow tensorflow_addons ftfy --upgrade --quiet
sudo apt install --allow-change-held-packages libcudnn8=8.1.0.77-1+cuda11.2
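To confirm that TensorFlow can actually see the GPU after installation, a quick check:

```python
import tensorflow as tf

# If the CUDA/cuDNN versions match your TensorFlow build,
# at least one GPU device should be listed here.
print(tf.__version__)
print(tf.config.list_physical_devices("GPU"))
```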

3. Instantiate the Stable Diffusion model and save it in a folder.

  • Go to the terminal and run the following command.
python main.py
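I won't reproduce the repository's main.py here, but conceptually the idea is to build the Stable Diffusion pipeline once and save its sub-networks as TensorFlow SavedModels. A minimal sketch, assuming the KerasCV implementation (names and paths are illustrative):

```python
import keras_cv

# Build the pipeline once; the weights are downloaded automatically.
pipeline = keras_cv.models.StableDiffusion(img_width=512, img_height=512)

# Save the three sub-networks as SavedModels for later conversion.
# (Folder names are hypothetical, chosen only for this sketch.)
pipeline.text_encoder.save("saved_models/text_encoder")
pipeline.diffusion_model.save("saved_models/diffusion_model")
pipeline.decoder.save("saved_models/decoder")
```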

4. Now convert your saved models to TensorFlow Lite models.

  • Go to the terminal and run the command.
python to-tflite.py
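Conceptually, the conversion step iterates over the SavedModels produced in step 3 and writes a .tflite file for each. A hedged sketch (folder and file names are illustrative; large models may need Select TF ops for operations the TFLite builtins don't cover):

```python
import tensorflow as tf

for name in ["text_encoder", "diffusion_model", "decoder"]:
    converter = tf.lite.TFLiteConverter.from_saved_model(f"saved_models/{name}")
    converter.target_spec.supported_ops = [
        tf.lite.OpsSet.TFLITE_BUILTINS,  # standard TFLite ops
        tf.lite.OpsSet.SELECT_TF_OPS,    # fall back to TF ops where needed
    ]
    tflite_model = converter.convert()
    with open(f"{name}.tflite", "wb") as f:
        f.write(tflite_model)
```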

Very good, you have converted Stable Diffusion to TensorFlow Lite. It’s time to run inference with it.

Inferencing the Stable Diffusion TensorFlow Lite model.

  • Go to the terminal and run the following command.
python inferencing_tflite.py
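Under the hood, inference with a .tflite file goes through the tf.lite.Interpreter API. The repository's script handles this for you; the sketch below only illustrates the general pattern (model path, shapes, and dtypes are illustrative):

```python
import numpy as np
import tensorflow as tf

# Load one of the converted models (hypothetical file name).
interpreter = tf.lite.Interpreter(model_path="text_encoder.tflite")
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Feed dummy data matching the model's expected input signature.
for detail in input_details:
    dummy = np.zeros(detail["shape"], dtype=detail["dtype"])
    interpreter.set_tensor(detail["index"], dummy)

interpreter.invoke()
output = interpreter.get_tensor(output_details[0]["index"])
print(output.shape)
```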

You will now notice that inferencing takes a lot of time even on a machine with a GPU. This is simply because the TensorFlow Lite models are not running on the GPU: TensorFlow Lite can use GPUs and other specialized processors only through hardware drivers called delegates, and the TFLite GPU delegate currently supports only Android and iOS GPUs.

Another way to reduce the inference time is to run the TensorFlow Lite models on a multi-core CPU, which takes noticeably less time.
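If you are driving the tf.lite.Interpreter directly, you can ask it to use several CPU threads (a small sketch; the file name and thread count are illustrative):

```python
import tensorflow as tf

# Let the TFLite interpreter use multiple CPU threads; tune num_threads
# to the number of physical cores on your machine.
interpreter = tf.lite.Interpreter(
    model_path="diffusion_model.tflite",
    num_threads=8,
)
interpreter.allocate_tensors()
```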

Further Optimizations.

Now we have successfully used a Stable Diffusion model in TensorFlow Lite format, but the inference time is high, so we need to reduce it. For that purpose, we can try:

  • Increasing the number of GPU/CPU cores.

  • Running the code on an Android/iOS GPU via TFLite delegates.

  • Converting our model to ONNX and then using onnxruntime-gpu (see the sketch below).
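For the ONNX route, a hedged sketch: convert the SavedModel with a tool such as tf2onnx, then run it on the GPU through onnxruntime-gpu (file names are illustrative):

```python
# Hypothetical conversion command (run in a shell), assuming tf2onnx is installed:
#   python -m tf2onnx.convert --saved-model saved_models/decoder --output decoder.onnx

import onnxruntime as ort

# Run the converted model through onnxruntime-gpu; it falls back to the
# CPU provider if CUDA is not available.
session = ort.InferenceSession(
    "decoder.onnx",
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)
print([inp.name for inp in session.get_inputs()])
print(session.get_providers())
```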

Once the inference time is low enough, we can use the WasmEdge runtime.

Try these suggestions on your own and do let me know about your progress or any problem you face on LinkedIn, Instagram, or Twitter.

Okay, that’s what I had for now. See you in another exciting article. Bye Bye….🙋‍♂️

Thank you 🙏
