Host a serverless stable diffusion image search Bot on Vercel

#serverless #python #stablediffusion #api

In my previous post, Host a Serverless Flask App on Vercel, I showed how we can deploy a simple Flask app on Vercel by using Serverless Functions. In this post, I am going to show how to host a Telegram bot on Vercel that searches for AI-generated images on Lexica and sends them to the user.

Lexica is a search engine and art gallery for artwork made by Stable Diffusion, one of the more popular AI art models. Lexica claims to have indexed over ten million Stable Diffusion images, so there's a good chance that someone has already explored what you're looking for, or something similar. If you go to lexica.art in your browser, you can scroll down to see recently added art. Clicking on an image shows the full prompt that was used to generate it. I recommend checking out Lexica and playing with the website to get a feel for what we are building.

Prerequisites

Before we begin, you should have the following things set up:

A Vercel account. You can sign up for a free account at vercel.com.
The Vercel CLI (Command Line Interface) installed on your computer. You can install the Vercel CLI by running the following command in your terminal.

npm install -g vercel

A Telegram Bot Account.
- Go to Telegram and in the search bar enter BotFather.
- Click on BotFather and then on the Start button.
- Send /newbot to BotFather and follow the prompts to enter the bot's name and username.
- BotFather will reply with a message containing the token used to access your bot. Save the token carefully and don't share it with anyone.

Interacting with Lexica's API.

Lexica has this amazing API where one can send a simple GET request with a prompt as a parameter and it will return a JSON object containing an array of 50 images closely matching the prompt. You can read more about their API at lexica.art/docs.

For example, if you visit lexica.art/api/v1/search?q=apples in your browser, you will get back a JSON response containing an array of images that match "apples". To explore the structure of the JSON response, you can use a tool such as Postman or JSONHero.

Inside the images attribute of the JSON response is an array of 50 images. Further, each image has its own attributes, like src, height, width, prompt, etc. We are just interested in src and prompt for an image.

Suppose the JSON response is stored in a variable called resp. We can get the URL of the first image with resp["images"][0]["src"] and its prompt with resp["images"][0]["prompt"].
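As a rough sketch of the lookup described above, the snippet below builds the search URL and pulls src and prompt out of a parsed response. The helper names (build_search_url, first_image) and the sample data are made up for illustration; only the endpoint and the images, src and prompt fields come from the description above. It uses the standard library only, so it runs without installing anything.

```python
from urllib.parse import urlencode

LEXICA_SEARCH = "https://lexica.art/api/v1/search"


def build_search_url(prompt):
    # urlencode escapes spaces and symbols so the prompt is safe inside a URL
    return LEXICA_SEARCH + "?" + urlencode({"q": prompt})


def first_image(resp):
    # resp is the parsed JSON response; returns (src, prompt) of the first image
    image = resp["images"][0]
    return image["src"], image["prompt"]


# Hypothetical, hand-written stand-in for a real response (values are not real)
sample_resp = {
    "images": [
        {
            "src": "https://example.com/apple.jpg",
            "prompt": "a red apple",
            "width": 512,
            "height": 512,
        },
    ]
}

print(build_search_url("red apples"))
print(first_image(sample_resp))
```

Fetching the URL returned by build_search_url and parsing the body as JSON would give you a real resp to pass to first_image.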

Create a local Flask app for testing.

You can create a local Flask app and use ngrok or Cloudflare tunnels for port forwarding to get a public URL. I'm not going to cover that here, because installing and configuring them is beyond the scope of this article.
Alternatively, you can use replit.com to get a live public URL during testing. It is perfectly fine to host the bot entirely on Replit, but we are not doing so because Repls sleep after a period of inactivity and therefore have a long cold start (waiting time) when a new request comes in.
You can fork my Flask starter on Replit and follow my steps.
We need to set up a webhook so that Telegram forwards every message sent to our bot to our Replit server.
Edit the following link, replacing the placeholders with the bot token that you got from BotFather and the URL of your Replit/ngrok website.
https://api.telegram.org/bot<Your Bot Token>/setWebhook?url=<URL that you got from Replit>
After entering the link in your web browser, you will get the following response:

{"ok":true, "result" : true, "description" : "Webhook was set"}
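If you would rather not edit the link by hand, the URL can be assembled in Python. This is only a sketch: build_set_webhook_url is a hypothetical helper, and the token and address in the example are placeholders.

```python
from urllib.parse import urlencode


def build_set_webhook_url(token, public_url):
    # Bot API methods live under https://api.telegram.org/bot<token>/<method>
    query = urlencode({"url": public_url})  # escapes ":" and "/" in the address
    return f"https://api.telegram.org/bot{token}/setWebhook?{query}"


# Placeholder values, not a real token or server
print(build_set_webhook_url("123:ABC", "https://example.repl.co/"))
```

Opening the printed URL in a browser has the same effect as opening the hand-edited link.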

To see the request that Telegram sends to the webhook server, let us print its JSON body to the console.
Replace the contents of main.py in your Replit with the following code.

from flask import Flask
from flask import request
from flask import Response

app = Flask(__name__)

@app.post('/')
def index():
    msg = request.get_json()
    print(msg)
    return Response('OK', status=200)


app.run('0.0.0.0', 8080)

You can get the link to your Telegram bot from BotFather's message, where it says: Congratulations on your new bot. You will find it at t.me/YourBotLink. Click on the link and then on the Start button. Send anything to your bot, such as "Hi" or "Hello".
Now go back to Replit, and the Flask server will have printed a JSON object in the console.
We can see the text of the message in the object's message -> text attribute and the chat ID of the user in message -> chat -> id.
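To make those two lookups concrete, here is a small sketch that pulls the chat ID and text out of an update. The sample update is trimmed and its values are invented; extract_chat_and_text is a hypothetical helper, and returning None for updates without text (for example a photo or a sticker) is my own assumption about how you might want to handle them.

```python
def extract_chat_and_text(update):
    # Returns (chat_id, text), or None when the update carries no text message
    message = update.get("message")
    if not message or "text" not in message:
        return None
    return message["chat"]["id"], message["text"]


# Trimmed, made-up example of what print(msg) might show
sample_update = {
    "update_id": 1,
    "message": {
        "message_id": 10,
        "chat": {"id": 42, "type": "private"},
        "text": "Hi",
    },
}

print(extract_chat_and_text(sample_update))
print(extract_chat_and_text({"update_id": 2}))
```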
Now let us try sending something from the bot back to the user. Change the main.py file to:

from flask import Flask
from flask import request
from flask import Response
import requests

TOKEN = "Your API TOKEN from BotFather"
app = Flask(__name__)

def sendMessage(chat_id, text):
    url = f'https://api.telegram.org/bot{TOKEN}/sendMessage'
    payload = {'chat_id': chat_id, 'text': text}
    r = requests.post(url, json=payload)
    return r

@app.post('/')
def index():
    msg = request.get_json()
    chat_id = msg['message']['chat']['id'] 
    txt = msg['message']['text']
    sendMessage(chat_id, 'You have typed: ' + txt)
    return Response('ok', status=200)

app.run('0.0.0.0', 8080)

sendMessage is a simple function that sends a message to the user using the /sendMessage method. Though not required for this tutorial, you can read the Telegram Bot API documentation to find more methods, such as those for sending videos, files, etc.
Now send a new message to your bot from Telegram, like "Hi". Your bot should automatically reply with "You have typed: Hi".
Similarly, the Telegram Bot API documentation shows that we can send images to the user using the /sendMediaGroup method. We get the images by making a GET request to Lexica's API with the prompt that the user enters, and then send the first five images from that response to Telegram. Finally, change main.py to:

import os
import json
import requests
from flask import Flask
from flask import request
from flask import Response

app = Flask(__name__)
TOKEN = "Your API TOKEN from BotFather"

def imageAsDict(imageURL, caption):
    return {
        "type": "photo",
        "media": imageURL,
        "caption": caption,
    }


def sendMediaGroup(chatid, allImages):
    url = f"https://api.telegram.org/bot{TOKEN}/sendMediaGroup"
    # Slice rather than index so fewer than 5 results cannot raise IndexError
    media = [imageAsDict(image["src"], image["prompt"]) for image in allImages[:5]]
    payload = {"chat_id": chatid, "media": media}
    r = requests.post(url, json=payload)
    return r

def sendMessage(chat_id, text):
    url = f"https://api.telegram.org/bot{TOKEN}/sendMessage"
    payload = {"chat_id": chat_id, "text": text}
    r = requests.post(url, json=payload)
    return r

@app.post("/")
def index():
    msg = request.get_json()
    chat_id = msg["message"]["chat"]["id"]
    inputText = msg["message"]["text"]
    if inputText == "/start":
        sendMessage(chat_id, "I am Online. You can send me a Prompt")
    else:
        # Pass the prompt via params so requests URL-encodes spaces and symbols
        BASE_URL = "https://lexica.art/api/v1/search"
        response = requests.get(BASE_URL, params={"q": inputText})
        response_text = json.loads(response.text)
        allImages = response_text["images"]
        sendMediaGroup(chat_id, allImages)
    return Response("ok", status=200)

app.run('0.0.0.0', 8080)

Let's go through the code and understand how it works:
- First, the variable inputText stores the text that the user sends to our Telegram bot. When a user clicks Start, Telegram sends the string "/start", so in that case we just reply, "I am Online. You can send me a Prompt".
- In the other case, we make a GET request to Lexica's API, searching for the prompt, and parse the JSON for the first 5 image URLs.
- We use the sendMediaGroup method of Telegram's API to send the 5 photos as one group, with each photo described by a Python dictionary (built by imageAsDict).
You can test your Telegram bot by sending it any text. If everything went well up to this point, your bot will return the top 5 images from Lexica's search results.
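The payload built in sendMediaGroup can be sketched on its own, which makes it easier to check without calling Telegram. Two details in this sketch are my additions rather than part of the code above: captions are cut to 1024 characters, the caption limit in the Telegram Bot API, since Lexica prompts can be long; and the list is capped by slicing, so a short result list does not raise an error. build_media and the sample images are hypothetical.

```python
MAX_CAPTION = 1024  # caption length limit in the Telegram Bot API


def build_media(all_images, limit=5):
    # Turn up to `limit` Lexica results into sendMediaGroup "media" entries
    return [
        {
            "type": "photo",
            "media": image["src"],
            "caption": image["prompt"][:MAX_CAPTION],
        }
        for image in all_images[:limit]
    ]


# Made-up results standing in for response_text["images"]
sample_images = [
    {"src": f"https://example.com/{i}.jpg", "prompt": "apple " * 300}
    for i in range(7)
]

media = build_media(sample_images)
print(len(media), len(media[0]["caption"]))
```

Keep in mind that sendMediaGroup expects between 2 and 10 items, so a search that returns a single image would need a different method, such as sendPhoto.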

Deploying the app to Vercel.

Open up your favourite code editor and create the following files and folders:

project
├───vercel.json
├───requirements.txt
└───api
     └───app.py

In the app.py file, copy the code from Replit, leaving out the final app.run(...) line as in the listing below. For your reference, I have written it again here.

import os
import json
import requests
from flask import Flask
from flask import request
from flask import Response

app = Flask(__name__)
TOKEN = "Your API TOKEN from BotFather"

def imageAsDict(imageURL, caption):
    return {
        "type": "photo",
        "media": imageURL,
        "caption": caption,
    }


def sendMediaGroup(chatid, allImages):
    url = f"https://api.telegram.org/bot{TOKEN}/sendMediaGroup"
    # Slice rather than index so fewer than 5 results cannot raise IndexError
    media = [imageAsDict(image["src"], image["prompt"]) for image in allImages[:5]]
    payload = {"chat_id": chatid, "media": media}
    r = requests.post(url, json=payload)
    return r

def sendMessage(chat_id, text):
    url = f"https://api.telegram.org/bot{TOKEN}/sendMessage"
    payload = {"chat_id": chat_id, "text": text}
    r = requests.post(url, json=payload)
    return r

@app.post("/")
def index():
    msg = request.get_json()
    chat_id = msg["message"]["chat"]["id"]
    inputText = msg["message"]["text"]
    if inputText == "/start":
        sendMessage(chat_id, "Ya, I am Online. Send me a Prompt")
    else:
        # Pass the prompt via params so requests URL-encodes spaces and symbols
        BASE_URL = "https://lexica.art/api/v1/search"
        response = requests.get(BASE_URL, params={"q": inputText})
        response_text = json.loads(response.text)
        allImages = response_text["images"]
        sendMediaGroup(chat_id, allImages)
    return Response("ok", status=200)

(In the above code, I have hard-coded the value of the Telegram token, which shouldn't be done in general. You can use environment variables instead and edit them in the Project settings on Vercel.)
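A minimal sketch of reading the token from the environment instead. The variable name TELEGRAM_TOKEN is my own choice, not something Vercel or Telegram requires; use whatever name you set in the project settings. load_token is likewise a hypothetical helper.

```python
import os


def load_token(name="TELEGRAM_TOKEN"):
    # Fail early with a clear message if the variable was never configured
    token = os.environ.get(name)
    if not token:
        raise RuntimeError(f"Environment variable {name} is not set")
    return token


# For a local demo only: pretend the variable was configured
os.environ["TELEGRAM_TOKEN"] = "123:ABC"
print(load_token())
```

In app.py you would then write TOKEN = load_token() in place of the hard-coded string.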

In requirements.txt, write:

Flask
requests

Vercel provides the Web Server Gateway Interface (WSGI) automatically at runtime.

The vercel.json file contains the configuration for this project.

A Sample Configuration for Flask is:

{
  "routes": [
    {
      "src": "/(.*)",
      "dest": "api/app.py"
    }
  ]
}

The above configuration routes every incoming request, whatever its path, to the Flask server written in app.py.

Now deploy the Flask app to Vercel by running:

vercel deploy --prod

Follow the prompts, and Vercel will build and deploy your app and give you a URL where you can access it. You can also find your app's URL in your Vercel dashboard.

The URL for my Flask app, which I have made using the above steps, is stable-diffusin-telegram-deployed.vercel.app.

Changing the Server/Webhook on Telegram.

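As we did earlier for Replit, edit the following link with your bot token and the URL that you got from Vercel:

https://api.telegram.org/bot<Your Bot Token>/setWebhook?url=<URL that you got from Vercel>

After you open the above URL in a browser, your Flask server on Vercel will respond to all messages sent to your Telegram bot.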
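To confirm that the switch worked, the Bot API also has a getWebhookInfo method that reports the currently registered URL. The sketch below builds that link and checks a parsed response; the helper names and the sample response are hypothetical, and only a couple of the response fields are shown.

```python
def build_webhook_info_url(token):
    return f"https://api.telegram.org/bot{token}/getWebhookInfo"


def webhook_points_to(info, expected_url):
    # info is the parsed JSON returned by getWebhookInfo
    return info.get("ok", False) and info["result"].get("url") == expected_url


# Made-up response for illustration
sample_info = {
    "ok": True,
    "result": {"url": "https://example.vercel.app/", "pending_update_count": 0},
}

print(build_webhook_info_url("123:ABC"))
print(webhook_points_to(sample_info, "https://example.vercel.app/"))
```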

Now you can send any prompt to your bot on Telegram, and it will reply with the top five images matching your text from Lexica. The bot I have built is available at t.me/stablediffusionsearch_bot.

If you get stuck anywhere, you can look at the code in my GitHub repo.

If you still have any questions about this post or want to discuss something with me, feel free to connect on LinkedIn or Twitter.