Manuel Kanetscheider

Posted on Apr 17, 2022

Serverless Thumbnail Generation with Azure Computer Vision

#azure #webdev #serverless #tutorial

Introduction

In this guide, I would like to introduce a prebuilt AI service from Azure Cognitive Services: Azure Computer Vision and, in particular, its Thumbnail Generation API.
A thumbnail is simply a smaller version of the original image. Web applications use thumbnails in many places, for example to display product images. Because thumbnails are smaller, they need less bandwidth than the original images and speed up page loading.

Azure Computer Vision is part of Azure Cognitive Services. But what are Azure Cognitive Services? At a glance, Azure Cognitive Services offers the following features:

Speech
- Speech to text
- Text to speech
- Speech translation
- Speaker recognition
Languages
- Entity recognition
- Sentiment analysis
- Question Answering
- Conversational language understanding
- Translator
Vision
- Computer Vision
- Custom Vision
- Face API
Decision
- Anomaly Detector
- Content Moderator
- Personalizer
OpenAI
- OpenAI Service

In a nutshell, Azure Cognitive Services offers various prebuilt AI services, one of which is Azure Computer Vision. Azure Computer Vision offers much more functionality than just thumbnail generation, but we'll leave that out for now (this may be a topic for another blog post 😀).

For more details on Azure Cognitive Services, please check out the official documentation.

Why should we use Azure Computer Vision for Thumbnail Generation?

Creating thumbnails can sometimes be quite challenging and there are various tutorials for countless programming languages that cover this topic. Azure Computer Vision does the following things for us when creating the thumbnail:

- Remove distracting elements from the image and identify the area of interest, i.e. the area of the image in which the main object(s) appear.
- Crop the image based on the identified area of interest.
- Change the aspect ratio to fit the target thumbnail dimensions.

Smart Cropping, i.e. the detection and removal of distracting elements, is an especially useful feature. Imagine you implement your own thumbnail algorithm that always keeps the center of the image, but the area of interest lies outside the center. The algorithm would then cut away something that is actually relevant, and that's where Smart Cropping comes to the rescue!
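To make the center-crop problem concrete, here is a minimal sketch of such a naive algorithm. The function names, the image size and the subject coordinates are all made up for illustration; this is not how Azure computes its crop, only the "always keep the center" approach described above.

```python
from typing import Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def center_crop_box(img_w: int, img_h: int, target_w: int, target_h: int) -> Box:
    """Largest box with the target aspect ratio, centered in the image."""
    target_ratio = target_w / target_h
    if img_w / img_h > target_ratio:
        # Image is too wide: keep the full height and trim the sides.
        crop_h = img_h
        crop_w = round(img_h * target_ratio)
    else:
        # Image is too tall (or already matches): keep the full width.
        crop_w = img_w
        crop_h = round(img_w / target_ratio)
    left = (img_w - crop_w) // 2
    top = (img_h - crop_h) // 2
    return (left, top, left + crop_w, top + crop_h)


def contains(outer: Box, inner: Box) -> bool:
    """True if `inner` lies completely inside `outer`."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])


# Hypothetical 1600x900 photo whose subject sits near the right edge.
crop = center_crop_box(1600, 900, 200, 200)
subject = (1200, 300, 1500, 700)
print(crop, contains(crop, subject))  # (350, 0, 1250, 900) False
```

The center crop keeps the columns 350 to 1250, so the right part of the subject is lost. A crop that is aware of the area of interest would shift the box to the right instead.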

Prerequisites

- An Azure account with an active subscription. If you do not have an Azure account yet, go ahead and create one for free here.
- The Azure CLI. You can either install the Azure CLI locally or use the Azure Cloud Shell.

Let's get started

Computer Vision Deployment

Create a resource group:

az group create --location <location> --name <resource-group>

Create the Computer Vision resource:

az cognitiveservices account create \
    --name <computer-vision-resource-name> \
    --resource-group <resource-group> \
    --kind ComputerVision \
    --sku F0 \
    --location <location> \
    --yes

ℹ️ For the pricing tier, instead of F0 (= free tier) you can also specify S1 (= Standard S1).

⚠️ Only one free-tier (F0) Computer Vision resource can be deployed per subscription at a time.

Grab the endpoint URL of your Computer Vision service via the Azure Portal:

Retrieve the access keys:

az cognitiveservices account keys list \
    --name <computer-vision-resource-name> \
    --resource-group <resource-group>

ℹ️ In case you want to regenerate or rotate your keys, you can use the following command:

az cognitiveservices account keys regenerate \
    --key-name <Key1|Key2> \
    --name <computer-vision-resource-name> \
    --resource-group <resource-group>

Generating Thumbnails via the Azure Computer Vision API

Requirements and limitations of the Thumbnail API

Before we start calling the endpoint, here are some image requirements and limitations:

Image file size must be less than 4 MB
Image dimensions should be greater than 50 x 50 pixels
Supported image formats:
- JPEG
- PNG
- GIF
- BMP
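Some of these limits can be checked locally before spending a transaction on a request that would be rejected anyway. The following is only a sketch under a few assumptions: the helper names are mine, "4 MB" is interpreted as 4 MiB, and the format is detected from the well-known file signatures (magic bytes) rather than by the service. The dimension check is left out because it would need an image library.

```python
from typing import List, Optional

MAX_IMAGE_BYTES = 4 * 1024 * 1024  # assumption: "4 MB" means 4 MiB

# Leading bytes that identify the supported formats.
_SIGNATURES = {
    b"\xff\xd8\xff": "JPEG",
    b"\x89PNG\r\n\x1a\n": "PNG",
    b"GIF87a": "GIF",
    b"GIF89a": "GIF",
    b"BM": "BMP",
}


def detect_format(data: bytes) -> Optional[str]:
    """Return the format name if the data starts with a known signature."""
    for signature, name in _SIGNATURES.items():
        if data.startswith(signature):
            return name
    return None


def check_image(data: bytes) -> List[str]:
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    if len(data) >= MAX_IMAGE_BYTES:
        problems.append("file size must be less than 4 MB")
    if detect_format(data) is None:
        problems.append("format must be JPEG, PNG, GIF or BMP")
    return problems
```

A call such as check_image(open("photo.jpg", "rb").read()) would then return an empty list for a small, valid JPEG file.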

How to call the API

Request Parameters:

Name	Type	Description
width	number	Width of the thumbnail. It must be between 1 and 1024. Recommended minimum of 50.
height	number	Height of the thumbnail. It must be between 1 and 1024. Recommended minimum of 50.
smartCropping (optional)	boolean	Boolean flag for enabling smart cropping.
model-version (optional)	string	Optional parameter to specify the version of the AI model. The default value is "latest".
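As a small illustration of how these parameters end up in the request, here is a sketch that builds the query string and enforces the documented 1 to 1024 range. The function name thumbnail_url is hypothetical; the path /vision/v3.2/generateThumbnail is the one used in the example requests of this post.

```python
from urllib.parse import urlencode


def thumbnail_url(endpoint: str, width: int, height: int,
                  smart_cropping: bool = False,
                  model_version: str = "latest") -> str:
    """Build the generateThumbnail URL for a Computer Vision endpoint."""
    for name, value in (("width", width), ("height", height)):
        if not 1 <= value <= 1024:
            raise ValueError(f"{name} must be between 1 and 1024, got {value}")
    query = urlencode({
        "width": width,
        "height": height,
        # The API expects a lowercase boolean in the query string.
        "smartCropping": "true" if smart_cropping else "false",
        "model-version": model_version,
    })
    return f"{endpoint.rstrip('/')}/vision/v3.2/generateThumbnail?{query}"


print(thumbnail_url("https://example.cognitiveservices.azure.com/", 200, 200, True))
```

Validating on the client side is optional, since the service rejects invalid values as well, but it gives a clearer error message earlier.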

Request Header:

Name	Type	Description
Content-Type	string	Media type of the body sent to the API.
Ocp-Apim-Subscription-Key	string	Subscription key which provides access to this API

Request Body:

The Thumbnail API supports the following content types:

application/json
application/octet-stream
multipart/form-data

application/json:

{"url":"http://example.com/images/test.jpg"}

application/octet-stream:

[Binary image data]

multipart/form-data:

[Binary image data]

ℹ️ The above section is an excerpt from the official documentation. For more details, please check out the official documentation.

Example requests

application/json:

curl --location --request POST 'https://<computer-vision-resource-name>.cognitiveservices.azure.com/vision/v3.2/generateThumbnail?width=200&height=200&smartCropping=true&model-version=latest' \
--header 'Content-Type: application/json' \
--header 'Ocp-Apim-Subscription-Key: <api-key>' \
--data-raw '{
    "url":"http://example.com/images/test.jpg"
}'

application/octet-stream:

curl --location --request POST 'https://<computer-vision-resource-name>.cognitiveservices.azure.com/vision/v3.2/generateThumbnail?width=200&height=200&smartCropping=true&model-version=latest' \
--header 'Content-Type: application/octet-stream' \
--header 'Ocp-Apim-Subscription-Key: <api-key>' \
--data-binary '@<path-to-image>'

multipart/form-data:

curl --location --request POST 'https://<computer-vision-resource-name>.cognitiveservices.azure.com/vision/v3.2/generateThumbnail?width=200&height=200&smartCropping=true&model-version=latest' \
--header 'Content-Type: multipart/form-data' \
--header 'Ocp-Apim-Subscription-Key: <api-key>' \
--form 'img=@"<path-to-image>"'
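If you would rather call the API from code than from curl, the application/json and application/octet-stream requests above could look roughly like this in Python, using only the standard library. The helper names are my own, and the sketch assumes that a successful response body is the thumbnail image itself.

```python
import json
import urllib.request

API_PATH = "/vision/v3.2/generateThumbnail"


def build_request(endpoint: str, api_key: str, body: bytes, content_type: str,
                  width: int = 200, height: int = 200) -> urllib.request.Request:
    """Prepare (but do not send) a POST request to the Thumbnail API."""
    url = (f"{endpoint.rstrip('/')}{API_PATH}"
           f"?width={width}&height={height}&smartCropping=true&model-version=latest")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": content_type,
            "Ocp-Apim-Subscription-Key": api_key,
        },
    )


def from_url(endpoint: str, api_key: str, image_url: str) -> urllib.request.Request:
    """application/json variant: the service downloads the image itself."""
    body = json.dumps({"url": image_url}).encode("utf-8")
    return build_request(endpoint, api_key, body, "application/json")


def from_bytes(endpoint: str, api_key: str, image: bytes) -> urllib.request.Request:
    """application/octet-stream variant: the image is sent in the body."""
    return build_request(endpoint, api_key, image, "application/octet-stream")


def fetch_thumbnail(request: urllib.request.Request) -> bytes:
    """Send the request and return the response body (the thumbnail)."""
    with urllib.request.urlopen(request) as response:
        return response.read()
```

Usage would be something like fetch_thumbnail(from_url(endpoint, key, "http://example.com/images/test.jpg")), with the returned bytes written to a file. In a real application the key should come from configuration or a secret store rather than from source code.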

For my example requests I used the following image:

ℹ️ The image is taken from Unsplash. Unsplash is really awesome and offers freely-usable images. You should definitely check it out!

This is the image with Smart Cropping:

And this is the image without Smart Cropping:

As you can clearly see, in the picture without Smart Cropping parts of the keyboard are missing and the right arm has been cropped away. The plant in the left corner is also still visible. In the picture with Smart Cropping, the keyboard and both arms are clearly visible, and the plant in the left corner has been removed completely.

Pricing

As you could hopefully see, Azure Computer Vision thumbnail generation is extremely easy to use and offers useful features like smart cropping. But how much does the API cost?

Azure Computer Vision is a fully managed serverless service with "pay-as-you-go" pricing. In addition, Microsoft offers a free tier for development purposes. The Computer Vision service for thumbnail generation is very affordable: 1,000 transactions cost around $1.

⚠️ Disclaimer: This is an example configuration. Other prices may apply depending on configuration and region. For more precise estimation, I highly recommend using the Azure Pricing Calculator.
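For a quick back-of-the-envelope calculation, the rate mentioned above can be turned into a tiny helper. This is only a sketch: the function name is made up, the default of $1 per 1,000 transactions is the approximate figure from this post, and free allowances or volume discounts are ignored.

```python
def estimate_cost(transactions: int, price_per_1000: float = 1.0) -> float:
    """Rough pay-as-you-go cost in USD for a number of API transactions."""
    if transactions < 0:
        raise ValueError("transactions must not be negative")
    return transactions / 1000 * price_per_1000


# 250,000 thumbnails per month at roughly $1 per 1,000 transactions.
print(estimate_cost(250_000))  # 250.0
```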

ℹ️ For enterprise scenarios with high volumes, Azure also offers a Committed Tier, where fixed transaction quotas can be purchased at a reduced price. For more details, check out the official pricing page for Azure Computer Vision.

Conclusion

I hope I could bring you a little bit closer to Azure Computer Vision for thumbnail generation. It really is an incredible service that greatly simplifies thumbnail creation and can make a developer's life immensely easier with features like Smart Cropping. Additionally, the service doesn't cost a fortune either 😉

Thanks for reading, let me know your thoughts on this feature!
