Daniel Gomez

Object detection with Azure Custom Vision

Object detection in images allows us to find the coordinates of one or more previously defined objects (or labels) within an image.

In this tutorial, we will learn how to create a computer vision model with the Azure Custom Vision resource to recognize objects (Bottles, Faces, Guitars, and Hats) in images.

Important things:

Access to Custom Vision: https://www.customvision.ai/

Sample image dataset: General Dataset.

To customize our model, this is the process we will follow:

Preliminary step: Create a Custom Vision project.

First, we will need to log in to the Custom Vision Low-Code portal with our Azure account credentials, and create a new Object Detection project.

Note: We will need an Azure Custom Vision resource. This resource can be created from the Azure portal, or from this configuration wizard directly.
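
Although the whole flow can be done from the low-code portal, the project can also be created programmatically. A minimal sketch with the Custom Vision Python SDK (azure-cognitiveservices-vision-customvision); the endpoint, training key, and project name are placeholder values you would take from your own resource:

```python
from azure.cognitiveservices.vision.customvision.training import CustomVisionTrainingClient
from msrest.authentication import ApiKeyCredentials

# Placeholder values: take these from your Custom Vision resource in Azure.
ENDPOINT = "https://<your-resource>.cognitiveservices.azure.com/"
TRAINING_KEY = "<your-training-key>"

credentials = ApiKeyCredentials(in_headers={"Training-key": TRAINING_KEY})
trainer = CustomVisionTrainingClient(ENDPOINT, credentials)

# Pick an Object Detection domain and create the project.
obj_detection_domain = next(d for d in trainer.get_domains() if d.type == "ObjectDetection")
project = trainer.create_project("ObjectDetectionDemo", domain_id=obj_detection_domain.id)
print("Project created:", project.id)
```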

Step 1. Upload images.

The first thing we will do is upload the images (preferably in order, grouped according to the objects we are going to tag for object detection).

By selecting the Add Images option, we can choose the images we want to upload. In this example, we will upload the images used to tag Hat objects:

Once the image upload is finished, we can see something like this:

Now we must open each uploaded image and select the area where the object we want to tag is located.

Select object in the image

Note: in an image we can select one or more areas to tag several objects.

Image areas

To complete this image tagging process, it is important to tag at least 15 images for each category. For our example, this is the number of images we used for each tag: Bottle - 18, Face - 26, Guitar - 18, and Hat - 19.
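
If we already know the bounding-box coordinates, the tags and regions can also be uploaded through the SDK. A sketch that reuses the trainer and project from the previous snippet; the file name and box coordinates are illustrative, and regions use normalized values (left, top, width, height between 0 and 1):

```python
from azure.cognitiveservices.vision.customvision.training.models import (
    ImageFileCreateBatch, ImageFileCreateEntry, Region)

# Create the tag (only "Hat" is shown; the other tags work the same way).
hat_tag = trainer.create_tag(project.id, "Hat")

# One tagged image with a single region; the file name and box are made-up examples.
with open("images/hat_01.jpg", "rb") as image_file:
    entry = ImageFileCreateEntry(
        name="hat_01.jpg",
        contents=image_file.read(),
        regions=[Region(tag_id=hat_tag.id, left=0.30, top=0.10, width=0.25, height=0.40)],
    )

upload_result = trainer.create_images_from_files(
    project.id, ImageFileCreateBatch(images=[entry]))
if not upload_result.is_batch_successful:
    for image in upload_result.images:
        print("Image upload status:", image.status)
```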

Step 2. Train the model.

Now that we have the most important thing (as in any other Machine Learning model), the data, we can start the training from this option:

Train option

Here we can consider two types of training:

On the one hand, we can perform the training in the shortest possible time, according to the number of images that have been uploaded and the number of tags we have; on the other hand, we can run an advanced training, in which Custom Vision looks for the best possible model within a maximum time that we can specify:
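
Both options can also be launched from the SDK. A hedged sketch reusing the trainer and project from before: the plain call runs the quick training, and the commented line shows how an advanced training with a time budget could look (the one-hour budget is just an example value):

```python
import time

# Quick training (the default behaviour).
iteration = trainer.train_project(project.id)

# Advanced training with a time budget (example value of 1 hour):
# iteration = trainer.train_project(project.id, training_type="Advanced",
#                                   reserved_budget_in_hours=1)

# Poll until the training run finishes.
while iteration.status != "Completed":
    print("Training status:", iteration.status)
    time.sleep(10)
    iteration = trainer.get_iteration(project.id, iteration.id)
```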

When the training is finished, we can evaluate the model.

Step 3. Evaluate the model.

In this step, in the Performance section, we can analyze the model with three metrics: Precision, Recall, and mAP (Mean Average Precision).

General performance

Performance per object

In general, these metrics help us analyze the following:

Precision: indicates the fraction of identified images that were correct. For example, if the model identified 100 images as hats and 99 of them were actually hats, the precision would be 99%.

Recall: indicates the fraction of actual classifications that were correctly identified. For example, if there were actually 100 images with hats and the model identified 80 as hats, the recall would be 80%.

mAP: is the mean value of the average precision (AP). AP is the area under the precision/recall curve (precision plotted against recall for each prediction made).
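
These same numbers can be read programmatically for a given iteration. A sketch reusing the trainer and iteration from the previous steps; the 0.5 probability threshold and 0.3 overlap threshold are example values:

```python
# Overall and per-tag metrics for the trained iteration.
performance = trainer.get_iteration_performance(
    project.id, iteration.id, threshold=0.5, overlap_threshold=0.3)

print("Precision:", performance.precision)
print("Recall:   ", performance.recall)
print("mAP:      ", performance.average_precision)

for tag_performance in performance.per_tag_performance:
    print(tag_performance.name,
          tag_performance.precision,
          tag_performance.recall,
          tag_performance.average_precision)
```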

Quick test:

With our Azure Custom Vision model ready, from the Quick Test option we can run a quick test:

Quick Test option

Here we can upload an image from our computer or use the link of an online image. Likewise, we can set the probability threshold, that is, show only the detected objects whose probability is greater than the value we specify.

Example: in the previous image we can see that there is a 99.8% probability that a Guitar is located in the specified area.
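
A quick test can also be run from code against a specific iteration, without publishing anything yet. A sketch reusing the trainer, project, and iteration from before; the image path and the 50% threshold are illustrative:

```python
# Quick test of a local image against the trained iteration.
with open("images/test_guitar.jpg", "rb") as test_image:
    results = trainer.quick_test_image(project.id, test_image.read(),
                                       iteration_id=iteration.id)

# Keep only the detections above the probability threshold.
for prediction in results.predictions:
    if prediction.probability > 0.5:
        box = prediction.bounding_box
        print(f"{prediction.tag_name}: {prediction.probability:.1%} "
              f"(left={box.left:.2f}, top={box.top:.2f}, "
              f"width={box.width:.2f}, height={box.height:.2f})")
```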

Plus: Publish the model.

Now that we have an established and tested Custom Vision model, we can publish it from the Performance section by selecting an Iteration/Training.

To publish the model, we must specify a name and the Custom Vision resource in Azure that allows us to make evaluations:
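
From the SDK, the equivalent publish call looks roughly like this (the publish name is arbitrary and the prediction resource ID placeholder is the full Azure resource ID of the prediction resource):

```python
# Placeholder values for the publish step.
PUBLISH_NAME = "detectModel"
PREDICTION_RESOURCE_ID = (
    "/subscriptions/<subscription-id>/resourceGroups/<resource-group>"
    "/providers/Microsoft.CognitiveServices/accounts/<prediction-resource>")

trainer.publish_iteration(project.id, iteration.id, PUBLISH_NAME, PREDICTION_RESOURCE_ID)
print("Iteration published as:", PUBLISH_NAME)
```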

When publishing the model, we can use this resource as a web API:
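
For example, a published iteration can be called over HTTP from any language. A sketch in Python with requests; the endpoint, project ID, publish name, and prediction key are placeholders, and the URL follows the v3.0 prediction API pattern:

```python
import requests

# Placeholder values taken from the Custom Vision and Azure portals.
PREDICTION_ENDPOINT = "https://<your-resource>.cognitiveservices.azure.com"
PROJECT_ID = "<project-id>"
PUBLISH_NAME = "detectModel"
PREDICTION_KEY = "<your-prediction-key>"

url = (f"{PREDICTION_ENDPOINT}/customvision/v3.0/Prediction/"
       f"{PROJECT_ID}/detect/iterations/{PUBLISH_NAME}/image")

# Send a local image as binary content and print the detected objects.
with open("images/test_guitar.jpg", "rb") as image_file:
    response = requests.post(
        url,
        headers={"Prediction-Key": PREDICTION_KEY,
                 "Content-Type": "application/octet-stream"},
        data=image_file.read())

response.raise_for_status()
for prediction in response.json()["predictions"]:
    print(prediction["tagName"], prediction["probability"], prediction["boundingBox"])
```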

We could also export the model to formats such as TensorFlow, CoreML, or a Docker container, among others.

Here we can learn more about it: Export your model for use with mobile devices.
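
As a rough sketch, an export can also be requested from the SDK and downloaded from the returned URI once it is ready (the platform string and the polling details shown here are assumptions based on the SDK's export methods):

```python
import time

# Request a TensorFlow export of the trained iteration.
trainer.export_iteration(project.id, iteration.id, platform="TensorFlow")

# Wait until the export is ready, then print its download URI.
export = trainer.get_exports(project.id, iteration.id)[0]
while export.status == "Exporting":
    time.sleep(10)
    export = trainer.get_exports(project.id, iteration.id)[0]

print("Export status:", export.status)
print("Download URI:", export.download_uri)
```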

Thanks for reading!

If you have any questions or ideas in mind, it will be a pleasure to connect with you and exchange knowledge together.

See you on Twitter / esDanielGomez.com.
