DEV Community

Gourav Singh Rawat

Posted on May 13, 2022

Computer vision API - Using Microsoft Azure Cognitive services

#tutorial #azure #machinelearning #node

Cognitive services

Cognitive Services are a set of machine learning algorithms that Microsoft has developed to solve problems in the field of Artificial Intelligence (AI). The goal of Cognitive Services is to democratise AI by packaging it into discrete components that are easy for developers to use in their own apps.

I recently built an application, Azura, using this approach.

Seek4samurai / Azura

Yes.! Azura. Play with it. Powered by Microsoft's @Azure-cognitive-service-computer-vision. It's available both as a web application and as a browser extension.

What is Azura?🚀

Azura is a browser extension, and also a kind of search tool, that takes an image URL as input, processes it with Microsoft Azure's Computer Vision, and describes what the image is about. It is basically a tool that demonstrates one use of computer vision.

Live demo 🌏

The website is live at https://azura-website.vercel.app/
Do check out the extension as well: it offers a better user experience and a text-to-speech feature that reads out the description of the image.

How to use it as an extension 🧑🏼‍💻

Clone the following repository, or download it as a zip: https://github.com/seek4samurai/azura

Adding to your browser 📝

To add this extension, open your browser's extensions page.

First, turn on Developer mode.

Once this is done, you can import extensions.

Click on…


If you're familiar with computer vision, you already know how it works. It is a technique in which we train a machine to recognise real-world things in images, whether objects or living things such as human faces or animals.

Microsoft Azure provides some free-to-use Cognitive Services APIs for building such computer-vision-powered applications.

Getting started

Creating Azure resource

In the Azure portal, select Computer Vision from the available resources and create a resource.

After you've created the resource, note its key and endpoint. You'll need both for the API client.

Using API client
Once you've completed the previous steps, you can set up your workspace.

Server setup
Start by creating a server. We are using Node.js, so initialise the project with npm init -y. Once it's initialised, install the following packages and libraries.

{
  "name": "azura-backend",
  "version": "1.0.0",
  "description": "",
  "main": "index.js",
  "scripts": {
    "dev": "nodemon ./src/index.js",
    "start": "node ./src/index.js"
  },
  "keywords": [],
  "author": "",
  "license": "ISC",
  "dependencies": {
    "@azure/cognitiveservices-computervision": "^8.1.0",
    "cors": "^2.8.5",
    "dotenv": "^16.0.0",
    "express": "^4.17.2"
  },
  "devDependencies": {
    "nodemon": "^2.0.15"
  }
}

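Here, we are using Express to create the server. To use Azure Cognitive Services, install the Computer Vision client:
npm i @azure/cognitiveservices-computervision
The controller also imports ApiKeyCredentials from @azure/ms-rest-js. If that package isn't already in your node_modules, install it too with npm i @azure/ms-rest-js.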

Create a src folder with an index.js file that starts the server and mounts the routes.

const express = require("express");
const dotenv = require("dotenv");
const cors = require("cors");

dotenv.config();

const imageController = require("./controller");

const app = express();
app.use(cors({
  origin: "*"
}));
app.use(express.json());

// Routes
app.use("/", imageController);

const PORT = process.env.PORT || 5000;

app.listen(PORT, () => {
  console.log(`App running on port ${PORT}`);
});

Once this is done, create a controller.js file in the same src folder, where we will use the Computer Vision client for our application.

const express = require("express");
const ComputerVisionClient =
  require("@azure/cognitiveservices-computervision").ComputerVisionClient;
const ApiKeyCredentials = require("@azure/ms-rest-js").ApiKeyCredentials;

const router = express.Router();

router.post("/describe", async (req, res) => {
  const KEY = process.env.KEY;
  const ENDPOINT = process.env.ENDPOINT;

  // Create a new Client
  const computerVisionClient = new ComputerVisionClient(
    new ApiKeyCredentials({ inHeader: { "Ocp-Apim-Subscription-Key": KEY } }),
    ENDPOINT
  );

  if (!req.body.imageUrl) {
    return res.send("Image url is not set! Please provide an image!");
  }
});

module.exports = router;

Remember to create a .env file in your local workspace and paste your API key and endpoint into it as KEY and ENDPOINT. The dotenv package loads them into process.env. We've initialised the client, so a POST request to /describe will now reach it. You can use Postman to try this API call.
The final check returns an error message if the request body doesn't contain an image URL.
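As a minimal sketch of the .env step, the helper below reads both values once and fails fast if either is missing. The function name loadAzureConfig is my own and not part of the original project; the variable names KEY and ENDPOINT are the ones controller.js reads, and the endpoint shown in the comment is only a placeholder.

```javascript
// Example .env contents (placeholders, not real values):
//   KEY=<your Computer Vision key>
//   ENDPOINT=https://<your-resource-name>.cognitiveservices.azure.com/

// Hypothetical helper: collect the Azure settings and fail fast when one is missing.
// `env` defaults to process.env, which dotenv.config() populates from the .env file.
function loadAzureConfig(env = process.env) {
  const missing = ["KEY", "ENDPOINT"].filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variable(s): ${missing.join(", ")}`);
  }
  return { key: env.KEY, endpoint: env.ENDPOINT };
}
```

Calling this once at startup, after dotenv.config(), would surface a missing key as a clear error instead of a failed Azure request later on.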

After all this, we can add a try-catch block inside the route handler and call the client's describeImage method:

  try {
    // Describe the image at the given URL
    const descUrl = req.body.imageUrl;
    const caption = (await computerVisionClient.describeImage(descUrl))
      .captions[0];

    res.send(
      `This may be ${caption.text} (confidence ${caption.confidence.toFixed(2)})`
    );
  } catch (error) {
    console.log(error);
    res.send(error.message);
  }

Here, we read req.body.imageUrl, which the frontend sends, and pass that URL to the client. The client returns a caption, and we send it back to the frontend as the response.
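To see the whole route in one place, here is a rough sketch that combines the URL check and the try-catch block. It is not the exact code from the Azura repository: createDescribeHandler and isHttpUrl are hypothetical names, and the 400/500 status codes and the empty-captions guard are my additions. The handler only assumes an object with a describeImage(url) method, such as the ComputerVisionClient created earlier.

```javascript
// Hypothetical helper: accept only absolute http(s) URLs.
function isHttpUrl(value) {
  if (typeof value !== "string") return false;
  try {
    const { protocol } = new URL(value);
    return protocol === "http:" || protocol === "https:";
  } catch {
    return false; // new URL() throws on strings it cannot parse
  }
}

// Hypothetical factory: builds the /describe handler around any client
// that exposes describeImage(url).
function createDescribeHandler(client) {
  return async (req, res) => {
    const imageUrl = req.body && req.body.imageUrl;
    if (!isHttpUrl(imageUrl)) {
      return res
        .status(400)
        .send("Image url is not set! Please provide an image!");
    }
    try {
      const result = await client.describeImage(imageUrl);
      // Guard against an empty captions array before reading captions[0].
      const caption = result.captions && result.captions[0];
      if (!caption) {
        return res.send("No description could be generated for this image.");
      }
      return res.send(
        `This may be ${caption.text} (confidence ${caption.confidence.toFixed(2)})`
      );
    } catch (error) {
      console.log(error);
      return res.status(500).send(error.message);
    }
  };
}

// Assumed wiring inside controller.js:
// router.post("/describe", createDescribeHandler(computerVisionClient));
```

Passing the client in as an argument also means it is created once rather than on every request, and the handler can be tried out with a stub object before you have Azure credentials.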

Frontend overview

Since the frontend is not the focus of this tutorial, we'll only take a quick look at it.
We take a URL as input from the user and send it to our backend. I'm using Axios for that purpose.

const res = await axios.post(
      "https://YourURL/describe",
      {
        imageUrl,
      }
    );

In place of YourURL, paste your server's URL.

You can print the response or log it to the console. The backend accepts an image's URL and returns its description, that is, what the image is about.
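If you'd rather not add Axios, the same request can be sketched with fetch, which is built into browsers and Node.js 18+. This swaps in fetch for the Axios call above. The function name describeImageUrl and its parameters are hypothetical, and the third argument exists only so a stand-in fetch can be passed while experimenting.

```javascript
// Hypothetical helper: POST an image URL to the backend and return its text reply.
// serverUrl is your server's base URL (the "https://YourURL" placeholder above).
async function describeImageUrl(serverUrl, imageUrl, fetchImpl = fetch) {
  const response = await fetchImpl(`${serverUrl}/describe`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ imageUrl }),
  });
  const text = await response.text(); // the backend replies with plain text
  if (!response.ok) {
    // Only reached if the server sets a non-2xx status code.
    throw new Error(`Request failed (${response.status}): ${text}`);
  }
  return text;
}

// Example usage (inside an async function):
// const description = await describeImageUrl("https://YourURL", imageUrl);
// console.log(description);
```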

Thank you for reading.