DEV Community


Experimenting with Google’s Cloud Vision API + Intel Edison

(Note: these posts are migrated from my previous blog)

I first learned about Google’s Cloud Vision API at this year’s Google I/O. Though it’s been out in beta since 2015, I had not heard of it, nor had the chance to try it out till today. I came across this blog post and was intrigued by the YouTube demo:

As always, I have an Intel Edison lying around so I decided to give it a try.

Before you begin:

Make sure your Edison has been updated to the latest firmware and has Wi-Fi set up; use the setup/configuration tool found here to do so.

You will also need a Google Cloud account with the Vision API enabled. Follow the instructions here to do so before proceeding.

Things you’ll need:

  1. Intel Edison w/ Arduino Breakout Board (You could also use the mini breakout, but you might need a USB adapter to connect a webcam)

  2. Logitech C270 Webcam (Any other USB webcam supported by Linux UVC drivers would work too)

  3. Power Supply

Here’s how it’s all connected:

Note the position of the tiny switch; it is closer to the big USB port

Let’s go!

  1. For the USB webcam to work, make sure UVC drivers are installed and enabled; you can find instructions here on how to do that.

  2. Install ffmpeg. Git clone the edi-cam repository and run the shell script to install ffmpeg:

    root@edison:~# cd /edi-cam/bin
    root@edison:~# ./

  3. Install gcloud. This is the Google Cloud Node.js module that allows you to easily use Google Cloud APIs.

    root@edison:~# npm install gcloud

  4. Copy over your service account key JSON created during setup (scp/sftp). You can create a new one here if you’ve lost it.

  5. Run the code! Copy and paste this snippet into Vim or transfer the file over:

root@edison:~# node capture.js
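
The snippet itself is not shown here, so the following is only a rough sketch of what a capture.js along these lines might look like, not the original code. It assumes ffmpeg can read the webcam at /dev/video0 through video4linux2, and that the gcloud module exposes `vision.detectLabels(image, options, callback)` with a `verbose` option. The helper names `ffmpegArgs` and `captureAndLabel`, the project ID, and the key path are all hypothetical.

```javascript
'use strict';

// Hypothetical helper: ffmpeg arguments for grabbing a single frame
// from a V4L2 webcam and overwriting the output file if it exists.
function ffmpegArgs(device, outPath) {
  return ['-f', 'video4linux2', '-i', device, '-vframes', '1', '-y', outPath];
}

// Capture one frame, then ask the Vision client for labels.
// Assumptions: `deps.execFile` behaves like child_process.execFile and
// `deps.vision` behaves like the object returned by gcloud.vision().
function captureAndLabel(deps, device, outPath, callback) {
  deps.execFile('ffmpeg', ffmpegArgs(device, outPath), function (err) {
    if (err) {
      return callback(err);
    }
    // verbose: true requests { desc, mid, score } objects rather than bare strings.
    deps.vision.detectLabels(outPath, { verbose: true }, callback);
  });
}

// Example wiring on the Edison (placeholders throughout, not executed here):
//
//   var gcloud = require('gcloud')({
//     projectId: 'my-project-id',
//     keyFilename: '/home/root/key.json'
//   });
//   captureAndLabel(
//     { execFile: require('child_process').execFile, vision: gcloud.vision() },
//     '/dev/video0',
//     'capture.jpg',
//     function (err, labels) { console.log(err || labels); }
//   );
```

Passing `execFile` and the Vision client in as arguments keeps the sketch easy to try on a laptop with stubs before copying it to the board.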


Here’s the image that was captured by my webcam:

It’s a Hello Kitty Robot!

And here’s the returned JSON:

root@edison:~# node capture.js

[ { desc: 'cartoon', mid: '/m/0215n', score: 85.945672 },
  { desc: 'machine', mid: '/m/0dkw5', score: 74.98506900000001 },
  { desc: 'robot', mid: '/m/06fgw', score: 69.911 },
  { desc: 'gadget', mid: '/m/02mf1n', score: 67.246151 } ]
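
The scores in this output run from 0 to 100, so one simple way to act on the result is to keep only the labels above a cutoff. A minimal sketch, assuming label objects with the `desc` and `score` fields shown above; the helper name `confidentLabels` and the cutoff of 70 are hypothetical choices for illustration:

```javascript
// Hypothetical helper: keep labels at or above a minimum score and
// return their descriptions, highest score first.
function confidentLabels(labels, minScore) {
  return labels
    .filter(function (label) { return label.score >= minScore; })
    .sort(function (a, b) { return b.score - a.score; })
    .map(function (label) { return label.desc; });
}

// Labels copied from the output above.
var labels = [
  { desc: 'cartoon', mid: '/m/0215n', score: 85.945672 },
  { desc: 'machine', mid: '/m/0dkw5', score: 74.98506900000001 },
  { desc: 'robot', mid: '/m/06fgw', score: 69.911 },
  { desc: 'gadget', mid: '/m/02mf1n', score: 67.246151 }
];

// With a cutoff of 70, only 'cartoon' and 'machine' remain.
console.log(confidentLabels(labels, 70));
```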

…I thought that was pretty cool :)

The Google Cloud Vision API has a lot of other powerful features, including analysis of emotional facial attributes, text detection and extraction, and detection of faces, landmarks, labels, logos, and image properties in your images.
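
The features mentioned above correspond to feature types in the Vision REST API's `images:annotate` request. Here is a rough sketch of building such a request body by hand, in case you want to try more than labels. The names `FEATURE_TYPES` and `buildAnnotateRequest` are hypothetical, and the gcloud module may well assemble its requests differently.

```javascript
// Short feature names mapped to Vision REST API (v1) feature types.
var FEATURE_TYPES = {
  faces: 'FACE_DETECTION',
  landmarks: 'LANDMARK_DETECTION',
  labels: 'LABEL_DETECTION',
  logos: 'LOGO_DETECTION',
  properties: 'IMAGE_PROPERTIES',
  text: 'TEXT_DETECTION'
};

// Hypothetical helper: build the JSON body for an images:annotate call
// from raw image bytes and a list of short feature names.
function buildAnnotateRequest(imageBuffer, names, maxResults) {
  var features = names.map(function (name) {
    if (!Object.prototype.hasOwnProperty.call(FEATURE_TYPES, name)) {
      throw new Error('Unknown feature: ' + name);
    }
    return { type: FEATURE_TYPES[name], maxResults: maxResults };
  });
  return {
    requests: [{
      image: { content: imageBuffer.toString('base64') },
      features: features
    }]
  };
}

// Example with a tiny stand-in buffer instead of a real JPEG.
var body = buildAnnotateRequest(Buffer.from('not a real image'), ['labels', 'faces'], 5);
console.log(JSON.stringify(body, null, 2));
```

A body like this would be sent as an authenticated POST to https://vision.googleapis.com/v1/images:annotate; the sketch leaves authentication and the HTTP call out.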

Vision capabilities perfectly complement robotic applications (e.g. a drone that tases you if you’re not smiling, a spray paint bot that corrects graffiti grammar, etc.). I can’t wait to see what kind of cool things people will make with this!
