
Introducing AI Vision Assistant, an open source project to help people with visual impairments

Three weeks ago, I started working on a side project to test and experience a new service from Azure called Azure Functions. The idea was to create a Twitter account (bot) that randomly fetches an image from 500px.com, captions it using AI (Microsoft Cognitive Services), and tweets the image with the caption.

Simple idea! It worked as expected; you can check the account here: @aipics.

Inspiration…

A friend of mine saw the idea and told me he knew a person with visual impairments on Twitter named Mohammed. Mohammed asks people to describe tweets that contain images: he can hear the tweets (using the iPhone accessibility tools), but he can't see the image content. So I thought I'd make a Twitter bot for that purpose; we have the technology!

I started implementing it on a new Twitter account, tested it, and asked Mohammed to try it out and suggest ideas that could help even more. It was so awesome to work and talk with such a person, really!

The result!

I called it 'AI Vision Assistant'; you can check it here: @aivisionasst.

You can tweet the hashtag (#aivision for English, or #وصف for Arabic) with an image, or reply to a tweet that contains an image (including Instagram links), and within 2 minutes the account will reply with a caption describing the image. It supports two languages: English and Arabic.

Unfortunately, the Arabic version is not 100% accurate, as Arabic is not yet supported by the Microsoft Cognitive Services APIs. As a workaround, I use the Bing translator to translate the English caption into Arabic and then build the tweet text from it.
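As a rough illustration of that fallback, here is a minimal sketch against the Microsoft Translator Text API v3 (the modern successor to the Bing translator endpoint the bot used); the helper name and environment-variable names are assumptions, not the project's actual code.

```python
# Minimal sketch: translate an English caption to Arabic with the
# Microsoft Translator Text API v3. TRANSLATOR_KEY / TRANSLATOR_REGION
# are illustrative environment-variable names.
import os
import requests

def translate_to_arabic(caption: str) -> str:
    """Return the Arabic translation of an English caption."""
    resp = requests.post(
        "https://api.cognitive.microsofttranslator.com/translate",
        params={"api-version": "3.0", "from": "en", "to": "ar"},
        headers={
            "Ocp-Apim-Subscription-Key": os.environ["TRANSLATOR_KEY"],
            "Ocp-Apim-Subscription-Region": os.environ["TRANSLATOR_REGION"],
            "Content-Type": "application/json",
        },
        json=[{"Text": caption}],
    )
    resp.raise_for_status()
    # The API returns one result per input text, each with a list of translations.
    return resp.json()[0]["translations"][0]["text"]
```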

My favorite part: the account has served more than 150 tweets (tweet = caption request). In fact, some users rely on this account and use it almost daily!

Maybe the numbers are low, but for me, seeing even one person get the help he needs from this account is like changing the whole world!

Technical details

There are mainly two components, described below:

*The architecture diagram of the complete idea.*

Mentions Monitor:

A timer-triggered Function that runs every 2 minutes and searches Twitter for tweets containing (#aivision OR #وصف), with two conditions: tweeted within the last 3 minutes, and not yet handled.

If any are found, the tweet id is saved in a local log file (marking it as handled), and then the second Function, the AI Vision Core, is called. A sketch of this monitor follows.
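To make the flow concrete, here is a minimal sketch of such a monitor, assuming the Python programming model for Azure Functions and the tweepy Twitter client; the function names, environment variables, and log-file path are illustrative, not the project's actual code.

```python
# Minimal sketch of the Mentions Monitor: a timer-triggered Function
# that searches Twitter for the two hashtags and hands new tweets to
# the AI Vision Core Function.
import datetime
import os

import azure.functions as func
import requests
import tweepy

HANDLED_LOG = "/tmp/handled_tweets.log"          # illustrative local log file
VISION_CORE_URL = os.environ["VISION_CORE_URL"]  # URL of the AI Vision Core Function

def load_handled() -> set:
    """Read the ids of tweets that were already processed."""
    try:
        with open(HANDLED_LOG) as f:
            return {line.strip() for line in f}
    except FileNotFoundError:
        return set()

def main(mytimer: func.TimerRequest) -> None:
    """Timer trigger; scheduled every 2 minutes ('0 */2 * * * *' in function.json)."""
    auth = tweepy.OAuth1UserHandler(
        os.environ["TW_CONSUMER_KEY"], os.environ["TW_CONSUMER_SECRET"],
        os.environ["TW_ACCESS_TOKEN"], os.environ["TW_ACCESS_SECRET"])
    api = tweepy.API(auth)

    handled = load_handled()
    cutoff = datetime.datetime.now(datetime.timezone.utc) - datetime.timedelta(minutes=3)

    for tweet in api.search_tweets(q="#aivision OR #وصف", count=100):
        if str(tweet.id) in handled:
            continue                    # already handled
        if tweet.created_at < cutoff:
            continue                    # older than 3 minutes
        with open(HANDLED_LOG, "a") as f:
            f.write(f"{tweet.id}\n")    # mark as handled
        # Hand off to the second Function, the AI Vision Core.
        requests.get(VISION_CORE_URL, params={"tweetId": tweet.id})
```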

AI Vision Core:

An HTTP-triggered Function that requires one parameter: the tweet id.

Whenever this Function receives a request, it calls the Twitter API to get the image from the tweet object; if the tweet is a reply, it gets the image from the InReplyToTweet object, or it searches for an Instagram link if there is one.

Then it prepares the image, calls the Vision API to get the caption, translates it if the Arabic hashtag was used, builds the tweet text, and finally replies with the image description, mentioning anyone involved! A sketch of this flow is shown below.
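Here is a matching sketch of the core Function, under the same assumptions (Python Azure Functions, tweepy, and the Azure Computer Vision `describe` endpoint); it reuses the hypothetical `translate_to_arabic` helper sketched earlier and omits the Instagram-link handling for brevity.

```python
# Minimal sketch of the AI Vision Core: an HTTP-triggered Function that
# captions the image in (or above) a tweet and replies with the result.
import os

import azure.functions as func
import requests
import tweepy

def describe_image(image_url: str) -> str:
    """Ask the Computer Vision 'describe' endpoint for a one-line caption."""
    resp = requests.post(
        os.environ["VISION_ENDPOINT"] + "/vision/v3.2/describe",
        headers={"Ocp-Apim-Subscription-Key": os.environ["VISION_KEY"]},
        json={"url": image_url},
    )
    resp.raise_for_status()
    return resp.json()["description"]["captions"][0]["text"]

def main(req: func.HttpRequest) -> func.HttpResponse:
    tweet_id = req.params.get("tweetId")
    if not tweet_id:
        return func.HttpResponse("tweetId is required", status_code=400)

    auth = tweepy.OAuth1UserHandler(
        os.environ["TW_CONSUMER_KEY"], os.environ["TW_CONSUMER_SECRET"],
        os.environ["TW_ACCESS_TOKEN"], os.environ["TW_ACCESS_SECRET"])
    api = tweepy.API(auth)

    tweet = api.get_status(tweet_id, tweet_mode="extended")
    # Prefer an image on the tweet itself; fall back to the tweet it replies to.
    media = tweet.entities.get("media") or []
    if not media and tweet.in_reply_to_status_id:
        parent = api.get_status(tweet.in_reply_to_status_id, tweet_mode="extended")
        media = parent.entities.get("media") or []
    if not media:
        return func.HttpResponse("no image found", status_code=404)

    caption = describe_image(media[0]["media_url_https"])
    if "#وصف" in tweet.full_text:
        # Arabic hashtag: translate the caption (helper sketched earlier).
        caption = translate_to_arabic(caption)

    # Reply to the requesting tweet with the description.
    api.update_status(
        status=f"@{tweet.user.screen_name} {caption}",
        in_reply_to_status_id=tweet.id,
    )
    return func.HttpResponse("ok", status_code=200)
```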

Open source…

This project is open source; you can browse it on GitHub: /AIVision.

Please feel free to fork it, bring it to your language/culture, or contribute to making it better... or at the very least, I hope it'll inspire you!

We have the technology

According to the WHO, there are more than 39 million blind people in the world… this project might help a few individuals on Twitter, but still, there are millions out there. Imagine what help or support we, programmers, could provide!

Thank you

Muhamed – @mmg_rt.

This post was originally published on devread.net
