How does an Alexa skill work?

By Kiran Krishnan

Alexa is a voice service from Amazon. It powers devices such as the Echo Show, Echo Spot, and Echo Dot. Its most important feature is, of course, hands-free voice control, which lets users ask questions without touching the device. Alexa listens to voice commands and responds appropriately to fulfill each request.

Alexa understands multiple languages, including English, French, German, and Japanese, so it can interact with millions of users in different countries without a language barrier.

Alexa comes with many built-in capabilities, such as playing music, reading the news, creating a shopping list, setting alarms, and reporting the weather.

What will you do if you want to add a custom capability to Alexa?

For example, you might want to let customers order food from your restaurant by voice command. You can do this by building a custom Alexa Skill. Alexa Skills are apps that give Alexa new abilities and make her smarter. They open up a wide range of experiences for users, from playing games to booking a cab.

You can find the list of all Alexa Skills on the Amazon website.

Consider what happens when a user says 'Alexa, Ask Daily Horoscopes about Taurus'.

This command has three main parts: the wake word, the invocation name, and the utterance.

Wake word
When a user says 'Alexa', the device wakes up. The wake word puts Alexa into listening mode, ready to take instructions from the user. You can change the wake word in the Alexa mobile app.

Invocation name
The invocation name is the keyword used to trigger a specific skill; it works like the skill's name. In the example above, the invocation name is 'Daily Horoscopes'. Users can combine the invocation name with an action, command, or question. Every custom skill needs an invocation name to start it.

Utterance
In the example above, 'Taurus' is the utterance. Utterances are the phrases users say when making a request to Alexa. Alexa identifies the user's intent from the utterance and responds accordingly, so the utterance determines what the user wants Alexa to do.

For example, the phrases in braces below are utterances that make the same request in different words:

Tell Daily Horoscopes {I want my Taurus horoscope today}

Talk to Daily Horoscopes and {give me the horoscope for Taurus}
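To make the three parts concrete, here is a minimal Python sketch that splits a typed-out command into wake word, invocation name, and utterance. It is only an illustration: the function `split_command` and the word lists `LAUNCH_WORDS` and `CONNECTORS` are hypothetical names invented for this example, and real devices work on audio, not on text.

```python
import re

# Launch phrases that may precede the invocation name (illustrative, not exhaustive).
LAUNCH_WORDS = ("ask", "tell", "talk to", "open")
# Connecting words that may separate the invocation name from the utterance.
CONNECTORS = ("about", "and", "to", "for")


def split_command(command, invocation_name, wake_word="Alexa"):
    """Split a typed command into wake word, invocation name, and utterance."""
    text = command.strip()
    if not text.lower().startswith(wake_word.lower()):
        raise ValueError("command does not start with the wake word")
    rest = text[len(wake_word):].lstrip(" ,")

    # Drop an optional launch phrase such as "ask" or "talk to" (longest first).
    for word in sorted(LAUNCH_WORDS, key=len, reverse=True):
        if rest.lower().startswith(word + " "):
            rest = rest[len(word):].lstrip()
            break

    if not rest.lower().startswith(invocation_name.lower()):
        raise ValueError("invocation name not found")
    utterance = rest[len(invocation_name):].strip()

    # Drop a leading connector so only the request itself remains.
    pattern = r"^(%s)\s+" % "|".join(CONNECTORS)
    utterance = re.sub(pattern, "", utterance, flags=re.IGNORECASE)
    return {
        "wake_word": wake_word,
        "invocation_name": invocation_name,
        "utterance": utterance,
    }


parts = split_command("Alexa, Ask Daily Horoscopes about Taurus", "Daily Horoscopes")
print(parts["utterance"])  # Taurus
```

The skill's invocation name is passed in explicitly because, in this sketch, nothing else tells the parser where the skill name ends and the request begins.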

Let's see how Alexa Skills work internally and how they respond to users' instructions.

Alexa-enabled devices send the user's instruction to a cloud-based service called Alexa Voice Service (AVS). Think of the Alexa Voice Service as the brain of Alexa-enabled devices: it performs all the complex operations, such as Automatic Speech Recognition (ASR) and Natural Language Understanding (NLU).
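The Natural Language Understanding step can be pictured, very roughly, as mapping the recognized text to an intent plus any variable values it contains. The sketch below is a toy stand-in that uses plain pattern matching; the real service is far more sophisticated. The intent name `GetHoroscopeIntent`, the slot name `sign`, and the sample phrases are all assumptions made up for this illustration.

```python
import re

# Hypothetical sample utterances for a hypothetical "GetHoroscopeIntent".
# "{sign}" marks a slot, i.e. the variable part of the phrase.
SAMPLE_UTTERANCES = {
    "GetHoroscopeIntent": [
        "about {sign}",
        "i want my {sign} horoscope today",
        "give me the horoscope for {sign}",
    ],
}


def to_pattern(sample):
    """Turn a sample utterance into a regex with one named group per slot."""
    # re.split with a capture group alternates literal text and slot names.
    parts = re.split(r"\{(\w+)\}", sample)
    regex = "".join(
        f"(?P<{part}>\\w+)" if i % 2 else re.escape(part)
        for i, part in enumerate(parts)
    )
    return re.compile(f"^{regex}$", re.IGNORECASE)


def resolve_intent(utterance):
    """Return (intent_name, slots) for the first matching sample, else (None, {})."""
    for intent, samples in SAMPLE_UTTERANCES.items():
        for sample in samples:
            match = to_pattern(sample).match(utterance.strip())
            if match:
                return intent, match.groupdict()
    return None, {}


print(resolve_intent("give me the horoscope for Taurus"))
```

Different wordings resolve to the same intent with the same slot value, which is why a skill can respond identically to several phrasings of one request.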

Alexa Voice Service processes the request and identifies the user's intent, then makes a web service request to a third-party server if needed. This request is necessary when the skill has to fetch information from an external service. For example, Flash Briefing skills need to contact a news website or server to fetch the latest headlines. The Alexa service then sends the response back to the device, which reads it aloud to the user.
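On the third-party side, the server receives a JSON request describing the intent and returns a JSON response containing the text to speak. The following is a minimal sketch of such a handler written as a plain Python function. The JSON shape loosely follows the request and response envelope used for custom skills, but treat the field names as assumptions and check them against the official documentation. The slot name `sign`, the function `handle_request`, and the `HOROSCOPES` table with its placeholder text are hypothetical.

```python
# Hypothetical lookup table standing in for an external horoscope service.
# The text is placeholder content, not real data.
HOROSCOPES = {"taurus": "Today is a good day to finish what you started."}


def handle_request(event):
    """Build a skill response for an incoming request envelope (a dict)."""
    request = event.get("request", {})
    is_intent = request.get("type") == "IntentRequest"
    if is_intent:
        # Read the hypothetical "sign" slot; it may be missing or empty.
        slots = request.get("intent", {}).get("slots", {})
        sign = slots.get("sign", {}).get("value") or ""
        text = HOROSCOPES.get(sign.lower(), "Sorry, I don't know that sign.")
    else:
        # Any other request type (for example, opening the skill) gets a prompt.
        text = "Welcome to Daily Horoscopes. Which sign would you like?"
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": text},
            # Keep the session open only when we asked the user a question.
            "shouldEndSession": is_intent,
        },
    }


event = {
    "request": {
        "type": "IntentRequest",
        "intent": {"name": "GetHoroscopeIntent", "slots": {"sign": {"value": "Taurus"}}},
    }
}
print(handle_request(event)["response"]["outputSpeech"]["text"])
```

In a real skill the lookup table would be replaced by a call to whatever external service holds the data, which is the "web service request" step described above.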

