Over the years, human-computer interaction (HCI) has steadily evolved to allow more intuitive and natural interactions between humans and machines. Before the late 1970s, interacting with a computer meant typing text commands, which largely limited computing to specialists. Then, the introduction of graphical user interfaces (GUIs) allowed people who weren’t specialists to interact with computers using visual elements such as icons and menus. Today, the possibilities for human-computer interaction continue to widen with the growing use of voice user interfaces (VUIs).
VUIs allow people to interact with computers using voice commands. VUIs are enriched by conversational artificial intelligence (AI), which helps make our voice interactions feel as natural as conversations. There are many implementations and use cases of conversational AI, in industries ranging from healthcare to finance. Amazon’s Alexa is one popular implementation of conversational AI.
Today, we’ll discuss how you can contribute to a new generation of voice apps by building your own skills for Amazon Alexa. Whether you want to be a full-time Alexa developer, build a project to impress at Amazon interviews, or create fun tools to share with your family and friends, this is a great place to start.
We’ll cover:
- Building Alexa skills 101
- Best practices of voice design
- Basics of building Alexa skills
- Additional considerations
- Wrapping up and next steps
Alexa is a virtual assistant that comes with pre-built functions such as weather updates and alarms. You can also create new Alexa skills that users can find and enable on the Alexa Skills Store. Your skills will change the way people interact with smart speakers such as the Echo Dot and smart displays such as the Echo Show.
You don’t need to be a specialist in conversational AI to develop new Alexa skills. There are various tools and software development kits (SDKs) available to support Alexa skill development. The official SDK for Amazon Alexa skills is the Alexa Skills Kit (ASK). ASK provides self-service APIs as well as other tools that you can use to create your own skill in the Amazon developer console. This includes Alexa skill blueprints, which offer templates for common skills like trivia and games. Alternatively, you can choose the “Start from scratch” template.
To host your backend, you can either use Alexa-hosted skills or provision your own resources. If you opt for Alexa-hosted skills, you can choose between Python and Node.js, and you’ll be able to deploy directly to AWS Lambda from the Amazon developer console.
Here’s what you’ll need to start building your own custom Alexa skills:
- Knowledge of web basics (e.g., JSON and HTTP)
- Amazon developer account
You’ll also need to know some voice design best practices and Alexa basics, but we’ll cover some of that next.
Compared to apps built for the screen, voice-powered apps have their own set of best practices. Voice services should be inspired by qualities of real-world conversations to ensure an intuitive and engaging user experience.
Amazon recommends following four design principles when building Alexa skills: be adaptable, be personal, be available, and be relatable.
People use different words to express the same ideas or intentions. When processing a person’s response, a voice application should be able to consider the many different paths that can yield the same results.
Your voice assistant may be a kind of robot, but it’s serving humans. As such, it should strive toward more human standards. If someone always responded to you with the same phrases, you’d quickly lose interest in the conversation. To keep the user experience engaging, you’ll need to personalize your voice app with a variety of responses.
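One simple way to vary responses is to pick from a small pool of phrasings and avoid repeating the one you used last. The sketch below is only an illustration of that idea: the greetings and the `pick_greeting` helper are made up for a hypothetical trivia skill and aren’t part of ASK.

```python
import random

# Hypothetical greetings for a trivia skill; the wording is illustrative only.
GREETINGS = [
    "Welcome back! Ready for some trivia?",
    "Good to hear from you. Want to play a round?",
    "Hi there! Let's see what you know today.",
]


def pick_greeting(last_used=None):
    """Return a random greeting, skipping the one used last time if possible."""
    candidates = [g for g in GREETINGS if g != last_used] or GREETINGS
    return random.choice(candidates)


first = pick_greeting()
second = pick_greeting(last_used=first)
print(first)
print(second)
```

In a real skill, you’d need to store the last phrasing somewhere (for example, in session data) so it survives between turns.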
Voice services should make it clear what options are available to their users. However, you don’t want to overload users with an entire menu of options. Being available means you offer users their available options, while keeping things as simple and brief as possible.
Just as in conversations between people, you want your voice app to feel human and relatable. To do this, you need to make sure that the app’s responses sound natural and conversational. Strategies that help include using contractions (e.g., “can’t” instead of “cannot”) and transition words (e.g., “first”).
To start building your custom Alexa skill, you’ll define an interaction model using the following components:
- Wake word: A wake word is the word that wakes your virtual assistant. In the case of Alexa, it’s “Alexa.”
- Invocation name: Once your virtual assistant is activated and listening, an invocation name tells Alexa which skill you want to open.
- Utterances: Utterances are the statements or questions you express to Alexa. Since many different utterances can express the same thing, you should expect variation in utterances, per the “Be adaptable” design principle.
- Intent: Intent is the action that Alexa should take in response to the utterance.
- Slot: Slots are variable inputs in utterances. They’re useful for cases like numbers, where it would be tedious and unreasonable to enumerate every possible utterance a user could say to Alexa (e.g., zero to infinity).
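To see how these components fit together, here’s a rough sketch of an interaction model written as a Python dictionary and printed as JSON, similar in shape to what you’d see in the developer console’s JSON editor. The invocation name `quick trivia`, the `StartGameIntent` intent, and the `questionCount` slot are all hypothetical names chosen for this example.

```python
import json

# A sketch of an interaction model for a hypothetical trivia skill.
# "quick trivia", "StartGameIntent", and "questionCount" are made-up names.
interaction_model = {
    "interactionModel": {
        "languageModel": {
            # What users say after the wake word: "Alexa, open quick trivia"
            "invocationName": "quick trivia",
            "intents": [
                {
                    "name": "StartGameIntent",
                    # A slot captures the variable part of an utterance.
                    "slots": [{"name": "questionCount", "type": "AMAZON.NUMBER"}],
                    # Several utterances map to the same intent ("Be adaptable").
                    "samples": [
                        "start a game",
                        "let's play",
                        "ask me {questionCount} questions",
                        "play a round with {questionCount} questions",
                    ],
                },
                # Built-in intents come with their own utterances.
                {"name": "AMAZON.HelpIntent", "samples": []},
                {"name": "AMAZON.StopIntent", "samples": []},
            ],
            "types": [],
        }
    }
}


def sample_utterances(model, intent_name):
    """Return the sample utterances registered for one intent."""
    for intent in model["interactionModel"]["languageModel"]["intents"]:
        if intent["name"] == intent_name:
            return intent.get("samples", [])
    return []


print(json.dumps(interaction_model, indent=2))
print(sample_utterances(interaction_model, "StartGameIntent"))
```

Notice that the slot lets a single sample utterance stand in for every possible number, so you don’t have to list them all.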
ASK has streamlined the Alexa skill building process. However, you’ll still need to work through several steps to ensure a good-quality voice service, including:
- Planning (user experience, intents, etc.)
- Creating the VUI
- Coding the backend logic
- Testing (from the Amazon developer console or a device)
- Store listing and certification
- Making improvements based on user feedback
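To give a feel for the backend and testing steps, here’s a minimal sketch of a Lambda-style handler that reads the request JSON Alexa sends and returns a spoken response. It deliberately uses plain dictionaries rather than the ASK SDK so that it runs anywhere; the `StartGameIntent` name and all of the response wording are assumptions for a hypothetical trivia skill.

```python
def build_response(speech_text, end_session=False):
    """Wrap plain text in the JSON envelope Alexa expects from a skill."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": speech_text},
            "shouldEndSession": end_session,
        },
    }


def lambda_handler(event, context=None):
    """Route an incoming Alexa request to a response (sketch only)."""
    request = event.get("request", {})
    request_type = request.get("type")

    if request_type == "LaunchRequest":
        # The user opened the skill without asking for anything specific.
        return build_response("Welcome to quick trivia. Say start a game when you're ready.")

    if request_type == "IntentRequest":
        intent_name = request.get("intent", {}).get("name")
        if intent_name == "StartGameIntent":  # hypothetical custom intent
            return build_response("Great, here's your first question.")
        if intent_name in ("AMAZON.StopIntent", "AMAZON.CancelIntent"):
            return build_response("Thanks for playing. Goodbye!", end_session=True)
        return build_response("Sorry, I didn't catch that. Say start a game to play.")

    # Other request types (such as SessionEndedRequest) get no speech here.
    return {"version": "1.0", "response": {}}


# A quick local check with a hand-written request, before testing in the console.
launch_event = {"request": {"type": "LaunchRequest"}}
print(lambda_handler(launch_event)["response"]["outputSpeech"]["text"])
```

Calling the handler with hand-written events like this is a quick way to catch routing mistakes before you move on to the developer console or a device.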
Remember, your skill is providing a user experience. The better it is at anticipating your users’ needs, the more successful it’ll be. Weigh every design choice against what’s best for the user experience. Let’s consider some of the thought that goes into the building process by focusing on scriptwriting for the user experience.
When you’re building a skill from scratch, the building process will involve writing scripts. Scriptwriting can be fun. In keeping with the voice design principles, you can personalize your script to give Alexa a bit of personality. However, be sure you don’t lose sight of what’s best for the user experience. Even if you get creative, Alexa’s interactions should still be relevant and reasonably brief.
When writing scripts, you’ll certainly plan for the happy path, which is the ideal interaction you imagine between a user and your skill. However, you should also plan for edge cases, which are everything outside of the happy path. You may not expect them, but you should still plan for them. To prepare for edge cases, you’ll need to consider the various types of utterances that your skill may receive, and which of them require a specific rather than a general response.
Some edge cases may be questions that are entirely irrelevant, while others may be help cases in which your user needs assistance. For instance, if they ask Alexa, “What’s your favorite fruit?” it’s not a bad idea for Alexa to give a general response such as, “Sorry, I’m a trivia game. Tell me when you’re ready to start playing.” But if the user asks, “Can you help me?” it might be helpful to prepare a specific response that clarifies something for the user, such as, “This is a trivia game. Tell me when you’re ready to start playing, or say ‘Stop’ to exit.”
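One way to keep a script like this organized is a simple lookup table: specific responses for the intents you planned for, and a general response for everything else. The sketch below uses the two example responses above. `AMAZON.HelpIntent` is one of Alexa’s built-in intents, while `StartGameIntent`, its wording, and the `respond_to` helper are hypothetical.

```python
# Specific response for the help case, taken from the example script above.
HELP_TEXT = (
    "This is a trivia game. Tell me when you're ready to start playing, "
    "or say 'Stop' to exit."
)

# General response for anything outside the script, such as irrelevant questions.
FALLBACK_TEXT = "Sorry, I'm a trivia game. Tell me when you're ready to start playing."

# Intent name -> scripted response. "StartGameIntent" is a made-up custom intent.
SCRIPT = {
    "StartGameIntent": "Great, let's begin. Here's your first question.",  # happy path
    "AMAZON.HelpIntent": HELP_TEXT,  # edge case that needs a specific answer
}


def respond_to(intent_name):
    """Return the scripted line for an intent, or the general fallback."""
    return SCRIPT.get(intent_name, FALLBACK_TEXT)


print(respond_to("AMAZON.HelpIntent"))
print(respond_to("FavoriteFruitIntent"))  # not in the script, so falls back
```

Keeping the fallback separate from the table makes it easy to spot which edge cases get their own response and which share the general one.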
It’s always exciting to gain new skills, and learning to build voice apps is no exception! Building Alexa skills will open the door for you to experiment with next-gen voice-powered apps. Whether you become a dedicated Alexa developer or make a hobby out of creating skills, there’s a lot you can do with the help of the Alexa Skills Kit.
For a deeper exploration of building Alexa skills, check out our course Alexa Skills 101: Building voice apps for Alexa. You’ll learn more about voice design and how VUIs can enhance your apps. The course includes a step-by-step tutorial on the Alexa skill building process, quizzes to test your knowledge, and a hands-on mini project.