DEV Community

Samuel Collier

Make A Simple Voice Assistant with JavaScript

Hey everyone!

Ever since I started programming, for some reason, I always thought it would be so cool to program my very own voice assistant. As it turns out, it's not that hard, and I'll show you how to very easily create one!

Disclaimer: the browser compatibility for this project has only been tested on Chrome, so there may be some compatibility issues on other browsers and mobile devices.

Okay, so first, let's start with a basic setup of our project. Let's create 3 files, index.html, style.css, and script.js. If you're using Replit for this project, which I highly recommend, these three files should already be created with the HTML/CSS/JS template.

The style.css and script.js files should be empty for now, but put this HTML snippet in the HTML file if it's not there already:

<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8">
    <meta name="viewport" content="width=device-width">
    <title>Voice Assistant</title>
    <link href="style.css" rel="stylesheet" type="text/css" />
  </head>
  <body>

    <script src="script.js"></script>

  </body>
</html>

Next, let's set up the frontend elements we need for this voice assistant. Since most of the work happens in JavaScript, we won't need much. We'll only need 3 elements:

  1. A button for the user to click to make the voice assistant start listening, with an id of "listen-button". When the user clicks the button, we will call the function listen(), which we haven't defined yet, but I'll get to that later.
  2. An input area to display the speech-to-text text that we are speaking, with an id of "input-area"
  3. An output area to display the result of the voice assistant, with an id of "output-area"

We'll put all 3 of these elements inside a div, and the finished HTML file should look like this:

<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8">
    <meta name="viewport" content="width=device-width">
    <title>Voice Assistant</title>
    <link href="style.css" rel="stylesheet" type="text/css" />
  </head>
  <body>

    <div id="main-container">
        <p id="input-area">...</p>
        <p id="output-area">...</p>
        <button id="listen-button" onclick="listen()">Listen</button>
    </div>

    <script src="script.js"></script>

  </body>
</html>

Since the items are a little jumbled together with no styling, let's just put this basic piece of code in the CSS file:

#main-container {
  text-align: center;
  border: 1px solid black;
  padding: 1em;
}

This should be your result so far:

[Screenshot: the finished HTML page]

I get that the page still looks trashy without proper CSS styling, but I'm not going to get into that in this tutorial. I'm sure there are plenty of CSS tutorials out there if you would like to make your voice assistant look better.

Now that the HTML is out of the way, let's get into the fun stuff: actually making the voice assistant work.

The first part of the voice assistant that we need is some way to get the computer to listen to us, receive microphone input, then turn that speech into text. This would normally be very complicated, but thankfully, we have an API (Application Programming Interface) that can do this very easily for us, called the Web Speech API.
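Since this API isn't available in every browser, it can help to check for it before using it. Here's a minimal sketch of that check. The helper name getSpeechRecognition is my own (it's not part of the API), and it assumes the browser exposes either the standard SpeechRecognition name or the prefixed webkitSpeechRecognition name that Chrome uses:

```javascript
// Hypothetical helper: returns the speech recognition constructor if the
// environment provides one, or null if it doesn't.
// `root` is the global object (window in a browser).
function getSpeechRecognition(root) {
  return root.SpeechRecognition || root.webkitSpeechRecognition || null;
}

// globalThis is `window` in a browser, so this line works there too.
const Recognition = getSpeechRecognition(globalThis);

if (Recognition) {
  console.log("Speech recognition is available.");
} else {
  console.log("Speech recognition is not supported here.");
}
```

In the page itself, you could show the "not supported" message in the output area instead of logging it.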

So, to use this, let's first create a function in the script.js file called listen(). We'll call this function when the user clicks the Listen button that we created earlier in the HTML.

function listen() {

}

Inside of that function, we'll create an easy way to access our HTML elements:

function listen() {
    let inputArea = document.getElementById('input-area')
    let outputArea = document.getElementById('output-area')
}

And set up our speech recognition:

function listen() {
    let inputArea = document.getElementById('input-area')
    let outputArea = document.getElementById('output-area')

    var recognition = new webkitSpeechRecognition();
    recognition.lang = "en-GB";
    recognition.start();
}

Then, we will check for a result, and when the recognition gets a result, we'll store that data inside a variable called transcript and then display that data to the inputArea that we created in the HTML.

Here's what that would look like:

function listen() {
  let inputArea = document.getElementById('input-area')
  let outputArea = document.getElementById('output-area')

  var recognition = new webkitSpeechRecognition();
  recognition.lang = "en-GB";
  recognition.start();

  recognition.onresult = function(event) {
    let transcript = event.results[0][0].transcript;
    inputArea.innerHTML = transcript;
  }
}
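If the event.results[0][0] part looks a bit cryptic: the first index picks the first recognized result, and the second index picks the top alternative for that result. Each alternative has a transcript and a confidence value. Here's a small sketch that pulls both out. The bestAlternative helper and the fakeEvent object are made up for illustration, so you can try it without a microphone:

```javascript
// Hypothetical helper: picks the top alternative of the first result
// from a speech recognition result event.
function bestAlternative(event) {
  const firstResult = event.results[0]; // the first recognized result
  const top = firstResult[0];           // its top alternative
  return { transcript: top.transcript, confidence: top.confidence };
}

// A hand-made stand-in for a real event, shaped like event.results.
const fakeEvent = {
  results: [[{ transcript: "hello", confidence: 0.92 }]]
};

const best = bestAlternative(fakeEvent);
console.log(best.transcript, best.confidence);
```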

Now, let's run this program and see what happens. But please note: the program will not run properly in an iframe or anything else that isn't a full browser window. The API needs to access your microphone through the browser, so please open it in a new tab.
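If nothing happens when you click the button, the recognition object's error event can tell you why. The event carries an error code such as "not-allowed" (microphone access was blocked) or "no-speech". Here's a rough sketch that turns those codes into readable messages. The describeRecognitionError name and the message wording are my own, and the list only covers a few of the codes:

```javascript
// A few of the error codes the recognition "error" event can report,
// mapped to messages (the wording here is just a suggestion).
const RECOGNITION_ERRORS = new Map([
  ["not-allowed", "Microphone access was blocked. Please allow it and try again."],
  ["no-speech", "I didn't hear anything. Please try again."],
  ["audio-capture", "No microphone could be used."],
  ["network", "A network problem interrupted speech recognition."]
]);

// Hypothetical helper: returns a readable message for an error code.
function describeRecognitionError(code) {
  return RECOGNITION_ERRORS.get(code) || "Speech recognition failed (" + code + ").";
}

// In the browser, inside listen(), this could be wired up like so:
// recognition.onerror = function(event) {
//   outputArea.textContent = describeRecognitionError(event.error);
// };

console.log(describeRecognitionError("not-allowed"));
```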

Okay, so here's what should happen if you did everything correctly:

If you open the project in a new tab and click the "Listen" button, you should get this prompt:
[Screenshot: microphone permission prompt]

Click "Allow," and then try speaking! Say "Hello!"
The program should display the result like so:

[Screenshot: the recognition result]

Awesome! The program can show what we're saying on the screen! However, this is only half of the voice assistant. The voice assistant should take the information of what we said and then do something: reply to us, give us information, etc.

This is very easy to add! Since we have the text stored in the transcript variable, let's just create a simple if statement, let's say, to check if we said "hello," like this:

if (transcript == "hello") {
    outputArea.innerHTML = "Hello, User!"
}

Now, we can place that code right here, in the recognition.onresult function:

  recognition.onresult = function(event) {
    let transcript = event.results[0][0].transcript;
    if (transcript == "hello") {
      outputArea.innerHTML = "Hello, User!"
    }
    inputArea.innerHTML = transcript;
  }

So, now if we say "hello," the program should output "Hello, User!"

[Screenshot: the voice assistant's output]

This is great, but what if someone said, "Hello voice assistant," or something else that included the word "hello"? Our voice assistant wouldn't understand, because it's only programmed to respond if the user says exactly "hello." However, JavaScript strings have a handy method called includes() that checks whether a string contains a given piece of text. Thus, instead, we can do this:

 if (transcript.includes("hello")) {
      outputArea.innerHTML = "Hello, User!"
 }

Now, if the user says something that includes the word "hello," the program will output "Hello, User!" Great, right?
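One thing to watch out for: includes() is case-sensitive, and it matches pieces of words too. If the recognizer hands back "Hello" with a capital H, the check fails, and a word like "Othello" would count as a greeting. Here's one possible way to handle both, as a sketch. The saidWord helper is my own invention, and it assumes plain English words:

```javascript
// Hypothetical helper: true if `word` was spoken as a whole word,
// ignoring capitalization and punctuation.
function saidWord(transcript, word) {
  const words = transcript
    .toLowerCase()       // "Hello" and "hello" should match
    .split(/[^a-z']+/)   // split on anything that isn't a letter or apostrophe
    .filter(Boolean);    // drop empty strings left over from the split
  return words.includes(word);
}

console.log(saidWord("Hello voice assistant", "hello"));
console.log(saidWord("Othello is a play", "hello"));
```

The first call logs true and the second logs false, while a plain includes("hello") on the lowercased text would match both.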

Now, with what we've learned so far, let's create two more conditionals: one to give us the weather, and another one to alert us if the program doesn't know what we're trying to say, because currently, the program just does nothing if it doesn't know what we're saying.

For the weather conditional, we'll use an else if statement below the if statement, to open a weather website if the user wants the weather. We can do that like so:

if (transcript.includes("hello")) {
    outputArea.innerHTML = "Hello, User!"
} else if (transcript.includes("weather")) {
    window.open("https://www.google.com/search?q=weather")
} else {
    outputArea.innerHTML = "I don't know what you mean."
}
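As the if/else chain grows, one option is to keep the commands in a list and loop over it. This is only a sketch of that idea, not the only way to do it. The commands array, the handleTranscript function, and the "Opening the weather..." message are all my own additions for illustration:

```javascript
// Each command has a keyword to look for and a reply to give.
// A reply may also carry a url for the page to open.
const commands = [
  { keyword: "hello", reply: { text: "Hello, User!" } },
  {
    keyword: "weather",
    reply: {
      text: "Opening the weather...", // illustrative wording
      url: "https://www.google.com/search?q=weather"
    }
  }
];

// Returns the reply for the first command whose keyword appears in the
// transcript, or a fallback reply if nothing matches.
function handleTranscript(transcript) {
  const spoken = transcript.toLowerCase();
  const match = commands.find((command) => spoken.includes(command.keyword));
  return match ? match.reply : { text: "I don't know what you mean." };
}

// In the browser, inside recognition.onresult, this could be used like so:
// const reply = handleTranscript(transcript);
// outputArea.textContent = reply.text;
// if (reply.url) window.open(reply.url);

console.log(handleTranscript("hello there").text);
```

Adding a new command then just means adding one more entry to the list.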

This voice assistant is really coming along! However, I'm going to end the tutorial here. There are still a lot of things you can do, though. Here's a list of features you can add!

Thanks for reading this tutorial, and I hope you learned something! Happy Coding!!

Top comments (2)

Hudson Gouge • Edited

This only works on Chrome, Edge, and Safari.
Opera, IE, and Firefox do not support this API

Samuel Collier • Edited

Yes, here is the full list for browser compatibility:

dev-to-uploads.s3.amazonaws.com/up...