DEV Community

Asaolu Elijah 🧙‍♂️

Speech Recognition With JavaScript

Introduction

Speech recognition is the process of enabling a computer to identify and respond to the sounds produced in human speech.
It was first introduced at Bell Laboratories in 1952, and that early version could recognize only numbers, not words. Over the following years, speech recognition grew from recognizing numbers to recognizing text and grammars, and even detecting noise.
The technology was developed as an alternative to typing on a keyboard: you only have to talk to your computer, and your words appear on the screen.

Web Speech API

In 2012, the Web Speech API was introduced with the aim of enabling speech recognition and text-to-speech conversion in modern web browsers.

Note: Speech recognition is not currently supported in all browsers; check the list of compatible browsers before relying on it.

Getting Started

The first thing we need to do is check whether our browser is compatible with speech recognition.
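The original snippet for this step is not reproduced here, so the following is only a minimal sketch of one way to do the check. The helper name `getSpeechRecognition` is made up for illustration; it looks for the standard `SpeechRecognition` constructor and falls back to the `webkitSpeechRecognition` prefix that some browsers use.

```javascript
// Hypothetical helper: returns the SpeechRecognition constructor if the
// environment exposes one, otherwise null.
// `root` defaults to the global object (window in a browser).
function getSpeechRecognition(root = globalThis) {
  return root.SpeechRecognition || root.webkitSpeechRecognition || null;
}

const SpeechRecognitionImpl = getSpeechRecognition();

if (SpeechRecognitionImpl) {
  console.log("Speech recognition is supported.");
} else {
  console.log("Speech recognition is not supported in this environment.");
}
```

Passing the global object as a parameter is just a convenience so the check can be exercised outside a browser.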

The next step is to create a new speech recognition object and detect when recording starts.
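As a rough sketch of this step (not necessarily the exact code from the original post), we can create the object and attach an `onstart` handler. The wrapper `createRecognition` and the log message are assumptions added for illustration.

```javascript
// Hypothetical wrapper: builds a recognition object and wires up onstart.
// `Recognition` is the constructor exposed by the browser (may be undefined).
function createRecognition(Recognition, onStart) {
  if (!Recognition) return null; // speech recognition is unavailable
  const recognition = new Recognition();
  // Runs when the service has begun listening to incoming audio
  recognition.onstart = onStart;
  return recognition;
}

const recognition = createRecognition(
  globalThis.SpeechRecognition || globalThis.webkitSpeechRecognition,
  () => console.log("We are listening. Try speaking into the microphone.")
);
```

When the browser has no speech recognition support, `recognition` is simply `null`, so later code should check for that before calling anything on it.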

Finally, we start speech recognition and do something with the output.
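A minimal sketch of this last step, assuming we only care about the first result. The `listen` helper and its `onText` callback are illustrative names, not part of the Web Speech API.

```javascript
// Hypothetical helper: attaches an onresult handler that hands the first
// transcript to `onText`, then starts the recognition service.
function listen(recognition, onText) {
  recognition.onresult = (event) => {
    // results[0] is the first result; [0] is its top alternative
    const transcript = event.results[0][0].transcript;
    onText(transcript);
  };
  recognition.start();
}

const Recognition = globalThis.SpeechRecognition || globalThis.webkitSpeechRecognition;
if (Recognition) {
  // For now, just log whatever was recognized
  listen(new Recognition(), (text) => console.log(text));
}
```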

Code Explanation:

  • recognition.onstart : An event handler that runs when the speech recognition service has begun listening to incoming audio.
  • recognition.onresult : Another event handler that runs when the speech recognition service returns a result.
  • recognition.start() : This method starts the speech recognition service, which then begins listening to incoming audio. Running this code for the first time shows a dialog asking for access to your device's microphone.
  • transcript : The text output generated after the speech recognition service has stopped, and it's all we need from the code we've written so far. For now, we are just logging the output to the console; you can choose to do something else with it.

There are more properties, methods and event handlers available on the speech recognition object, some of which include:

  • recognition.grammars : Used to set the grammars that will be understood by the speech recognition service.
  • recognition.continuous : Boolean to set whether continuous results are returned for each recognition, or only a single result.
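To illustrate `continuous`, here is a small sketch that keeps listening and joins every result returned so far into one string. The `collectTranscript` helper and the variable names are assumptions made for this example.

```javascript
// Hypothetical helper: joins the top alternative of every result into one string.
function collectTranscript(results) {
  let text = "";
  for (let i = 0; i < results.length; i++) {
    text += results[i][0].transcript;
  }
  return text;
}

const RecognitionCtor = globalThis.SpeechRecognition || globalThis.webkitSpeechRecognition;
if (RecognitionCtor) {
  const recognizer = new RecognitionCtor();
  recognizer.continuous = true; // keep returning results instead of a single one
  recognizer.onresult = (event) => console.log(collectTranscript(event.results));
  recognizer.start();
}
```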

See the API documentation for the full list of supported methods, properties and event handlers.

Sayit 🗣

I recently built a progressive web app (utilizing speech recognition) that converts spoken words to text and provides a button to instantly share the text across various social media platforms.
This project could be handy when you want to send a lengthy email or post on social media.
View the project live here and, if you think it's cool, kindly give it a star on GitHub (contributions are also welcome 🤗).

Conclusion

+1 for Accessibility

Speech recognition has played a great role in accessibility over the past few years, especially for visually impaired people, people with an injured arm, and many more. When typing on a keyboard is not an option, they can rely on their voice to control and navigate applications and web pages.


Project Idea

If you are really into speech recognition (like I am), how about building a web page that is fully controlled by voice rather than by clicking or swiping? For example, from the index page, I could just say "go to about page" and be redirected to the about page. Sounds cool? Yeah! I would love to see what you build; you can send me a message on Twitter, and I will gladly answer your questions.
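As a starting point for that idea, here is a rough sketch of turning a spoken phrase into a page path. The route table, the `matchRoute` helper and the phrase pattern are all made up for illustration; a real app would need its own pages and more forgiving matching.

```javascript
// Hypothetical route table: spoken page name -> URL path.
const routes = new Map([
  ["home", "/"],
  ["about", "/about"],
  ["contact", "/contact"],
]);

// Returns the path for a phrase like "go to about page", or null if nothing matches.
function matchRoute(transcript, table = routes) {
  const match = transcript.toLowerCase().match(/go to (?:the )?(\w+)/);
  if (!match) return null;
  return table.get(match[1]) ?? null;
}

const VoiceCtor = globalThis.SpeechRecognition || globalThis.webkitSpeechRecognition;
if (VoiceCtor && globalThis.location) {
  const voice = new VoiceCtor();
  voice.onresult = (event) => {
    const path = matchRoute(event.results[0][0].transcript);
    if (path) globalThis.location.assign(path); // navigate to the matched page
  };
  voice.start();
}
```

Keeping the phrase matching in a plain function makes it easy to try out with typed strings before wiring it to the microphone.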


P.S.: I'm looking to make new dev friends 🤗, let's connect on Twitter.

Thanks for reading 👏

Top comments (11)

PatricNox

Nice! Very neat and simple.

Should include a practical example where you show the results given through a test.

Asaolu Elijah 🧙‍♂️

Thanks Patrick, I've just added a subsection explaining the result.
And here I showed a practical example implementing speech recognition.

Sharjeel Faiq

This is a great post! Can you show me how to use this in a React functional component? I don't really have an idea how to capture my text area's value inside a JS variable.

Below is the onClick function that I'm using to invoke the record() function from a button in React.

Speech to text

const record = () => {
  window.recognition.onresult = function (event) {
    console.log(event);
    let output = document.getElementById("output");
    output.innerHTML = "";

    // Append the top transcript of every result to the output element
    for (let i = 0; i < event.results.length; i++) {
      output.innerHTML = output.innerHTML + event.results[i][0].transcript;
    }
  };
  window.recognition.start();
};

Joel Olawanle

This is nice sir

Asaolu Elijah 🧙‍♂️

Thank you sire ☺️

mihai

can it generate sounds, of different length ways?

Asaolu Elijah 🧙‍♂️

Not really Mihai, this is basically to recognize spoken words and not generate. If you are looking to convert text to speech, here is a great tutorial for that.

Tushar Sahay

Please visit covid.tusharsahay.com to see a practical example where I have used this API. Let me know how you find it! :-)

Asaolu Elijah 🧙‍♂️

Wow, this is amazing Tushar.
Really nice!
Keep up the good work 🎉

Asaolu Elijah 🧙‍♂️

Glad you found it super useful ❤️

muteenk

Does anyone know how to use the API in the Firefox browser? The onresult event function is not executing for some reason.