DEV Community


How To Fill Out A Form With Your Voice

Mads Stoumann ・2 min read

One of my friends is a dermatologist. He has a very busy schedule, seeing up to 60 patients a day. In order to save time, he approached me with a request:

Can you help me make a form, where you fill out the fields using speech recognition? Is that possible?

Yes, indeed it is, but the SpeechRecognition API currently only works in Chrome and Edge (according to MDN, it should also work in Safari 14.1, but I haven't tested that).

Getting started is pretty straightforward:

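/* Use the unprefixed constructor when it exists, otherwise fall back to the webkit-prefixed one. */
window.SpeechRecognition = window.SpeechRecognition || window.webkitSpeechRecognition;
/* Test the value, not the key: after the assignment above, the key always exists on window. */
if (window.SpeechRecognition) { /* It's supported! */ }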
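
If you'd rather not assign to window, the same check can be wrapped in a small function. This is only a sketch: getSpeechRecognition is a hypothetical helper name (not part of the API), and it takes the window object as a parameter so it can be tried against any window-like object.

```javascript
// Hypothetical helper: returns the SpeechRecognition constructor, or null
// when neither the standard nor the webkit-prefixed version is available.
function getSpeechRecognition(win) {
  return win.SpeechRecognition || win.webkitSpeechRecognition || null;
}

// Usage in a browser (commented out so the snippet also loads outside one):
// const Recognition = getSpeechRecognition(window);
// if (Recognition) { /* supported */ } else { /* hide the voice button */ }
```

Returning null instead of undefined makes the "not supported" case explicit for the caller.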

I've chosen to create a speech object that holds everything I need:

let speech = {
  enabled: true,
  listening: false,
  recognition: new window.SpeechRecognition(),
  text: ''
}

/* To listen continuously: */
speech.recognition.continuous = true;
/* To return interim results to a transcript area: */
speech.recognition.interimResults = true;
/* To set the language: */
speech.recognition.lang = 'en-US';
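
Note that new window.SpeechRecognition() throws in browsers without support, so it can help to build the object through a guard. A minimal sketch, assuming a hypothetical factory named createSpeech that receives the constructor (or undefined) and a language tag:

```javascript
// Hypothetical factory: builds the speech object from the article, but
// returns a disabled object instead of throwing when there is no constructor.
function createSpeech(Recognition, lang = 'en-US') {
  if (typeof Recognition !== 'function') {
    return { enabled: false, listening: false, recognition: null, text: '' };
  }
  const recognition = new Recognition();
  recognition.continuous = true;      // keep listening after each result
  recognition.interimResults = true;  // also report non-final results
  recognition.lang = lang;            // language tag, e.g. 'en-US'
  return { enabled: true, listening: false, recognition, text: '' };
}

// Usage in a browser (commented out so the snippet also loads outside one):
// const speech = createSpeech(window.SpeechRecognition || window.webkitSpeechRecognition);
```

The enabled flag can then decide whether the toggle button is shown at all.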

The main eventListener takes the latest result from the list of results and reads the transcript of its first alternative. If the activeElement is either an <input> or a <textarea> and the result is final, the transcript is appended to the value of that field:

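speech.recognition.addEventListener('result', (event) => {
  /* The newest result is the last entry in the list. */
  const audio = event.results[event.results.length - 1];
  /* Each result holds one or more alternatives; [0] is the first one. */
  speech.text = audio[0].transcript;
  const tag = document.activeElement.nodeName;
  if (tag === 'INPUT' || tag === 'TEXTAREA') {
    /* Only write to the field once the engine has settled on a transcript. */
    if (audio.isFinal) {
      document.activeElement.value += speech.text;
    }
  }
  /* `result` is assumed to be the demo's transcript area, reachable by its id. */
  result.innerText = speech.text;
});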
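
The logic in the listener can also be split into two small functions that don't touch the DOM directly, which makes it easier to try out with plain objects. This is just a sketch; readLatest and applyToField are hypothetical names, not part of the API.

```javascript
// Hypothetical helper: picks the newest result and its first alternative.
function readLatest(results) {
  const latest = results[results.length - 1];
  return { text: latest[0].transcript, isFinal: latest.isFinal };
}

// Hypothetical helper: appends a final transcript to an <input> or <textarea>.
// Returns true when the field was updated, false otherwise.
function applyToField(field, latest) {
  const tag = field && field.nodeName;
  if ((tag === 'INPUT' || tag === 'TEXTAREA') && latest.isFinal) {
    field.value += latest.text;
    return true;
  }
  return false;
}

// Usage inside the 'result' listener (commented out, needs a browser):
// const latest = readLatest(event.results);
// applyToField(document.activeElement, latest);
```

Interim results are still available through the returned text, so they can be shown in a transcript area without being written to the form.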

The toggle button simply toggles a class and its innerText, and calls one of:

speech.recognition.start();
/* and */
speech.recognition.stop();
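
The toggle itself could look something like the sketch below. The function name toggleListening, the 'listening' class and the button labels are my assumptions here, so adjust them to your own markup.

```javascript
// Hypothetical toggle handler: flips the listening state, starts or stops
// recognition, and updates the button. Returns the new listening state.
function toggleListening(speech, button) {
  speech.listening = !speech.listening;
  if (speech.listening) {
    speech.recognition.start();
  } else {
    speech.recognition.stop();
  }
  // The second argument forces the class on or off to match the state.
  button.classList.toggle('listening', speech.listening);
  button.innerText = speech.listening ? 'Stop listening' : 'Toggle listening';
  return speech.listening;
}

// Usage in a browser (commented out, `toggleButton` is a hypothetical element):
// toggleButton.addEventListener('click', () => toggleListening(speech, toggleButton));
```

Keeping track of the state in speech.listening avoids calling start() while recognition is already running.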

Now we're ready to click the "Toggle listening" button, focus on a form field, and start talking. Go to this Codepen demo, and remember to allow your microphone to be used.


Pause a bit after a sentence, to allow the engine to process the audio and return a transcript.

There's a lot of room for improvement — maybe you could return a tag-cloud of transcripts, and then click-to-insert the text? What do you think?
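
As a rough starting point for that idea: if you set speech.recognition.maxAlternatives to a value above 1, each result can carry several alternatives, each with a transcript and a confidence. The sketch below (collectAlternatives is a hypothetical name) turns one result into a sorted list you could render as clickable options.

```javascript
// Hypothetical helper: lists the alternatives of a single result,
// most confident first. Only useful when maxAlternatives is above 1.
function collectAlternatives(result) {
  const options = [];
  for (let i = 0; i < result.length; i++) {
    options.push({
      text: result[i].transcript.trim(),
      confidence: result[i].confidence
    });
  }
  return options.sort((a, b) => b.confidence - a.confidence);
}

// Usage inside the 'result' listener (commented out, needs a browser):
// const options = collectAlternatives(event.results[event.results.length - 1]);
```

How many alternatives actually come back is up to the browser's engine, so treat the list as "zero or more" suggestions.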

To be honest, the SpeechRecognition API does feel a little bit shaky, but I'm sure it will improve in the future. I've tested it with various languages, and can confirm it works pretty well with Danish, English and Lithuanian!

Thanks for reading!

Note: Due to browser security-restrictions, the Codepen demo doesn't work when embedded.

Documentation for the API at MDN
