Sometimes during exercise, I want to track how long I can hold a plank or a certain stretch. This situation makes it awkward to use my phone, which inspired me to explore the Web Speech API and build a voice-controlled stopwatch in React. In this article, I'm focusing on adding voice commands to an existing tiny React project. You can follow along with the codebase on GitHub.
Speech Recognition in browsers
The SpeechRecognition interface is part of the experimental Web Speech API, and it makes it easy to implement a voice user interface (VUI) in browsers. Official browser support looks great; in practice, however, it is less optimistic. In my tests across multiple browsers, it has worked reliably only in Chrome and Safari.
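To get a feel for the underlying browser API before we bring React into the picture, here is a minimal sketch using the raw SpeechRecognition interface (note the webkit prefix Chrome still requires):

// Chrome exposes the API under a webkit prefix.
const Recognition = window.SpeechRecognition || window.webkitSpeechRecognition;
const recognition = new Recognition();
recognition.lang = "en-US";
recognition.onresult = (event) => {
  // Log the best transcript of the most recent result.
  const result = event.results[event.results.length - 1];
  console.log(result[0].transcript);
};
recognition.start(); // Prompts for microphone permission on first use.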
Tiny project with voice user interface
The only thing I want to render is the Timer component inside the App, where I will add controls using voice commands. This is what the initial structure looks like:
export function App() {
  const lastCommand = useVoiceCommands();
  return (
    <Timer run={true} />
  );
}

// Placeholder for adding voice commands
function useVoiceCommands() {
  return "";
}
Note the placeholder hook useVoiceCommands, which will be used to add (you guessed it) voice command functionality.
Luckily for me, there is already a package available for speech recognition in React called react-speech-recognition. This package offers an option to plug in a third-party recognition service as a polyfill for full cross-browser support. For this project, I'm happy with Chrome and Safari support, so I will not investigate this option.
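For reference, the package documents an applyPolyfill method for this purpose. A sketch of wiring in the Speechly polyfill, based on the package's README (the app ID is a placeholder you would obtain from the service), might look like this:

import SpeechRecognition from "react-speech-recognition";
import { createSpeechlySpeechRecognition } from "@speechly/speech-recognition-polyfill";

// Placeholder; a real app ID would come from the Speechly dashboard.
const appId = "<your-speechly-app-id>";
SpeechRecognition.applyPolyfill(createSpeechlySpeechRecognition(appId));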
Custom useVoiceCommands hook
Let's start building our useVoiceCommands hook by importing the building blocks from react-speech-recognition.
import SpeechRecognition, { useSpeechRecognition } from "react-speech-recognition";
The react-speech-recognition package exports the useSpeechRecognition hook for reading recognized speech in our component, and the SpeechRecognition object for starting and stopping the recognition service. I want to start speech recognition when the app renders, so I will call startListening inside a React.useEffect hook with an empty dependency array, and include the abortListening method in the cleanup function.
function useVoiceCommands() {
  React.useEffect(() => {
    // Timeout fixes an error during live-refresh in development, when
    // listening attempts to start before it has been aborted in a cleanup.
    setTimeout(
      () => SpeechRecognition.startListening({ continuous: true }),
      500,
    );
    // Stop listening when the component is unmounted.
    return () => {
      SpeechRecognition.abortListening();
    };
  }, []);
  return "";
}
Let's define the commands for the stopwatch:
start - start the timer
pause - pause at the current time
stop / reset - stop and set the stopwatch back to zero
These words will have to be recognized and used in the App to interact with the timer. At the moment our app starts listening to microphone input, but we don't read the output yet, so let's do that next. Add the useSpeechRecognition hook imported from the react-speech-recognition package into the custom useVoiceCommands hook. Its output is an object containing properties that indicate browser support and transcript details.
If you want to learn more, I recommend consulting the API docs.
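As a quick orientation, a few of the properties the hook returns (these names come from the package's documented API):

const {
  transcript,                       // Everything recognized so far
  listening,                        // Whether the microphone is currently active
  browserSupportsSpeechRecognition, // Feature detection for the current browser
  isMicrophoneAvailable,            // Whether microphone permission was granted
} = useSpeechRecognition();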
In this use case, we want to trigger a state change when specific words are recognized, so we can ignore the transcript properties and disable transcript state updates in the useSpeechRecognition hook by passing a configuration object with transcribing set to false:
useSpeechRecognition({ transcribing: false })
💡 Everything indicates that browsers support speech recognition across all major browsers (Firefox behind a flag), but at the time of writing, it operates correctly only in Chrome and Safari.
react-speech-recognition has an excellent feature for exactly what we need. It is called "commands", and we can use it to define specific words we want to listen for, with fuzzy matching and a callback to be triggered upon recognition. In the commands array, we put all the commands we want to be recognized, and as the callback, we set the state value of lastCommand. This is what the final version of the custom useVoiceCommands hook looks like:
function useVoiceCommands() {
  const [lastCommand, setLastCommand] = React.useState("");
  useSpeechRecognition({
    transcribing: false,
    commands: [
      {
        command: [
          "start",
          "stop",
          "reset",
          "pause",
          "pose" /* pose - catches misunderstood pause */,
        ],
        callback: setLastCommand,
        matchInterim: true,
        isFuzzyMatch: true,
      },
    ],
  });
  React.useEffect(() => {
    // Timeout fixes an error during development, when listening attempts
    // to start before it has been aborted in a cleanup.
    setTimeout(
      () => SpeechRecognition.startListening({ continuous: true }),
      500,
    );
    return () => {
      SpeechRecognition.abortListening();
    };
  }, []);
  return lastCommand;
}
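If fuzzy matching fires too eagerly, or not eagerly enough, the package also documents per-command tuning options. A sketch of how they might be applied to our command object:

useSpeechRecognition({
  transcribing: false,
  commands: [
    {
      command: ["start", "stop", "reset", "pause"],
      callback: setLastCommand,
      isFuzzyMatch: true,
      fuzzyMatchingThreshold: 0.8, // 0-1; how closely speech must match (documented default is 0.8)
      bestMatchOnly: true,         // Fire the callback only for the closest matching command
    },
  ],
});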
Using the voice commands in the app
By using the custom hook for getting the voice command, we are keeping the body of the App function component concise and clean. The component will make use of lastCommand in the following ways.
The lastCommand state variable will be used to conditionally set the run prop of the Timer. If its value is start, run will be true, which starts the timer. When we say "pause", the run prop will change to false and the timer will pause, displaying the time that has passed since the start.
The same thing happens when you say "stop" or "reset", but for these commands we will also set the time back to zero. In order to reset the Timer back to zero, we will force the component to re-initialize by adding the key prop and changing its value when either the "stop" or "reset" command is recognized.
export function App() {
  const [resetKey, setResetKey] = React.useState(0);
  const lastCommand = useVoiceCommands();
  React.useEffect(() => {
    if (["reset", "stop"].includes(lastCommand)) {
      setResetKey((x) => x + 1);
    }
  }, [lastCommand]);
  return (
    <Timer key={resetKey} run={lastCommand === "start"} />
  );
}
Changing the value of the key prop forces React to unmount the existing instance and mount a new one, which matches our goal of resetting the timer.
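The Timer component itself is not the focus of this article, but for completeness, here is a minimal sketch of how such a component might work. This is an illustration assuming a simple setInterval-based implementation, not the actual code from the repo:

function Timer({ run }) {
  const [elapsed, setElapsed] = React.useState(0);
  React.useEffect(() => {
    if (!run) return;
    // Resume from the already elapsed time when run flips back to true.
    const startedAt = Date.now() - elapsed;
    const id = setInterval(() => setElapsed(Date.now() - startedAt), 100);
    return () => clearInterval(id);
  }, [run]); // elapsed is intentionally read only when run changes
  return <div>{(elapsed / 1000).toFixed(1)}s</div>;
}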
Now the app utilizes all of the commands we set out to use for this tiny project. Remember when you test this that the indicated browser support for the SpeechRecognition API is not reliable, and it is best to run this project in Chrome. At the moment react-speech-recognition does not expose the error state (I have submitted a PR for this).
To help you check support in your browser, I have added an error listener to the demo of this project, along with a display of debugging information. After you enable microphone permissions and start talking, it will display either the recognized commands or the error code on the screen, so you don't end up talking to your computer for no reason 🙂
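The package exposes the underlying native recognition object via SpeechRecognition.getRecognition(), so one way to attach such an error listener (a sketch; my demo's exact approach may differ) could look like this:

React.useEffect(() => {
  const recognition = SpeechRecognition.getRecognition();
  if (!recognition) return;
  const onError = (event) => {
    // event.error holds codes like "not-allowed" or "network".
    console.error("Speech recognition error:", event.error);
  };
  recognition.addEventListener("error", onError);
  return () => recognition.removeEventListener("error", onError);
}, []);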
Hopefully, this tiny project walkthrough gave you an idea of how to start adding a voice user interface to your own projects. Let me know in the comments if you have questions or feedback on this walkthrough, or what you are going to use speech recognition for. If you see something that could be done better, feel free to submit a PR and I will be happy to post an update.
Useful resources for your projects using speech recognition:
This project on GitHub and the live demo
Cover photo by Jason Rosewell on Unsplash