DEV Community

Dilek Karasoy for Picovoice

React Native Speech Recognition Tutorial

Google's and Apple's native apps process voice data on the device, but neither company offers that capability to other developers. Luckily, Picovoice does. On day 40, we'll go over how to process voice data on-device with the Picovoice React Native SDK. The SDK combines the Porcupine Wake Word and Rhino Speech-to-Intent engines, enabling commands like "Alexa, set timer for 5 minutes", but even better: we'll use a custom hotword instead of Alexa, and voice commands will be processed locally, with no network round trip adding latency.

1. Install the latest Picovoice packages:
npm i @picovoice/react-native-voice-processor
npm i @picovoice/porcupine-react-native
npm i @picovoice/rhino-react-native
npm i @picovoice/picovoice-react-native

2. Initialize the Speech Recognition Platform
First, grab your AccessKey for free from the Picovoice Console.
To keep things straightforward, we’re going to use the pre-trained hotword model Pico Clock and the pre-trained context model Clock for this tutorial. You can download pre-trained models here. However, you can also train custom wake words and contexts on the Picovoice Console.

Now you should have an AccessKey, a Porcupine model (.ppn file), and a Rhino model (.rhn file).
Let's initialize a PicovoiceManager in your React Native app.

import {PicovoiceManager} from '@picovoice/picovoice-react-native';

async createPicovoiceManager() {
    const accessKey = "..."; // your Picovoice AccessKey
    try {
        this._picovoiceManager = await PicovoiceManager.create(
            accessKey,
            '/path/to/keyword.ppn',
            this._wakeWordCallback,
            '/path/to/context.rhn',
            this._inferenceCallback,
            (error) => {
              this._errorCallback(error.message);
            }
        );
    } catch (err) {
        // handle error
    }
}

_wakeWordCallback() {
    // wake word detected!
}

_inferenceCallback(inference) {
    // `inference` has the following fields:
    // (1) isUnderstood
    // (2) intent
    // (3) slots      
}
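
To make the three fields concrete, here is a minimal sketch of what an inference for "set timer for 5 minutes" might look like, plus a small logging helper. The sample values are illustrative rather than captured from a device, and `describeInference` is a hypothetical helper, not part of the Picovoice SDK.

```javascript
// Illustrative inference object (assumed values, not real device output).
const sampleInference = {
  isUnderstood: true,
  intent: 'setTimer',
  slots: { minutes: '5' },
};

// Hypothetical helper: summarize an inference as a one-line string for logging.
function describeInference(inference) {
  if (!inference.isUnderstood) {
    return 'Command not understood';
  }
  // Format each slot as name=value; slots may be empty for simple intents.
  const slots = Object.entries(inference.slots || {})
    .map(([name, value]) => `${name}=${value}`)
    .join(', ');
  return slots ? `${inference.intent} (${slots})` : inference.intent;
}
```

Calling `describeInference(sampleInference)` inside `_inferenceCallback` is one way to check what the engine heard before wiring up any UI.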

3. Get Permission to Record Audio
To get permission on iOS, open your Info.plist and add the following entry:

<key>NSMicrophoneUsageDescription</key>
<string>[Permission explanation]</string>

For Android, open your AndroidManifest.xml and add the following line:

<uses-permission android:name="android.permission.RECORD_AUDIO" />

Then check for permission before proceeding with audio capture:

// at the top of the file
import {Platform, PermissionsAndroid} from 'react-native';

let recordAudioRequest;
if (Platform.OS === 'android') {
    // For Android, we need to explicitly ask
    recordAudioRequest = this._requestRecordAudioPermission();
} else {
    // iOS automatically asks for permission
    recordAudioRequest = Promise.resolve(true);
}

recordAudioRequest.then((hasPermission) => {
    if (!hasPermission) {
        console.error('Required microphone permission was not granted.');
        return;
    }

    // start feeding Picovoice
    this._picovoiceManager?.start().then((didStart) => {
        if (didStart) {
            // let app know we're ready to go
        }
    });
});

async _requestRecordAudioPermission() {
    const granted = await PermissionsAndroid.request(
        PermissionsAndroid.PERMISSIONS.RECORD_AUDIO,
        {
            title: 'Microphone Permission',
            message: '[Permission explanation]',
            buttonNeutral: 'Ask Me Later',
            buttonNegative: 'Cancel',
            buttonPositive: 'OK',
        }
    );
    return granted === PermissionsAndroid.RESULTS.GRANTED;
}

Once .start() is called, Picovoice listens for the hotword “PicoClock” and any follow-up commands.

4. Controlling the App With Voice Inputs
In the source code, you can find a simple clock app with three main components: a clock that shows the time, a timer, and a stopwatch.
Let's connect these three components to the Voice User Interface (VUI):

_wakeWordCallback(keywordIndex){    
  // turn mic blue to show we're listening
  this.setState({    
    isListening: true
  });
}

_inferenceCallback(inference) {
  var tab = this.state.activeTab;  
  if (inference.isUnderstood) {         
    if (inference.intent === 'clock') {
      // show clock
      tab = 'clock';
    } else if (inference.intent === 'timer') {
      // control timer operation
      this._performTimerCommand(inference.slots);
      tab = 'timer';
    } else if (inference.intent === 'setTimer') {
      // set timer duration
      this._setTimer(inference.slots);
      tab = 'timer';
    } else if (inference.intent === 'alarm') {
      // control alarm operation
      this._performAlarmCommand(inference.slots);
      tab = 'clock';
    } else if (inference.intent === 'setAlarm') {
      // set alarm time
      this._setAlarm(inference.slots);
      tab = 'clock';
    } else if (inference.intent === 'stopwatch') {
      // control stopwatch operation
      this._performStopwatchCommand(inference.slots);
      tab = 'stopwatch';
    }
  }

  // change active tab and show we've stopped listening
  this.setState({
    activeTab: tab,
    isListening: false,
  });
}
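
The intent-to-tab part of the callback above can also be expressed as a lookup table, which may be easier to extend as the context grows. This is only a sketch of an alternative: `INTENT_TO_TAB` and `tabForInference` are hypothetical names, and the mapping simply mirrors the branches shown above.

```javascript
// Hypothetical lookup table mirroring the if/else chain: intent -> tab to show.
const INTENT_TO_TAB = {
  clock: 'clock',
  timer: 'timer',
  setTimer: 'timer',
  alarm: 'clock',
  setAlarm: 'clock',
  stopwatch: 'stopwatch',
};

// Returns the tab to activate, or the current tab when the command
// was not understood or the intent is not in the table.
function tabForInference(inference, currentTab) {
  if (!inference.isUnderstood) {
    return currentTab;
  }
  const known = Object.prototype.hasOwnProperty.call(INTENT_TO_TAB, inference.intent);
  return known ? INTENT_TO_TAB[inference.intent] : currentTab;
}
```

The per-intent actions (`_setTimer`, `_performTimerCommand`, and so on) would still need their own dispatch; the table only covers which tab to display.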

Then connect each intent to a specific action taken in the app and pass in the intent’s slots as arguments.

_setTimer(slots) {
  var hours = 0;
  var minutes = 0;
  var seconds = 0;

  // parse duration
  if (slots['hours'] != null) {
    hours = Number.parseInt(slots['hours'], 10);
  }
  if (slots['minutes'] != null) {
    minutes = Number.parseInt(slots['minutes'], 10);
  }
  if (slots['seconds'] != null) {
    seconds = Number.parseInt(slots['seconds'], 10);
  }

  // set timer
  this.setState({
    timerCurrentTime: moment.duration({
      hour: hours,
      minute: minutes,
      second: seconds,
      millisecond: 0,
    }),
    isTimerRunning: true,
  });
}
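
The slot parsing above can be pulled into a small pure function, which makes it easy to check without a running app. The sketch below assumes the slots arrive as numeric strings keyed by `hours`, `minutes`, and `seconds`, as in `_setTimer`; `parseDurationSlots` is a hypothetical helper name.

```javascript
// Hypothetical helper: convert duration slots into numbers plus a total in seconds.
function parseDurationSlots(slots) {
  // Missing or non-numeric slot values fall back to 0.
  const toInt = (value) => {
    const n = Number.parseInt(value, 10);
    return Number.isNaN(n) ? 0 : n;
  };

  const hours = toInt(slots.hours);
  const minutes = toInt(slots.minutes);
  const seconds = toInt(slots.seconds);

  return {
    hours,
    minutes,
    seconds,
    totalSeconds: hours * 3600 + minutes * 60 + seconds,
  };
}
```

For example, `parseDurationSlots({ minutes: '5' }).totalSeconds` evaluates to 300, and the returned `hours`, `minutes`, and `seconds` can be passed to `moment.duration` as before.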

Voila! Once you connect all the functions with the VUI, you have a hands-free and cross-platform clock app.

Resources:
Original article
Source Code
Picovoice React Native SDK
