Dilek Karasoy for Picovoice

Voice Enabled Motion Detection Camera with Raspberry Pi

The tutorial for day 21 is a voice-enabled motion detection camera with Raspberry Pi.

We'll be using a Raspberry Pi 3 and a small USB microphone, but you can choose another board: Picovoice supports other Raspberry Pi variants as well.

Let's configure the microphone first. List the available input audio devices:

arecord -L | grep plughw

The output should be similar to the following:

plughw:CARD=PCH,DEV=0

Copy this line and create a .asoundrc file in your home folder with these options:

pcm.!default {
   type asym
   capture.pcm "mic"
}
pcm.mic {
   type plug
   slave {
      pcm "${INPUT_AUDIO_DEVICE}"
   }
}

Replace ${INPUT_AUDIO_DEVICE} with what you copied earlier. You may need to reboot the system for these settings to take effect.
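
For example, assuming your device is the plughw:CARD=PCH,DEV=0 entry shown above, the finished .asoundrc would look like this:

pcm.!default {
   type asym
   capture.pcm "mic"
}
pcm.mic {
   type plug
   slave {
      pcm "plughw:CARD=PCH,DEV=0"
   }
}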

After that, grab the Picovoice Python package:

sudo pip3 install picovoice picovoicedemo

The picovoicedemo package lets us test our models rapidly. The package requires a valid AccessKey. You can get your AccessKey for free from the Picovoice Console.

We'll also design a voice interface using Picovoice Console.

For this tutorial we'll focus on three main command types (intents):

  1. Turning cameras on/off.
  2. Cleaning old files to make space for new pictures.
  3. Emailing the log files.

Creating a simple VUI on Picovoice Console

In order to design our Voice User Interface (VUI), we use Rhino Speech-to-Intent for custom voice commands. You can start with the YAML file below and then enrich or change it as you wish. Use the Import YAML functionality on Picovoice Console if you decide to use it.

context:
  expressions:
    changeCameraState:
      - "[switch, turn] $state:state (all, the) cameras"
      - "[switch, turn] (all, the) cameras $state:state"
      - "[switch, turn] $state:state (the) $location:location (camera, cameras)"
      - "[switch, turn] (the) $location:location [camera, cameras] $state:state"
      - "[switch, turn] $state:state (the) [light, lights] [at, in] (the)
        $location:location"
      - "[switch, turn] (the) [light, lights] [in, at] the $location:location
        $state:state"
    cleanCameraHistory:
      - "[delete, remove, clean] all (the) [files, videos] older than
        $pv.TwoDigitInteger:day [day, days]"
      - "[delete, remove, clean] all (the) [files, videos] older than
        $pv.TwoDigitInteger:month [month, months]"
    emailLog:
      - "[email, mail] (me) (all) the [log, logs]"
      - "[email, mail] (me) a report"
  slots:
    state:
      - off
      - on
    location:
      - garage
      - entrance
      - front door
      - back door
      - driveway
      - yard
      - stairway
      - hallway
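
To illustrate what this context produces at runtime: a command such as “Computer, turn on the driveway camera” should yield an inference roughly like the following (the field names match the Python SDK; the exact rendering here is illustrative):

is_understood: True
intent: 'changeCameraState'
slots: {'state': 'on', 'location': 'driveway'}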

Train, download and extract the model into your home folder. We used “Computer” as the wake word; however, you can train another one with Porcupine Wake Word Detection on Picovoice Console.
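
Before writing any code, you can sanity-check the models with the demo package installed earlier. The file names below are assumptions; point the flags at wherever you extracted your .ppn and .rhn files, and run picovoice_demo_mic --help for the full list of options:

picovoice_demo_mic \
    --access_key ${ACCESS_KEY} \
    --keyword_path ~/computer_raspberry-pi.ppn \
    --context_path ~/smart_camera_raspberry-pi.rhn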

The next step is integrating the voice interface into the existing motion detection camera project. Picovoice eases this step by offering a Python SDK. We only need to implement the wake_word_callback and inference_callback functions based on the context model’s intents:

from picovoice import Picovoice

keyword_path = ...  # path to the Porcupine wake word file (.ppn)
context_path = ...  # path to the Rhino context file (.rhn)

def wake_word_callback():
    # invoked when the wake word ("Computer") is detected
    pass

def inference_callback(inference):
    # `inference` exposes three immutable fields:
    # (1) `is_understood`
    # (2) `intent`
    # (3) `slots`
    pass

handle = Picovoice(
        access_key=${ACCESS_KEY},
        keyword_path=keyword_path,
        wake_word_callback=wake_word_callback,
        context_path=context_path,
        inference_callback=inference_callback)

while True:
    # get_next_audio_frame() stands in for your audio capture code; each
    # frame must contain handle.frame_length 16-bit PCM samples.
    handle.process(get_next_audio_frame())

You just need to replace ${ACCESS_KEY} with your own AccessKey and set the paths to the speech models, whether you saved them in your Downloads or Desktop folder.
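
To wire the voice interface into the camera logic, branch on the intent and slots defined in the context. Here is a minimal sketch of inference_callback; turn_cameras, clean_history, and email_logs are hypothetical helpers standing in for your existing motion-camera code:

def inference_callback(inference):
    if not inference.is_understood:
        return

    if inference.intent == 'changeCameraState':
        # e.g. "switch off all the cameras", "turn on the garage camera"
        state = inference.slots['state']            # 'on' or 'off'
        location = inference.slots.get('location')  # no slot means all cameras
        turn_cameras(state, location)
    elif inference.intent == 'cleanCameraHistory':
        # e.g. "delete all files older than ten days"
        if 'day' in inference.slots:
            days = int(inference.slots['day'])
        else:
            days = 30 * int(inference.slots['month'])
        clean_history(days)
    elif inference.intent == 'emailLog':
        # e.g. "email me the logs"
        email_logs()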

For more detailed information, you can refer to the Python API documentation.
