The tutorial for day 21 is a voice-enabled motion detection camera with a Raspberry Pi.
We'll be using a Raspberry Pi 3 and a small USB microphone. However, you can choose another board; Picovoice supports other Raspberry Pi variants as well.
Let's configure the microphone first. List the available audio input devices:
arecord -L | grep plughw
The output should be similar to the following:
plughw:CARD=PCH,DEV=0
Copy this line and create a .asoundrc file in your home folder with these options:
pcm.!default {
    type asym
    capture.pcm "mic"
}

pcm.mic {
    type plug
    slave {
        pcm "${INPUT_AUDIO_DEVICE}"
    }
}
Replace ${INPUT_AUDIO_DEVICE} with the device name you copied earlier. You may need to reboot the system for these settings to take effect.
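For example, assuming the sample output shown above, the slave section would read:

slave {
    pcm "plughw:CARD=PCH,DEV=0"
}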
After that, grab the Picovoice Python package:
sudo pip3 install picovoice picovoicedemo
The picovoicedemo package lets us test our models quickly. The package requires a valid AccessKey, which you can get for free from the Picovoice Console.
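Once you have trained and downloaded the models (covered below), you can smoke-test them from the terminal. This assumes the picovoice_demo_mic entry point installed by picovoicedemo; substitute your own AccessKey and model paths:

picovoice_demo_mic \
    --access_key ${ACCESS_KEY} \
    --keyword_path ${KEYWORD_FILE_PATH} \
    --context_path ${CONTEXT_FILE_PATH}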
We'll also design a voice interface using Picovoice Console.
For this tutorial, we'll focus on three main command types (intents):
- Turning cameras on/off.
- Cleaning old files to make space for new pictures.
- Emailing the log files.
To design our Voice User Interface (VUI), we'll use Rhino Speech-to-Intent for custom voice commands. You can start with the YAML file below and then enrich or change it as you wish; use the Import YAML functionality on Picovoice Console if you do.
context:
  expressions:
    changeCameraState:
      - "[switch, turn] $state:state (all, the) cameras"
      - "[switch, turn] (all, the) cameras $state:state"
      - "[switch, turn] $state:state (the) $location:location (camera, cameras)"
      - "[switch, turn] (the) $location:location [camera, cameras] $state:state"
      - "[switch, turn] $state:state (the) [camera, cameras] [at, in] (the) $location:location"
      - "[switch, turn] (the) [camera, cameras] [in, at] the $location:location $state:state"
    cleanCameraHistory:
      - "[delete, remove, clean] all (the) [files, videos] older than $pv.TwoDigitInteger:day [day, days]"
      - "[delete, remove, clean] all (the) [files, videos] older than $pv.TwoDigitInteger:month [month, months]"
    emailLog:
      - "[email, mail] (me) (all) the [log, logs]"
      - "[email, mail] (me) a report"
  slots:
    state:
      - "off"
      - "on"
    location:
      - garage
      - entrance
      - front door
      - back door
      - driveway
      - yard
      - stairway
      - hallway
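As a quick sanity check of the grammar above, an utterance such as "turn off the garage camera" should map to the changeCameraState intent along these lines (an illustrative mapping, not literal Console output):

intent: changeCameraState
slots:
  state: off
  location: garage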
Train the model, then download and extract it into your home folder. We used “Computer” as the wake word; however, you can train another one with Porcupine Wake Word Detection on Picovoice Console.
The next step is integrating the voice interface into the existing motion detection camera project. Picovoice eases this step by offering a Python SDK. We only need to implement the wake_word_callback and inference_callback functions based on the context model's intents:
from picovoice import Picovoice

access_key = ...  # AccessKey from Picovoice Console
keyword_path = ...  # path to the Porcupine wake word model (.ppn)

def wake_word_callback():
    # invoked when the wake word ("Computer") is detected
    pass

context_path = ...  # path to the Rhino context model (.rhn)

def inference_callback(inference):
    # `inference` exposes three immutable fields:
    # (1) `is_understood`
    # (2) `intent`
    # (3) `slots`
    pass

handle = Picovoice(
    access_key=access_key,
    keyword_path=keyword_path,
    wake_word_callback=wake_word_callback,
    context_path=context_path,
    inference_callback=inference_callback)

# `get_next_audio_frame()` is a placeholder for your audio source
while True:
    handle.process(get_next_audio_frame())
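To give an idea of what a filled-in inference_callback could look like, here is one possible dispatch over the three intents from the context above. The helpers set_camera_state, clean_history, and email_log are hypothetical stand-ins for the existing project's camera, cleanup, and email logic, and the audio loop uses the pvrecorder package (sudo pip3 install pvrecorder) as one way to implement frame capture:

from pvrecorder import PvRecorder  # assumption: pvrecorder handles audio capture

def inference_callback(inference):
    if not inference.is_understood:
        # the command didn't match the context; ignore it
        return
    if inference.intent == 'changeCameraState':
        # e.g. "turn off the garage camera" -> slots {'state': 'off', 'location': 'garage'};
        # `location` is absent for commands like "turn off all cameras"
        set_camera_state(  # hypothetical helper
            on=(inference.slots['state'] == 'on'),
            location=inference.slots.get('location'))
    elif inference.intent == 'cleanCameraHistory':
        # assumes the built-in slot value arrives as a numeric string
        if 'day' in inference.slots:
            days = int(inference.slots['day'])
        else:
            days = 30 * int(inference.slots['month'])  # rough month-to-day conversion
        clean_history(older_than_days=days)  # hypothetical helper
    elif inference.intent == 'emailLog':
        email_log()  # hypothetical helper

recorder = PvRecorder(frame_length=handle.frame_length, device_index=-1)
recorder.start()
while True:
    handle.process(recorder.read())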
In either case, you just need to replace the access key and set the paths to the speech models according to where you saved them (your downloads or desktop folder, for example).
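For example, assuming the models were extracted into the home folder (the file names below are illustrative; use the names of the files you actually downloaded):

access_key = '${ACCESS_KEY}'  # your AccessKey from Picovoice Console
keyword_path = '/home/pi/computer_raspberry-pi.ppn'  # hypothetical file name
context_path = '/home/pi/security-camera_raspberry-pi.rhn'  # hypothetical file name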
For more detailed information, you can refer to the Python API documentation.