Are you struggling to create a model that can detect the keyword "R2D2"? I faced a similar challenge until I came across a helpful video on Training Models with Synthetic Data: End-to-End Keyword Spotting Walkthrough with Edge Impulse.
However, I encountered some issues while following the video tutorial and had to troubleshoot them.
In this article, I will share the steps I took to overcome these obstacles, in case you run into the same problems.
Here's a quick summary of the issues I faced:
- a hidden-files warning from JupyterLab: "Refusing to serve hidden file, via 404 Error, use flag 'ContentsManager.allow_hidden' to enable"
- ffmpeg and ffprobe not recognised
- DefaultCredentialsError when the notebook calls the Google Text-to-Speech API
Some pre-requisites
First, ensure that you have node installed. If not, you can follow this tutorial to install it.
If npm is not recognised, run source ~/.bash_profile to load it into the current shell.
Next, install Python using pyenv by following that tutorial. Once pyenv is installed, you can install version 3.10.8 by running:
pyenv install -v 3.10.8
Setting up the Environment
Clone the Edge Impulse notebooks repository by running:
git clone https://github.com/edgeimpulse/notebooks.git
Navigate to the notebooks folder inside it:
cd notebooks/notebooks/
Create a virtual environment for Python 3.10.8:
pyenv virtualenv 3.10.8 edgeimpulsenotebooks
Activate the new virtual environment for this folder (pyenv local writes a .python-version file here, so the environment is picked up automatically whenever you work in this directory):
pyenv local edgeimpulsenotebooks
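To confirm the right interpreter is actually active, a quick check from a Python prompt inside notebooks/notebooks/ (it should report 3.10.8 and a path containing edgeimpulsenotebooks):

# run with the virtualenv active
import sys
print(sys.version)      # expect 3.10.8
print(sys.executable)   # expect a path inside the edgeimpulsenotebooks virtualenv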
Install the required dependencies:
pip install requests
pip install pydub
pip install google-cloud-texttospeech
pip install jupyterlab
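Before opening the notebook, it can save time to confirm the packages import cleanly. A minimal sanity check:

# quick check that the notebook's dependencies resolve in this environment
import requests
import pydub
from google.cloud import texttospeech

print("requests", requests.__version__)
print("pydub and google-cloud-texttospeech import OK")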
When I ran the notebook, I encountered issues with ffmpeg and ffprobe not being found. To resolve this, I followed the instructions outlined here.
pip install ffmpeg-downloader
ffdl install --add-path
After the installation completes, you will be prompted to make sure that ffmpeg is on your PATH. Copy the line it prints into your ~/.zsh_profile (or ~/.bash_profile if you're using that instead) and save the file. Mine looked like this; your path will differ:
export PATH="/Users/dan.benitah/Library/Application Support/ffmpeg-downloader/ffmpeg:${PATH}"
To ensure that ffmpeg is loaded, run:
source ~/.zsh_profile
... and test that both commands are now recognised:
ffmpeg
ffprobe
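pydub is the package that shells out to ffmpeg and ffprobe, so you can also verify the discovery from Python itself. pydub.utils.which returns the resolved path, or None when the binary is missing from PATH:

from pydub.utils import which

# both should print a real path; None means the PATH export above didn't take effect
print("ffmpeg: ", which("ffmpeg"))
print("ffprobe:", which("ffprobe"))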
NOTE: I found it useful to test my commands directly in JupyterLab's terminal window, to see more clearly which command was not recognised or was erroring.
One last thing before we can launch the lab: I kept getting a warning that hidden files were not accessible, and passing --ContentsManager.allow_hidden=True seemed to solve it.
You should now be able to launch the Jupyter Lab environment from the shell where all these commands are available (npm, ffmpeg, ffprobe...):
jupyter-lab --ContentsManager.allow_hidden=True
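If you'd rather not pass the flag on every launch, the same setting can live in your Jupyter config instead. A sketch, assuming you generate the config file first with jupyter server --generate-config:

# ~/.jupyter/jupyter_server_config.py
c.ContentsManager.allow_hidden = True  # same effect as the command-line flag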
Finally, to follow along with the Google instructions from the video, enable the API in the Google Cloud Console at https://console.cloud.google.com/speech/text-to-speech. Note that enabling the Text-to-Speech API alone might not be sufficient: in my case, I also needed to enable another API before it worked (it may have been a combination of things).
To check whether your APIs are enabled, look for confirmation notifications in the console's notification bar.
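As for the DefaultCredentialsError from my list above: the Text-to-Speech client looks for Application Default Credentials, so either run gcloud auth application-default login or point GOOGLE_APPLICATION_CREDENTIALS at a service-account key. Here is a minimal sketch that exercises the whole chain; the key path and output filename are placeholders, and the voice settings are just the defaults from Google's quickstart:

import os

# placeholder path: point this at your own service-account JSON key
os.environ.setdefault("GOOGLE_APPLICATION_CREDENTIALS", "/path/to/service-account-key.json")

from google.cloud import texttospeech

# this is the call that raises DefaultCredentialsError when no credentials are found
client = texttospeech.TextToSpeechClient()

response = client.synthesize_speech(
    input=texttospeech.SynthesisInput(text="R2D2"),
    voice=texttospeech.VoiceSelectionParams(language_code="en-US"),
    audio_config=texttospeech.AudioConfig(
        audio_encoding=texttospeech.AudioEncoding.LINEAR16  # LINEAR16 output includes a WAV header
    ),
)

with open("r2d2_sample.wav", "wb") as f:
    f.write(response.audio_content)
print("wrote r2d2_sample.wav")

If this script writes a playable WAV file, both your credentials and the API enablement are working, and the notebook's synthesis cells should run too.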
Once you've completed these steps, you should be able to resume watching the video and follow along with the instructions. If you encounter any issues or have any questions, feel free to leave a comment below.
Next:
Now that the model detects when I say "R2D2" (yes, that was my word :) ), these are the next steps I will be looking at:
- embed the library into a new project to act on detected keywords
- iterate on the model to improve its detection