Julien Simon

Originally published at julsimon.Medium

Talk @ 1729: “Audio classification with Hugging Face transformers”

In this video, I show how to fine-tune a state-of-the-art Conformer model for audio keyword classification, and build a Gradio Space to showcase it. I also quickly test the model with distorted audio to see how resilient it is.

Dataset: https://huggingface.co/datasets/speech_commands

Base model: https://huggingface.co/facebook/wav2vec2-conformer-rel-pos-large

Fine-tuned model: https://huggingface.co/juliensimon/wav2vec2-conformer-rel-pos-large-finetuned-speech-commands

Space: https://huggingface.co/spaces/juliensimon/keyword-spotting

Notebook: https://gitlab.com/juliensimon/huggingface-demos/-/tree/main/keyword-spotting
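
For readers who want to try the resilience test without opening the notebook, here is a minimal sketch. It is not the code from the video: the helper names `add_white_noise` and `classify` are hypothetical, and white Gaussian noise is only one assumed form of distortion. It also assumes 16 kHz mono audio held as a list of floats, and that `numpy`, `transformers`, and a PyTorch backend are installed before `classify` is called.

```python
import math
import random


def add_white_noise(samples, snr_db, seed=0):
    """Return a copy of `samples` with Gaussian noise mixed in at roughly `snr_db` dB SNR.

    Assumption: white noise stands in for "distorted audio"; the video may
    use a different kind of distortion.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    rng = random.Random(seed)  # seeded so runs are repeatable
    signal_power = sum(s * s for s in samples) / len(samples)
    # SNR (dB) = 10 * log10(signal_power / noise_power), solved for noise_power
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [s + rng.gauss(0.0, sigma) for s in samples]


def classify(samples, sampling_rate=16000, top_k=3):
    """Score one clip with the fine-tuned model from the Hugging Face hub.

    Heavy dependencies are imported lazily so that the noise helper above
    stays usable without them. The model is downloaded on first use, and the
    pipeline is rebuilt on every call, which is fine for a quick experiment
    but wasteful in a loop.
    """
    import numpy as np
    from transformers import pipeline

    classifier = pipeline(
        "audio-classification",
        model="juliensimon/wav2vec2-conformer-rel-pos-large-finetuned-speech-commands",
    )
    audio = {
        "raw": np.asarray(samples, dtype=np.float32),
        "sampling_rate": sampling_rate,
    }
    # Returns a list of {"label": ..., "score": ...} dicts, best match first
    return classifier(audio, top_k=top_k)
```

Calling `classify(clip)` and `classify(add_white_noise(clip, snr_db=10))` on the same keyword recording, then lowering `snr_db` step by step, gives a rough picture of when the top label starts to change.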
