Dilek Karasoy for Picovoice

Live Audio Transcription with Python For Free

Unlike cloud speech-to-text APIs, Cheetah Streaming Speech-to-Text processes speech data locally, on-device. That gives it a built-in speed advantage: a cloud speech-to-text API can be fast, but it can never eliminate network latency.

Let's learn to convert live audio to text using the Picovoice Cheetah Streaming Speech-to-Text Python SDK, so you can see it for yourself.


1. Install the Cheetah Streaming Speech-to-Text Python SDK:

pip install pvcheetah

2. Grab your AccessKey from **Picovoice Console**
If you do not have a Picovoice Console account yet, you can create one in minutes. No credit card is required, and the Forever-Free Plan is, as the name suggests, free forever!

3. Import the Cheetah Streaming Speech-to-Text package:

import pvcheetah

4. Create an instance of the speech-to-text object with your AccessKey:

handle = pvcheetah.create(access_key)

Don't forget to replace the `access_key` placeholder with your own AccessKey from Picovoice Console.
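If you would rather not paste the AccessKey into your source file, one option is to read it from an environment variable. The snippet below is a minimal sketch of that idea; the helper `load_access_key` and the variable name `PICOVOICE_ACCESS_KEY` are assumptions made for this example, not part of the SDK.

```python
import os


def load_access_key(env_var="PICOVOICE_ACCESS_KEY"):
    """Return the AccessKey stored in an environment variable.

    The variable name is a hypothetical choice for this sketch,
    not something the SDK requires.
    """
    access_key = os.environ.get(env_var, "").strip()
    if not access_key:
        raise RuntimeError(f"Set {env_var} to your Picovoice AccessKey first.")
    return access_key


# Usage (requires pvcheetah and a valid AccessKey):
#   import pvcheetah
#   handle = pvcheetah.create(access_key=load_access_key())
```

Failing early with a clear message tends to be easier to debug than passing an empty key to the engine.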

5. Implement audio recording.
Cheetah Streaming Speech-to-Text processes audio the same way whether it comes from a microphone or from another program.

For the rest of this tutorial, we assume a function that returns the next available audio chunk (frame):

def get_next_audio_frame():
    pass
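To try things out without a microphone, the frames can come from a recording instead. The following is a rough sketch using only the standard library: it cuts a single-channel, 16-bit WAV file into fixed-size frames. The names `wav_frames` and `make_frame_reader` are made up for this example, and the default frame length of 512 is only a placeholder; with a real engine you would pass `handle.frame_length` and make sure the file's sample rate matches what the engine expects.

```python
import struct
import wave


def wav_frames(path, frame_length=512):
    """Yield lists of `frame_length` 16-bit samples from a mono WAV file.

    512 is a placeholder; with a real engine, pass handle.frame_length.
    A trailing partial frame is dropped, since a fixed frame size is expected.
    """
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("expected single-channel, 16-bit PCM audio")
        while True:
            raw = wav.readframes(frame_length)
            if len(raw) < frame_length * 2:  # 2 bytes per sample
                break
            yield list(struct.unpack(f"<{frame_length}h", raw))


def make_frame_reader(frames):
    """Wrap an iterable so it can be called like get_next_audio_frame()."""
    iterator = iter(frames)
    return lambda: next(iterator)
```

With these helpers, `get_next_audio_frame = make_frame_reader(wav_frames("speech.wav"))` would stand in for the stub above (the file name is illustrative). Note that the reader raises `StopIteration` once the file is exhausted, so a `while True` loop around it needs to handle that.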

Convert live audio to text:

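while True:
    partial_transcript, is_endpoint = handle.process(get_next_audio_frame())
    # Partial text arrives piece by piece; print it on the same line.
    print(partial_transcript, end="", flush=True)
    if is_endpoint:
        # An endpoint marks a pause in speech; flush returns the remaining text.
        final_transcript = handle.flush()
        print(final_transcript)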
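To make the roles of `process` and `flush` a bit more concrete, here is a small sketch that gathers the partial transcripts into complete utterances. It runs against a stand-in object rather than the real engine: `transcribe` and `FakeEngine` are hypothetical names for this example, and `FakeEngine` only mimics the `process`/`flush` shape used above.

```python
def transcribe(engine, frames):
    """Feed frames to a Cheetah-like engine and return finished utterances."""
    utterances = []
    current = ""
    for frame in frames:
        partial_transcript, is_endpoint = engine.process(frame)
        current += partial_transcript
        if is_endpoint:
            # Endpoint reached: flush returns any text still buffered.
            current += engine.flush()
            utterances.append(current)
            current = ""
    # The stream may end without an endpoint, so flush once more.
    current += engine.flush()
    if current:
        utterances.append(current)
    return utterances


class FakeEngine:
    """Hypothetical stand-in that replays scripted (text, is_endpoint) pairs."""

    def __init__(self, script):
        self._script = iter(script)

    def process(self, frame):
        return next(self._script)

    def flush(self):
        return ""


script = [("hello ", False), ("world", True), ("bye", False)]
print(transcribe(FakeEngine(script), frames=[None, None, None]))
```

Swapping `FakeEngine` for the real `handle` and `frames` for real audio frames should leave the function unchanged, as long as the engine exposes the same two methods.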

That's it! In 5 simple steps, you can get live audio converted into text!


For more information, you can check
