Unlike cloud speech-to-text APIs, Cheetah Streaming Speech-to-Text processes speech data locally, on the device. That gives it a built-in speed advantage: no cloud speech-to-text API can eliminate network latency, so a cloud service can only be fast up to a point.
Let's convert live audio to text with the Picovoice Cheetah Streaming Speech-to-Text Python SDK, so you can see it for yourself.
1. Install the Cheetah Streaming Speech-to-Text Python SDK:
pip install pvcheetah
2. Grab your AccessKey from **Picovoice Console**.
If you don't have a Picovoice Console account yet, you can create one in minutes. No credit card is required, and the Forever-Free Plan lasts, as the name suggests, forever!
3. Import the Cheetah Streaming Speech-to-Text package:
import pvcheetah
4. Create an instance of the speech-to-text object with your AccessKey:
handle = pvcheetah.create(access_key)
Don't forget to replace the access_key placeholder with your own AccessKey.
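One way to keep the key out of the source code is to read it from an environment variable. The sketch below assumes a variable named `PICOVOICE_ACCESS_KEY`; that name and the `load_access_key` helper are made up for this example and are not something the SDK looks for on its own.

```python
import os


def load_access_key(env_var="PICOVOICE_ACCESS_KEY"):
    """Return the AccessKey stored in an environment variable.

    The variable name is a hypothetical convention for this sketch;
    the SDK itself does not read it.
    """
    access_key = os.environ.get(env_var, "").strip()
    if not access_key:
        raise RuntimeError(f"Set {env_var} to your Picovoice AccessKey")
    return access_key
```

The returned string can then be passed to `pvcheetah.create` in place of the placeholder.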
5. Implement audio recording.
Cheetah Streaming Speech-to-Text processes audio whether it comes from a microphone or another program.
For the rest of this walkthrough, assume there is a function that returns the next available chunk (frame) of audio:
def get_next_audio_frame():
    pass
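If no microphone is at hand, one possible stand-in for that function is a generator that reads fixed-size frames from a WAV file using only the standard library. This is a sketch under assumptions: the `wav_frames` name is hypothetical, the file is assumed to be 16-bit mono PCM at the sample rate the engine expects, and the default of 512 samples per frame should be replaced with whatever the engine reports (for example `handle.frame_length`).

```python
import struct
import wave


def wav_frames(path, frame_length=512):
    """Yield fixed-size frames of 16-bit samples from a mono WAV file.

    frame_length=512 is an assumed default; in practice use the value
    the engine reports (e.g. handle.frame_length).
    """
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit mono PCM audio")
        while True:
            raw = wav.readframes(frame_length)
            if len(raw) < frame_length * 2:
                break  # drop the trailing partial frame
            # Little-endian signed 16-bit samples, one list per frame.
            yield list(struct.unpack(f"<{frame_length}h", raw))
```

Because it is a generator, the frames can be passed straight to a `for` loop, or wrapped so that each call to `get_next_audio_frame()` returns `next(frames)`.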
Convert live audio to text:
while True:
    partial_transcript, is_endpoint = handle.process(get_next_audio_frame())
    if is_endpoint:
        final_transcript = handle.flush()
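The loop above discards the partial transcripts and never stops. A minimal sketch of one way to collect the text per utterance follows; the `transcribe` helper is hypothetical and works with any object that exposes `process` and `flush` in the shape used above, over a finite sequence of frames.

```python
def transcribe(engine, frames):
    """Feed frames to a streaming engine and return the text per utterance.

    `engine` is any object with process(frame) -> (text, is_endpoint)
    and flush() -> text, the shape of the calls used above.
    """
    utterances = []
    current = ""
    for frame in frames:
        partial, is_endpoint = engine.process(frame)
        current += partial
        if is_endpoint:
            # The speaker paused: collect the remaining text and start over.
            current += engine.flush()
            utterances.append(current)
            current = ""
    # Flush whatever is still buffered when the audio ends.
    current += engine.flush()
    if current:
        utterances.append(current)
    return utterances
```

With the real SDK this would be called as `transcribe(handle, frames)`. When the engine is no longer needed, `handle.delete()` releases its resources.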
That's it! In 5 simple steps, you can get live audio converted into text!
For more information, you can check