Glos Code

Posted on Jul 15, 2023 • Edited on Jul 18, 2023

How to use Whisper AI (using Google Colab)

#whisper #ai #googlecolab #speechrecognition

What is Whisper AI?

Whisper is an automatic speech recognition (ASR) system trained on 680,000 hours of multilingual and multitask supervised data collected from the web. It was created by OpenAI, the company behind ChatGPT and DALL-E. Whisper is a multitask model: it can transcribe speech from audio files into text, and it can also translate speech in other languages into English. Although it is still in development, it has the potential to be an effective tool for many applications.

What is Google Colab?

Google Colab is a free, cloud-based Jupyter Notebook environment that runs Python code online and doesn't need to be installed. Colab provides a number of features, such as:

The ability to run Python code in a web browser. This means you can write and run Python programs in Colab without installing any software on your computer.
Access to Google's cloud computing and storage resources. This means long-running, resource-intensive Python programs do not depend on your own computer's hardware.
The ability to share and collaborate on projects. You can share your Colab notebooks with other users and work on them together in real time.

Why Google Colab?

For Whisper or other Python projects, you may prefer to use Google Colab rather than your personal computer for a number of reasons:

Google Colab is free to use, unlike buying and maintaining your own machine.
Google Colab provides access to powerful GPUs that speed up Python projects, including machine learning.
Google Colab is accessible from anywhere because it is cloud-based.
Google Colab lets you collaborate with others on shared notebooks.

Google Colab does have some possible disadvantages, though:

Google Colab occasionally runs slowly, especially at peak usage times.
Storage capacity in Google Colab is limited.

Step-By-Step Guide

Setup

The following command will download and install the latest version of Whisper directly from its GitHub repository, along with its Python dependencies:

!pip install git+https://github.com/openai/whisper.git

To update an existing installation to the latest version of the repository, run:

!pip install --upgrade --no-deps --force-reinstall git+https://github.com/openai/whisper.git

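Whisper also requires the ffmpeg command-line tool, which is available from most package managers. On Colab, which is Ubuntu-based, install it with apt (the -y flag confirms the installation so the cell does not stop to wait for input):

!sudo apt update && sudo apt install -y ffmpeg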
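Before moving on, it can help to confirm that ffmpeg is actually visible to the notebook. The following is a minimal sketch using only the Python standard library; the helper names `ffmpeg_path` and `ffmpeg_version_line` are made up for this example.

```python
import shutil
import subprocess


def ffmpeg_path():
    """Return the full path to the ffmpeg executable, or None if it is not on PATH."""
    return shutil.which("ffmpeg")


def ffmpeg_version_line():
    """Return the first line printed by `ffmpeg -version`, or None if ffmpeg is missing."""
    path = ffmpeg_path()
    if path is None:
        return None
    completed = subprocess.run(
        [path, "-version"], capture_output=True, text=True, check=False
    )
    lines = completed.stdout.splitlines()
    return lines[0] if lines else None


# Print either the version banner or a hint to run the apt command above.
print(ffmpeg_version_line() or "ffmpeg not found; run the apt command above")
```

If the cell prints a version banner, Whisper should be able to find ffmpeg when it decodes audio files.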

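Usage (Command-line based)

Whisper comes in several model sizes that trade speed for accuracy. Relative speed is measured against the large model: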

Size	Parameters	English-only model	Multilingual model	Required VRAM	Relative speed
tiny	39 M	`tiny.en`	`tiny`	~1 GB	~32x
base	74 M	`base.en`	`base`	~1 GB	~16x
small	244 M	`small.en`	`small`	~2 GB	~6x
medium	769 M	`medium.en`	`medium`	~5 GB	~2x
large	1550 M	N/A	`large`	~10 GB	1x

Recommended: medium
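If you are unsure which size to use, the VRAM column of the table can be turned into a small lookup. This is only a sketch: the numbers are the approximate figures from the table above, `largest_model_that_fits` is a hypothetical helper name, and the amount of VRAM available is something you have to supply yourself.

```python
# (model name, approximate required VRAM in GB), copied from the table above,
# ordered from smallest to largest model.
MODELS = [
    ("tiny", 1),
    ("base", 1),
    ("small", 2),
    ("medium", 5),
    ("large", 10),
]


def largest_model_that_fits(vram_gb):
    """Return the largest model whose approximate VRAM need is <= vram_gb, or None."""
    fitting = [name for name, needed_gb in MODELS if needed_gb <= vram_gb]
    return fitting[-1] if fitting else None


# Example: a GPU with about 6 GB of free VRAM.
print(largest_model_that_fits(6))
```

Because the table values are approximate, treat the result as a starting point and drop down one size if you run out of memory.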

The following command will transcribe speech in audio files, using the medium model:

!whisper "[Add your audio file, Example: english.wav]" --model medium
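The same transcription can also be done from Python inside the notebook, which is handy when you want the text in a variable instead of output files. The sketch below assumes the whisper package installed in the Setup step and uses its `load_model` and `transcribe` functions; `transcribe_file` is a hypothetical helper name, and english.wav is just the example file name from the command above.

```python
def transcribe_file(audio_path, model_name="medium", model=None):
    """Transcribe one audio file and return the recognized text.

    Pass an already loaded model via `model` to avoid reloading it for every file.
    """
    if model is None:
        # Imported here so the helper can be defined before Whisper is installed.
        import whisper

        model = whisper.load_model(model_name)
    result = model.transcribe(audio_path)
    # The result is a dictionary; the full transcript is stored under "text".
    return result["text"].strip()


# Usage in Colab, assuming english.wav has been uploaded to the session:
# print(transcribe_file("english.wav"))
```

Loading a model takes a while, so when transcribing several files it is usually worth loading the model once and passing it in through the `model` argument.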

The default setting (which selects the small model) works well for transcribing English. To transcribe an audio file containing non-English speech, you can specify the language using the --language option:

!whisper "[Add your language-specific audio file, Example: japanese.wav]" --language [Add language, Example: Japanese]

Adding --task translate will translate the speech into English:

!whisper "[Add your language-specific audio file, Example: japanese.wav]" --language [Add language, Example: Japanese] --task translate
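When you have many files, it can be easier to assemble these commands in Python than to edit each cell by hand. The following sketch builds the same command lines shown above using only the `--model`, `--language` and `--task` options from this guide; `build_whisper_command` and `run_whisper` are hypothetical helper names, and japanese.wav is the example file name used above.

```python
import subprocess


def build_whisper_command(audio_path, model=None, language=None, task=None):
    """Build the argument list for the whisper command-line tool."""
    command = ["whisper", audio_path]
    if model:
        command += ["--model", model]
    if language:
        command += ["--language", language]
    if task:
        command += ["--task", task]
    return command


def run_whisper(audio_path, **options):
    """Run whisper on one file; raises CalledProcessError if the command fails."""
    return subprocess.run(build_whisper_command(audio_path, **options), check=True)


# Show the command that would translate Japanese speech into English.
print(build_whisper_command("japanese.wav", language="Japanese", task="translate"))
```

Passing the arguments as a list rather than one long string means file names containing spaces do not need extra quoting.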

Run the following to view all available options:

!whisper --help

Outro

Thank you for reading this post. I hope you found it useful. If you have any questions or comments, please leave them below; I'd be delighted to hear from you.
If you liked this article, please share it with your followers and friends.
Once more, thanks for reading! I appreciate your support.
