Day 3: How to add subtitles to YouTube videos with Python

#100daysofcode #python #challenge #beginners

You can add subtitles to your videos with automatic captioning in YouTube Studio. However, the automatic captions might not be accurate enough, or you may just want to build your own tool.

Here's how:
1. Extract Audio
First, extract the audio from your video content. You can accomplish this using a tool such as FFmpeg.
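
As a minimal sketch of this step, the snippet below calls FFmpeg from Python with `subprocess`. It assumes the `ffmpeg` binary is installed and on your PATH. The function names and the choice of 16 kHz mono WAV output are illustrative assumptions, not requirements stated in this article.

```python
import subprocess
from typing import List


def build_ffmpeg_command(video_path: str, audio_path: str) -> List[str]:
    """Build an FFmpeg command that writes only the audio track of a video."""
    return [
        "ffmpeg",
        "-y",              # overwrite the output file if it already exists
        "-i", video_path,  # input video
        "-vn",             # drop the video stream
        "-ac", "1",        # downmix to mono (assumed, not required)
        "-ar", "16000",    # resample to 16 kHz (assumed, not required)
        audio_path,
    ]


def extract_audio(video_path: str, audio_path: str) -> None:
    """Run FFmpeg and raise CalledProcessError if it exits with an error."""
    subprocess.run(build_ffmpeg_command(video_path, audio_path), check=True)
```

Calling `extract_audio("video.mp4", "audio.wav")` would then produce the audio file used in the following steps.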

2. Install Speech Recognition SDK
Install Leopard STT Python SDK:

pip install "pvleopard>=1.1"

import pvleopard

# access_key is your Picovoice AccessKey (obtained from Picovoice Console)
leopard = pvleopard.create(access_key=access_key)

3. Transcribe Audio to Text

transcript, words = leopard.process_file(audio_path)

Leopard returns the transcript as a str, along with word-level metadata that includes timestamps and confidence:

[
    {
        "word": "it's",
        "start_sec": 8.58,
        "end_sec": 8.70,
        "confidence": 0.78
    },
    {
        "word": "important",
        "start_sec": 8.77,
        "end_sec": 9.12,
        "confidence": 0.99
    },
    ...
]
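
One use of the per-word confidence is to flag words worth reviewing by hand before publishing the subtitles. The sketch below uses a stand-in `Word` named tuple with the same fields as the metadata shown above, so it runs without an AccessKey. The `low_confidence_words` helper and the 0.8 threshold are hypothetical choices, not part of the Leopard SDK.

```python
from typing import List, NamedTuple, Sequence


class Word(NamedTuple):
    # Stand-in for the word metadata shown above (not the real pvleopard class).
    word: str
    start_sec: float
    end_sec: float
    confidence: float


def low_confidence_words(words: Sequence[Word], threshold: float = 0.8) -> List[Word]:
    """Return the words whose confidence falls below the threshold."""
    return [w for w in words if w.confidence < threshold]


sample = [
    Word("it's", 8.58, 8.70, 0.78),
    Word("important", 8.77, 9.12, 0.99),
]
flagged = low_confidence_words(sample)
for w in flagged:
    print("%s at %.2fs (confidence %.2f)" % (w.word, w.start_sec, w.confidence))
```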

4. Convert to SRT
Subtitles are commonly stored in SRT (SubRip subtitle) format. Here's a snippet of an example .srt file:

0
00:00:08,576 --> 00:00:11,711
it's important for you to know how to mix your own colors to make your color
...

Next, the transcription should be broken into sections.

When the silence between two words is long enough, we treat it as an endpoint and start a new section. In other words, the speaker is done talking and someone (the same or a different person) will continue talking later.
Each section should also contain only a limited number of words to avoid crowding the screen.

Implement these two rules:

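from typing import Optional, Sequence

import pvleopard


def second_to_timecode(x: float) -> str:
    # Work in whole milliseconds so floating-point error cannot drop a millisecond.
    total_ms = int(round(x * 1000))
    hour, total_ms = divmod(total_ms, 3_600_000)
    minute, total_ms = divmod(total_ms, 60_000)
    second, millisecond = divmod(total_ms, 1000)

    return '%02d:%02d:%02d,%03d' % (hour, minute, second, millisecond)


def to_srt(
        words: Sequence[pvleopard.Leopard.Word],
        endpoint_sec: float = 1.,
        length_limit: Optional[int] = 16) -> str:
    def _helper(end: int) -> None:
        # Emit one SRT block covering words[start] through words[end].
        lines.append("%d" % section)
        lines.append(
            "%s --> %s" %
            (
                second_to_timecode(words[start].start_sec),
                second_to_timecode(words[end].end_sec)
            )
        )
        lines.append(' '.join(x.word for x in words[start:(end + 1)]))
        lines.append('')

    # Nothing to write if the transcript is empty.
    if len(words) == 0:
        return ''

    lines = list()
    section = 0
    start = 0
    for k in range(1, len(words)):
        # Close the current section after a long silence or once it reaches the word limit.
        if ((words[k].start_sec - words[k - 1].end_sec) >= endpoint_sec) or \
                (length_limit is not None and (k - start) >= length_limit):
            _helper(k - 1)
            start = k
            section += 1
    # Flush the final section.
    _helper(len(words) - 1)

    return '\n'.join(lines)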
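
To see how the silence rule and the word-limit rule split a transcript, here is a simplified, self-contained sketch that only groups words into sections and does not produce SRT text. The `Word` named tuple and `split_sections` are hypothetical stand-ins so that it runs without pvleopard, and the thresholds are chosen to keep the example small.

```python
from typing import List, NamedTuple, Optional, Sequence


class Word(NamedTuple):
    # Stand-in for Leopard's word metadata (only the fields used here).
    word: str
    start_sec: float
    end_sec: float


def split_sections(
        words: Sequence[Word],
        endpoint_sec: float = 1.0,
        length_limit: Optional[int] = 16) -> List[List[str]]:
    sections: List[List[str]] = []
    current: List[str] = []
    for k, w in enumerate(words):
        if current:
            # Silence between the previous word and this one.
            gap = w.start_sec - words[k - 1].end_sec
            too_long = length_limit is not None and len(current) >= length_limit
            if gap >= endpoint_sec or too_long:
                sections.append(current)
                current = []
        current.append(w.word)
    if current:
        sections.append(current)
    return sections


words = [
    Word("mix", 0.0, 0.3),
    Word("your", 0.4, 0.6),
    Word("own", 0.7, 0.9),
    Word("colors", 2.5, 3.0),  # 1.6 s of silence before this word
]
print(split_sections(words, endpoint_sec=1.0, length_limit=2))
# [['mix', 'your'], ['own'], ['colors']]
```

The first split comes from the word limit and the second from the silence before "colors".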

5. Save the SRT File
Last but not least, save the file:

with open(subtitle_path, 'w', encoding='utf-8') as f:
    f.write(to_srt(words))

Voila!

You can see the full article here: https://picovoice.ai/blog/how-to-create-subtitles-for-any-video-with-python/

Top comments (1)

wardah • Dec 16 '23

This code gives an error for words: Sequence[pvleopard.Leopard.Word],

Sequence is not recognized. What is it? Which library do we need to import for it?
