DEV Community

Use Free Software to collect language study material

Sérgio Araújo
I am a Free Software enthusiast and a (neo)?vim addict. I also like shell script, sed, awk, and, as you can see, I love Regular Expressions.
Updated on ・3 min read

Finding good content

Everybody knows, or at least has heard, that TV series are great for learning a second language, and there are countless other techniques we can use for this purpose.

Among the many ways of studying English, my target language, one technique I like uses short pieces of TV series rather than whole episodes, because the nature of sitcoms allows us to do so.

Downloading the transcriptions

On the site https://subscene.com/subtitles/searchbytitle you can download entire seasons. In my case I use The Big Bang Theory. You can also change this link to download specific seasons. The idea here is to have the transcriptions for all seasons.

Once you have them, you can use grep or Ag to filter the dialogs you want to study.

Convert srt to txt

It is easier to convert existing srt files than to find txt ones. After unpacking, just run these two commands:

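# delete the subtitle counter lines (digits only, with an optional trailing CR)
sed -i '/^[0-9]\+\r\?$/d' *.srt
# delete the timestamp lines such as 00:00:01,000 --> 00:00:02,500
sed -i '/^[0-9:,]\+ --> /d' *.srt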
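As a rough sketch of what this cleanup does, the function below prints only the dialog lines of a single file and leaves the original untouched. The name srt_to_text is made up for this example, and it assumes the usual srt layout: a counter line, a timestamp line containing -->, then the dialog, then a blank line.

```shell
# srt_to_text is a hypothetical helper name, not part of any tool.
# Assumed block layout: counter, "start --> end", dialog text, blank line.
# Step 1: tr drops carriage returns (Windows line endings are common in srt files).
# Step 2: sed deletes counter lines, timestamp lines and empty lines.
srt_to_text() {
  tr -d '\r' < "$1" |
    sed -e '/^[0-9][0-9]*$/d' \
        -e '/^[0-9][0-9:,]* --> /d' \
        -e '/^$/d'
}
```

You could call it as srt_to_text episode.srt > episode.txt, where episode.srt stands for any of your subtitle files. Because a line is only dropped when it is nothing but digits or carries a timestamp arrow, a dialog line that happens to start with a number is kept.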

Renaming all files

Let's say you have this pattern:

The Big Bang Theory S01E01 Pilot (1080p H265 Joy).srt
The Big Bang Theory S01E02 The Big Bran Hypothesis (1080p H265 Joy).srt
The Big Bang Theory S01E03 The Fuzzy Boots Corollary (1080p H265 Joy).srt
The Big Bang Theory S01E04 The Luminous Fish Effect (1080p H265 Joy).srt

Open the directory with the ranger file manager, select the files with V, and type:

:bulkrename

To keep only the S01E01 part of each name, do this:

:%norm dtS
:%norm whdt.
:%s/srt/txt

The first command deletes everything before the first capital "S". The second command jumps to the second word, moves one character back, and deletes up to (but not including) the next dot. The third command replaces the srt extension with txt.

Renaming files using perl-rename

perl-rename -n 's/.*(S\d{2}E\d{2}).*\.(srt)$/$1.txt/' *.srt

Note: the -n option makes perl-rename run in dry-run mode. As soon as you see the proper output, remove it.
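If perl-rename is not available, the same renaming can be sketched with a plain shell loop. This is only an illustration: short_name, rename_subtitles and the DO_RENAME switch are names invented here, and the sketch assumes every file name contains exactly one SxxExx code.

```shell
# Hypothetical helper: map a long subtitle file name to its SxxExx.txt name.
# Prints nothing when the name has no SxxExx code or does not end in .srt.
short_name() {
  printf '%s\n' "$1" |
    sed -n 's/.*\(S[0-9][0-9]E[0-9][0-9]\).*\.srt$/\1.txt/p'
}

# Hypothetical helper: dry run by default, like perl-rename -n.
# Set DO_RENAME=1 to really rename the files in the current directory.
rename_subtitles() {
  for f in *.srt; do
    [ -e "$f" ] || continue          # no .srt files at all
    new=$(short_name "$f")
    [ -n "$new" ] || continue        # skip names without an SxxExx code
    if [ "${DO_RENAME:-0}" = 1 ]; then
      mv -n -- "$f" "$new"           # -n: never overwrite an existing file
    else
      printf '%s -> %s\n' "$f" "$new"
    fi
  done
}
```

Run rename_subtitles first to check the planned names, then DO_RENAME=1 rename_subtitles once the output looks right.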

Proper way to install perl-rename

Many manuals refer to rename as perl-rename, but there are differences between them.
To get the real rename, follow these instructions:

cpan
cpan[1]> install File::Rename

sudo ln -s /usr/local/bin/rename /usr/bin/rename.pl

So, I find a clip of a TV series on YouTube, let's say this one, hit play, and try to catch a piece of the dialog.

At the beginning of the above scene we have the line:

"I can't believe it"

So then, to figure out which season and episode has that dialog just type:

ag "I can't believe it" . -il

If you are in the transcriptions folder you will see:

S07E01.txt
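If ag is not installed, plain grep can do the same search. The sketch below is only an illustration: find_episode and show_context are names invented here, and it assumes a grep that supports -r (both GNU and BSD grep do).

```shell
# Hypothetical helper: list the transcript files that contain a phrase.
# -r recurse, -i ignore case, -l print file names only, -F treat the phrase literally.
# $1 = phrase, $2 = folder to search (defaults to the current one).
find_episode() {
  grep -rilF -- "$1" "${2:-.}"
}

# Hypothetical helper: show the phrase inside one file with surrounding lines.
# $1 = phrase, $2 = file, $3 = number of context lines (defaults to 3).
show_context() {
  grep -inF -C "${3:-3}" -- "$1" "$2"
}
```

For example, find_episode "I can't believe it" lists the matching files, and show_context "I can't believe it" S07E01.txt prints the line numbers around the match, which helps to locate the dialog before opening the file.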

Now open this file with vim:

1 - Mark the dialog start with ma
2 - Mark the dialog end with mb
3 - Copy the region to the clipboard with :'a,'b y+
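
Outside vim, the same region can be cut out by line number. This is a sketch, assuming you already know the first and last line of the dialog (for example from grep -n). The name copy_region is hypothetical, and the function only prints the region, so you can redirect it to a file or pipe it to whatever clipboard tool your system has.

```shell
# Hypothetical helper: print an inclusive range of lines from a file.
# $1 = file, $2 = first line number, $3 = last line number.
copy_region() {
  sed -n "${2},${3}p" "$1"
}
```

For instance, copy_region S07E01.txt 120 135 > scene.txt would save lines 120 to 135; the numbers and the scene.txt name are placeholders.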

Save your link and dialog in Google Keep. After studying the scene, you can practice repeated listening to memorize the way natives speak in real conversations.

Use your YouTube history and download your audio with Termux

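If you have the youtube-dl app on your phone, just type the following, replacing link with the video URL (mp3 is just one of the supported audio formats):

youtube-dl -x --audio-format mp3 link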
