Alan Allard for Eyevinn Video Dev-Team Blog

Working with audio in ffmpeg - Part 2

Last year I examined the basics of working with audio in ffmpeg and promised a follow-up in which I demonstrate how to use LV2 plugins to further extend the possibilities for audio manipulation in every video streaming professional's favourite command line tool. And here, at last, is Part 2 of "Working with audio in ffmpeg"!

This demonstration, similarly to last time, uses a short song excerpt which I have made available for you on freesound.org.

To pick up where we left off, we established that ffmpeg already has a wealth of audio manipulation tools built right in as audio filters. There is however much more that you can do without too much effort. By using LV2 plugins, for example, you can take advantage of any number of free plugins and even code your own.

LV2 plugins are the second generation of an earlier plugin system known as LADSPA; LV2 stands for LADSPA Version 2. It's worth noting that LADSPA (version 1) plugins are also usable inside ffmpeg (this may well be a topic for a future blog too ;) ) and there are plenty of those available too, of course.

You may find that LV2 is not immediately available within your current ffmpeg build. In that case, you will need to build from source. Luckily, this process is well-documented in ffmpeg. I am on macOS so I tried my hand at following those specific instructions, detailed here. They were fairly straightforward, so I may as well document them here. This is also the process for enabling any of the other filters that are not included by default in any particular build. The documentation will usually indicate when this is necessary. There will be occasions when the documentation falls short though, as we will see, so be aware of that.

If you, like me, already had a version of ffmpeg on your machine, getting rid of it is probably the best option. For me, that meant doing a brew uninstall ffmpeg.

Next, grab the ffmpeg source from the project's Git repository:

git clone https://git.ffmpeg.org/ffmpeg.git ffmpeg

Then you'll want to get all the dependencies needed to build ffmpeg. I chose the simplest option in my case: installing them via Homebrew:

brew install automake fdk-aac git lame libass libtool libvorbis libvpx \
opus sdl shtool texi2html theora wget x264 x265 xvid nasm

then:
cd ffmpeg

At which point it's probably best to try a vanilla build with the following trinity of commands:

./configure
which will take a while then:
make
which can also take a bit then:
make install
which usually doesn't take too long...

If all is well, and you can run something like, say:

ffmpeg -filters

and see a nice long list of filters and their features, then we're ready to set up LV2!

First we also need to install lv2 from Homebrew:

brew install lv2

To use LV2 plugins then, we need to build with the --enable-lv2 flag:

./configure --enable-lv2

then once again:

make
and
make install

Now here you may run into an issue when running configure:

ERROR: lilv-0 not found using pkg-config

Don't stress, though. Have a sip of tea/beer and rest easy in the knowledge that all you need to do is another brew install:

brew install lilv
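
If you would rather check for the dependency before kicking off another configure run, a small sketch along these lines might help. The check_modules function is a hypothetical helper (not part of ffmpeg or Homebrew), and it assumes pkg-config is on your PATH; the module name lilv-0 is simply the one from the error message above.

```shell
# Report whether each named pkg-config module is visible to the build.
# PKG_CONFIG may be overridden; it defaults to the pkg-config on your PATH.
# check_modules is a hypothetical helper name, not an ffmpeg or Homebrew tool.
check_modules() {
  missing=0
  for mod in "$@"; do
    if "${PKG_CONFIG:-pkg-config}" --exists "$mod" 2>/dev/null; then
      echo "found:   $mod"
    else
      echo "missing: $mod"
      missing=1
    fi
  done
  return "$missing"
}

# lilv-0 is the module named in the configure error above.
check_modules lilv-0 || echo "try: brew install lilv"
```

The function returns non-zero if anything is missing, so it can also gate a build script.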

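so after this, re-run configure so that it can find lilv, then build and install again:

./configure --enable-lv2
then
make
and
make install

...by which point you'll have had time to finish your tea/beer and can get right down to testing LV2 plugins...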
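
Before testing any plugins, it may be worth confirming that the lv2 filter actually made it into the new build. The sketch below assumes the usual column layout of the ffmpeg -filters listing (flags, then filter name, then inputs/outputs); has_filter is a hypothetical helper name.

```shell
# Succeed if a filter listing (as printed by `ffmpeg -filters`) on stdin
# contains the filter named in $1. Assumes the name is the second column.
# has_filter is a hypothetical helper name.
has_filter() {
  awk -v name="$1" '$2 == name { found = 1 } END { exit !found }'
}

# Only run the check when ffmpeg is actually installed.
if command -v ffmpeg >/dev/null 2>&1; then
  if ffmpeg -hide_banner -filters 2>/dev/null | has_filter lv2; then
    echo "lv2 filter available"
  else
    echo "lv2 filter missing: rebuild with --enable-lv2"
  fi
fi
```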

Let's dive into an example in the ffmpeg LV2 filter documentation:

ffmpeg -y -i "a different day intro excerpt.wav" -af \
'lv2=p=http\\://calf.sourceforge.net/plugins/Vinyl:c=drone=0.2|aging=0.5' \
vinylizedADD.wav

BUT - notice that I found it necessary to alter the escaped characters in the URL for it to be correctly formed. So that's one thing to think about. The other is that this doesn't work at all...because you need to install the Vinyl plugin!
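
On the escaping point: the colon in the plugin URI would otherwise be read by ffmpeg as an option separator, and ffmpeg applies two levels of unescaping to a filtergraph string, which is why two backslashes are needed inside single quotes. If typing that by hand gets tedious, a small helper can do it. This is only a sketch; lv2_filter is a hypothetical function, not something ffmpeg provides.

```shell
# Build an lv2 filter string from a plugin URI and name=value controls.
# lv2_filter is a hypothetical helper, not part of ffmpeg.
lv2_filter() {
  uri=$1
  shift
  # Turn each ':' in the URI into '\\:' (two backslashes, then the colon).
  escaped=$(printf '%s' "$uri" | sed 's/:/\\\\:/g')
  # Join the remaining arguments with '|', the separator the lv2 filter expects.
  controls=""
  for c in "$@"; do
    controls="${controls:+$controls|}$c"
  done
  printf 'lv2=p=%s:c=%s\n' "$escaped" "$controls"
}

lv2_filter http://calf.sourceforge.net/plugins/Vinyl drone=0.2 aging=0.5
# prints: lv2=p=http\\://calf.sourceforge.net/plugins/Vinyl:c=drone=0.2|aging=0.5
```

The result can then be passed along as -af "$(lv2_filter ...)", with the double quotes keeping the backslashes intact.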

The Vinyl plugin, as you may have guessed from the url, is part of the Calf plugin suite, an impressive selection of free - in all senses - professional audio tools. Let's grab them:

brew tap david0/homebrew-audio
and
brew install calf

Now you will be able to run the last ffmpeg command and hear a vinyl effect on the audio excerpt. Nice! Let's immediately make that more drastic with our own version.

I noticed in the source code documentation that the plugin GUI has several other parameters, some of them controlled by switches. So how on earth would we access those in the command line? Let's find out more about our parameters with:

ffmpeg -y -i "a different day intro excerpt.wav" -af \
'lv2=p=http\\://calf.sourceforge.net/plugins/Vinyl:c=help' \
vinylizedADD.wav

Which impolitely - but, admittedly, helpfully - spews out the following:

The 'http://calf.sourceforge.net/plugins/Vinyl' plugin has the following input controls:
[Parsed_lv2_0 @ 0x7faa2c005d00] bypass          <float> (from 0.000000 to 1.000000) (default 0.000000)             Bypass
[Parsed_lv2_0 @ 0x7faa2c005d00] level_in                <float> (from 0.015625 to 64.000000) (default 1.000000)            Input Gain
[Parsed_lv2_0 @ 0x7faa2c005d00] level_out               <float> (from 0.015625 to 64.000000) (default 1.000000)            Output Gain
[Parsed_lv2_0 @ 0x7faa2c005d00] drone           <float> (from 0.000000 to 1.000000) (default 0.000000)             Drone
[Parsed_lv2_0 @ 0x7faa2c005d00] speed           <float> (from 33.000000 to 78.000000) (default 33.000000)          Speed
[Parsed_lv2_0 @ 0x7faa2c005d00] aging           <float> (from 0.000000 to 1.000000) (default 0.000000)             Aging
[Parsed_lv2_0 @ 0x7faa2c005d00] freq            <float> (from 600.000000 to 1800.000000) (default 1000.000000)             Frequency
[Parsed_lv2_0 @ 0x7faa2c005d00] gain0           <float> (from 0.000016 to 1.000000) (default 0.007812)             Vol 0
[Parsed_lv2_0 @ 0x7faa2c005d00] pitch0          <float> (from -1.000000 to 1.000000) (default 0.000000)            Pitch 0
[Parsed_lv2_0 @ 0x7faa2c005d00] active0         <float> (from 0.000000 to 1.000000) (default 0.000000)             Activate 0
[Parsed_lv2_0 @ 0x7faa2c005d00] gain1           <float> (from 0.000016 to 1.000000) (default 0.007812)             Vol 1
[Parsed_lv2_0 @ 0x7faa2c005d00] pitch1          <float> (from -1.000000 to 1.000000) (default 0.000000)            Pitch 1
[Parsed_lv2_0 @ 0x7faa2c005d00] active1         <float> (from 0.000000 to 1.000000) (default 0.000000)             Activate 1
[Parsed_lv2_0 @ 0x7faa2c005d00] gain2           <float> (from 0.000016 to 1.000000) (default 0.015625)             Vol 2
[Parsed_lv2_0 @ 0x7faa2c005d00] pitch2          <float> (from -1.000000 to 1.000000) (default 0.000000)            Pitch 2
[Parsed_lv2_0 @ 0x7faa2c005d00] active2         <float> (from 0.000000 to 1.000000) (default 0.000000)             Activate 2
[Parsed_lv2_0 @ 0x7faa2c005d00] gain3           <float> (from 0.000016 to 1.000000) (default 0.007812)             Vol 3
[Parsed_lv2_0 @ 0x7faa2c005d00] pitch3          <float> (from -1.000000 to 1.000000) (default 0.000000)            Pitch 3
[Parsed_lv2_0 @ 0x7faa2c005d00] active3         <float> (from 0.000000 to 1.000000) (default 0.000000)             Activate 3
[Parsed_lv2_0 @ 0x7faa2c005d00] gain4           <float> (from 0.000016 to 1.000000) (default 0.031250)             Vol 4
[Parsed_lv2_0 @ 0x7faa2c005d00] pitch4          <float> (from -1.000000 to 1.000000) (default 0.000000)            Pitch 4
[Parsed_lv2_0 @ 0x7faa2c005d00] active4         <float> (from 0.000000 to 1.000000) (default 0.000000)             Activate 4
[Parsed_lv2_0 @ 0x7faa2c005d00] gain5           <float> (from 0.000016 to 1.000000) (default 0.062500)             Vol 5
[Parsed_lv2_0 @ 0x7faa2c005d00] pitch5          <float> (from -1.000000 to 1.000000) (default 0.000000)            Pitch 5
[Parsed_lv2_0 @ 0x7faa2c005d00] active5         <float> (from 0.000000 to 1.000000) (default 0.000000)             Activate 5
[Parsed_lv2_0 @ 0x7faa2c005d00] gain6           <float> (from 0.000016 to 1.000000) (default 0.062500)             Vol 6
[Parsed_lv2_0 @ 0x7faa2c005d00] pitch6          <float> (from -1.000000 to 1.000000) (default 0.000000)            Pitch 6
[Parsed_lv2_0 @ 0x7faa2c005d00] active6         <float> (from 0.000000 to 1.000000) (default 0.000000)             Activate 6
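
That listing is a lot to scan by eye. As a rough sketch, it can be boiled down to name, minimum, maximum and default with sed, assuming the log lines keep the shape shown above. parse_lv2_controls is a hypothetical helper name; ffmpeg logs to stderr, so in real use you would pipe the help command through it with 2>&1.

```shell
# Reduce lv2 "c=help" log lines on stdin to: name min max default
# parse_lv2_controls is a hypothetical helper; it assumes the line layout shown above.
parse_lv2_controls() {
  sed -n 's/^\[[^]]*][[:space:]]*\([A-Za-z0-9_]*\)[[:space:]]*<[a-z]*>[[:space:]]*(from \([^ ]*\) to \([^)]*\))[[:space:]]*(default \([^)]*\)).*/\1 \2 \3 \4/p'
}

# Demo on two lines copied from the listing above.
parse_lv2_controls <<'EOF'
[Parsed_lv2_0 @ 0x7faa2c005d00] drone           <float> (from 0.000000 to 1.000000) (default 0.000000)             Drone
[Parsed_lv2_0 @ 0x7faa2c005d00] speed           <float> (from 33.000000 to 78.000000) (default 33.000000)          Speed
EOF
# prints:
# drone 0.000000 1.000000 0.000000
# speed 33.000000 78.000000 33.000000
```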

So let's try that more drastic version of the same:

ffmpeg -y -i "a different day intro excerpt.wav" -af \
'lv2=p=http\\://calf.sourceforge.net/plugins/Vinyl:c=drone=1.0|aging=1.0' \
vinylizedADD.wav

Hmm, needs to be louder (not much though, those resonant frequencies can be a bit nasty). And let's add a speed change too (this changes the LFO speed of the drone, in fact).

ffmpeg -y -i "a different day intro excerpt.wav" -af \
'lv2=p=http\\://calf.sourceforge.net/plugins/Vinyl:c=drone=1.0|aging=1.0|level_out=2.0|speed=45.0' \
vinylizedADD.wav

So what about these several gain, pitch and activate parameters? It seems from examining the GUI of this plugin that these are basically seven file players with various different vinyl-style effects supplied as looping audio files. Let's add a couple:

ffmpeg -y -i "a different day intro excerpt.wav" -af \
'lv2=p=http\\://calf.sourceforge.net/plugins/Vinyl:c=drone=0.6|aging=0.6|level_out=2.0|speed=45.0|active5=1.0|gain5=1.0|pitch5=1.0|active6=1.0|gain6=1.0|pitch6=-1.0' \
vinylizedADD.wav

Oddly enough, those audio effects don't loop. Why is that? Well, who knows...but you can track that bug here. From reading that, it transpires that one of them does actually loop correctly. By trial and error, I found it - gain/pitch/active1:

ffmpeg -y -i "a different day intro excerpt.wav" -af \
'lv2=p=http\\://calf.sourceforge.net/plugins/Vinyl:c=drone=0.6|aging=0.6|level_out=2.0|speed=45.0|active1=1.0|gain1=0.1|pitch1=1.0' \
vinylizedADD.wav
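
The trial and error could also be scripted. The sketch below renders one file per player so that each can be auditioned in turn; the player_filter function and the playerN.wav output names are hypothetical, and the gain and pitch values are arbitrary picks for listening.

```shell
# The URI is single-quoted so both backslashes reach ffmpeg untouched.
plugin='http\\://calf.sourceforge.net/plugins/Vinyl'
input="a different day intro excerpt.wav"

# Print a filter string that switches on only player number $1.
# player_filter is a hypothetical helper; gain and pitch are arbitrary picks.
player_filter() {
  printf 'lv2=p=%s:c=active%s=1.0|gain%s=0.1|pitch%s=1.0\n' "$plugin" "$1" "$1" "$1"
}

for n in 0 1 2 3 4 5 6; do
  if command -v ffmpeg >/dev/null 2>&1 && [ -f "$input" ]; then
    # Render one file per player (hypothetical output names).
    ffmpeg -y -i "$input" -af "$(player_filter "$n")" "player$n.wav"
  else
    # No ffmpeg or no input file here: just show the filter string.
    player_filter "$n"
  fi
done
```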

Well that's lovely, how about some reverb?

ffmpeg -y -i "a different day intro excerpt.wav" -af \
'lv2=p=http\\://calf.sourceforge.net/plugins/Reverb:c=help' \
vinylizedADD.wav

Ok then, these are the params this time:

The 'http://calf.sourceforge.net/plugins/Reverb' plugin has the following input controls:
[Parsed_lv2_0 @ 0x7f954b00c700] decay_time              <float> (from 0.400000 to 15.000000) (default 1.500000)            Decay time
[Parsed_lv2_0 @ 0x7f954b00c700] hf_damp         <float> (from 2000.000000 to 20000.000000) (default 5000.000000)           High Frq Damp
[Parsed_lv2_0 @ 0x7f954b00c700] room_size               <float> (from 0.000000 to 5.000000) (default 2.000000)             Room size
[Parsed_lv2_0 @ 0x7f954b00c700] diffusion               <float> (from 0.000000 to 1.000000) (default 0.500000)             Diffusion
[Parsed_lv2_0 @ 0x7f954b00c700] amount          <float> (from 0.000000 to 2.000000) (default 0.250000)             Wet Amount
[Parsed_lv2_0 @ 0x7f954b00c700] dry             <float> (from 0.000000 to 2.000000) (default 1.000000)             Dry Amount
[Parsed_lv2_0 @ 0x7f954b00c700] predelay                <float> (from 0.000000 to 500.000000) (default 0.000000)           Pre Delay
[Parsed_lv2_0 @ 0x7f954b00c700] bass_cut                <float> (from 20.000000 to 20000.000000) (default 300.000000)              Bass Cut
[Parsed_lv2_0 @ 0x7f954b00c700] treble_cut              <float> (from 20.000000 to 20000.000000) (default 5000.000000)             Treble Cut
[Parsed_lv2_0 @ 0x7f954b00c700] on              <float> (from 0.000000 to 1.000000) (default 1.000000)             Active
[Parsed_lv2_0 @ 0x7f954b00c700] level_in                <float> (from 0.015625 to 64.000000) (default 1.000000)            Input Gain
[Parsed_lv2_0 @ 0x7f954b00c700] level_out               <float> (from 0.015625 to 64.000000) (default 1.000000)            Output Gain

So how about:

ffmpeg -y -i "a different day intro excerpt.wav" -af \
'lv2=p=http\\://calf.sourceforge.net/plugins/Reverb:c=amount=1.0|room_size=5.0|hf_damp=2000.0|predelay=5.0|diffusion=1.0' \
reverbedADD.wav

It would be nice to hear the reverb tail on that, but I haven't yet given thought to how to do that. (Please let me know if you work it out). There is a lot more to explore within just the Calf audio suite, let alone all of the other free alternatives out there. Perhaps one day you might find you need one of these tools, be it for a quick fix or a professional editing repair...
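
On the reverb tail, one possible approach (a sketch only, not something tested for this post) is to append silence to the input with ffmpeg's built-in apad filter before the audio reaches the reverb, so that the plugin has something to ring out into. It assumes a build whose apad filter supports the pad_dur option; the five-second length and the reverbedTail.wav name are arbitrary choices.

```shell
# The URI is single-quoted so both backslashes reach ffmpeg untouched.
plugin='http\\://calf.sourceforge.net/plugins/Reverb'
controls='amount=1.0|room_size=5.0|hf_damp=2000.0|predelay=5.0|diffusion=1.0'
tail_seconds=5   # assumed length of silence to append; adjust to taste
input="a different day intro excerpt.wav"

# apad comes first in the chain, so the reverb receives the extra silence.
filter="apad=pad_dur=${tail_seconds},lv2=p=${plugin}:c=${controls}"
printf '%s\n' "$filter"

# Only render when ffmpeg and the input file are actually present.
if command -v ffmpeg >/dev/null 2>&1 && [ -f "$input" ]; then
  ffmpeg -y -i "$input" -af "$filter" reverbedTail.wav
fi
```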

One final note: the image for this blog post is a frequency plot generated from exactly the audio excerpt that I have been using in this blog post. Naturally, this was also generated within ffmpeg:

ffmpeg -i "a different day intro excerpt.wav" -lavfi \
showspectrumpic=s=1280x480:scale=log:color=fiery:legend=0 add.png

(Chances are I'll look closer at showspectrumpic in a future blog post...)

Alan Allard is a developer at Eyevinn Technology, the leading independent European consultancy firm specializing in video technology and media distribution.

If you need assistance in the development and implementation of this, our team of video developers are happy to help out. If you have any questions or comments just drop us a line in the comments section to this post.
