Donald Feury

Posted on May 1, 2020 • Edited on Jun 29, 2021 • Originally published at donaldfeury.xyz

Concatenate Videos Together Using ffmpeg!

#ffmpeg #linux #productivity #video

I have found it very useful to concatenate multiple video files together after working on them separately. It turns out that this is rather simple to do with ffmpeg.

How do we do this?

There are three methods I have found thus far:

Using the concat demuxer approach
- This method is very fast as it avoids transcoding
- This method only works reliably if the files share the same video and audio encoding (codec, resolution, frame rate, and so on), otherwise artifacts will be introduced
Using file level concatenation approach
- Some container formats support file level concatenation, kind of like just using cat on two files in the terminal
- Very few formats can do this; the only one I've used is the MPEG-2 Transport Stream format (.ts)
Using a complex filtergraph with the concat filter
- This method can concat videos with different encodings
- This causes transcoding to occur, so it takes time and may degrade quality
- The syntax is hard to understand if you've never written complex filtergraphs for ffmpeg before

Let's look at the examples, starting with the concat demuxer approach:

ffmpeg -f concat -i list.txt -c copy out.mp4

Unlike most ffmpeg commands, this one takes a text file listing the files we want to concatenate. The text file would look something like this:

file 'video1.mp4'
file 'video2.mp4'
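With many clips, writing the list by hand gets tedious. The following is a minimal sketch that generates it from a list of file names. The helper name make_concat_list is hypothetical (it is not part of ffmpeg), and the sketch assumes the file names contain no single quotes.

```shell
#!/bin/sh
# make_concat_list (hypothetical helper, not part of ffmpeg):
# writes one "file '<name>'" line per argument into the list file.
# Assumption: file names do not contain single quotes.
make_concat_list() {
    list=$1
    shift
    : > "$list"                      # create or truncate the list file
    for f in "$@"; do
        printf "file '%s'\n" "$f" >> "$list"
    done
}

# Example: list two clips in the order they should be joined.
make_concat_list list.txt video1.mp4 video2.mp4

# The list can then be passed to the concat demuxer:
#   ffmpeg -f concat -i list.txt -c copy out.mp4
```

Relative paths in the list are resolved relative to the list file. If ffmpeg rejects an entry as an unsafe file name (for example an absolute path), adding -safe 0 before -i is the usual workaround.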

The example for the file level concatenation would look like this:

ffmpeg -i "concat:video1.ts|video2.ts" -c copy out.ts
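The input string here is just the file names joined with |. As a rough sketch, a small helper can build it for any number of .ts files. The name concat_url and the file video3.ts are hypothetical and used only for illustration.

```shell
#!/bin/sh
# concat_url (hypothetical helper): joins its arguments into the
# "concat:a.ts|b.ts|..." form used by ffmpeg's concat protocol.
concat_url() {
    url="concat:$1"
    shift
    for f in "$@"; do
        url="$url|$f"
    done
    printf '%s\n' "$url"
}

concat_url video1.ts video2.ts video3.ts
# prints: concat:video1.ts|video2.ts|video3.ts

# Possible usage:
#   ffmpeg -i "$(concat_url video1.ts video2.ts video3.ts)" -c copy out.ts
```

If the sources are MP4 files with H.264 video, they can usually be remuxed to .ts first without re-encoding, for example with ffmpeg -i video1.mp4 -c copy -bsf:v h264_mp4toannexb -f mpegts video1.ts.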

And the last example, using the concat filter, would look like this:

ffmpeg -i video1.mp4 -i video2.flv -filter_complex \
"[0:v][0:a][1:v][1:a] concat=n=2:v=1:a=1 [outv] [outa]" \
-map "[outv]" -map "[outa]" out.mp4

This one is probably pretty confusing, so let me explain the complex filtergraph syntax:

Unlike simple filters applied with -vf or -af, a complex filtergraph requires us to tell ffmpeg which streams of data each filter operates on.

At the start you see:

[0:v][0:a][1:v][1:a]

This translates in plain English to:

Use the video stream of the first input source, use the audio stream from the first input source, use the video stream from the second input source, and use the audio stream from the second input source.

The square bracket syntax indicates:

[index_of_input:stream_type]

Those of us with experience in programming will understand why the index starts at 0 and not 1

After declaring which streams we are using, we have the normal filter syntax:

concat=n=2:v=1:a=1

concat is the name of the filter

n=2 is specifying there are two input sources

v=1 indicates that each input source has one video stream and that one video stream should be written as output

a=1 indicates that each input source has one audio stream and that one audio stream should be written as output

Next, we label the streams of data created by the filter using the bracket syntax:

[outv] [outa]

Here, we are calling the newly created video stream outv and the audio stream outa. We need these labels later when using the -map flag on the output.

Lastly, we need to explicitly tell ffmpeg which streams of data to map to the output file, using the -map option:

-map "[outv]" -map "[outa]"

Those names look familiar? They're what we labeled the streams created from the concat filter. We are telling ffmpeg:

Don't use the streams directly from the input files, instead use these data streams created by a filtergraph.

And with that, ya let it run and tada, you have concatenated two videos with completely different encodings, hurray!
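The same pattern extends to more than two inputs: one [i:v][i:a] pair per input, and n set to the number of inputs. Here is a minimal sketch that builds the filtergraph string for n inputs. The helper name concat_filter and the third input file are hypothetical, and the sketch assumes every input has exactly one video and one audio stream.

```shell
#!/bin/sh
# concat_filter (hypothetical helper): prints the filter_complex string
# that concatenates N inputs, each with one video and one audio stream.
concat_filter() {
    n=$1
    pads=""
    i=0
    while [ "$i" -lt "$n" ]; do
        pads="${pads}[${i}:v][${i}:a]"   # one video/audio pair per input
        i=$((i + 1))
    done
    printf '%sconcat=n=%s:v=1:a=1[outv][outa]\n' "$pads" "$n"
}

concat_filter 3
# prints: [0:v][0:a][1:v][1:a][2:v][2:a]concat=n=3:v=1:a=1[outv][outa]

# Possible usage (video3.mkv is only an illustrative file name):
#   ffmpeg -i video1.mp4 -i video2.flv -i video3.mkv \
#     -filter_complex "$(concat_filter 3)" \
#     -map "[outv]" -map "[outa]" out.mp4
```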

Top comments (18)

gkarumba • Jul 8 '20 • Edited

Thank you for your awesome write, quick question is it possible to join a video to another at a specific time? Say for example video1.mp4 is 30s long and video2.mp4 is 10s long, I want to join video2 to video1 at exactly 00:00:20s. So in the final output the first 20s are from video1 then the next 10s from video2 and the final 10s from video1. The final video output should be 40s long

Donald Feury • Jul 9 '20

Try something like this and see what you get, this does a re-encode though:

ffmpeg -t 20 -i video1.mp4 -i video2.mp4 -ss 20 -i video1.mp4 -filter_complex \
"[0:v][0:a][1:v][1:a][2:v][2:a]concat=n=3:v=1:a=1[outv][outa]" \
-map "[outv]" -map "[outa]" final.mp4

-t before the first input indicates we only want 20 seconds of this stream
-ss before the third input indicates to start reading from the stream at twenty seconds in

There might be some overlap when the first section of video 1 ends and starts back up again after video 2. With most formats, ffmpeg can't seek exactly to what time you specify, only the closest seek point.

gkarumba • Jul 9 '20 • Edited

thank you, this works perfectly final question how can I introduce a fadeIn and fadeOut to the video being inserted?

Donald Feury • Jul 9 '20

Try something like this, might need tweaking.

This will add a fade out to the end of the first section, a fade in to the start of the second section, a fade out at the end of the second section, and finally a fade in to the start of the last section.

This should mimic the effect of an actual scene transition

ffmpeg -t 20 -i video1.mp4 -i video2.mp4 -ss 20 -i video1.mp4 -filter_complex \
"[0:v]fade=t=out:st=19.5:d=0.5[v0];
[1:v]fade=t=in:st=0:d=0.5,fade=t=out:st=9.5:d=0.5[v1];
[2:v]fade=t=in:st=0:d=0.5[v2];
[v0][0:a][v1][1:a][v2][2:a]concat=n=3:v=1:a=1[outv][outa]" \
-map "[outv]" -map "[outa]" final.mp4

You'll notice that the st value I passed to the fade filters for the fade out effects is basically equal to:

CLIP_LENGTH - DURATION_OF_FADE_OUT_EFFECT
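As a small sketch of that arithmetic, the start time can be computed in the shell rather than by hand. The helper name fade_out_start is hypothetical; the clip lengths are the ones from the example above.

```shell
#!/bin/sh
# fade_out_start (hypothetical helper): prints CLIP_LENGTH - FADE_DURATION,
# the value to pass as st= so the fade out finishes when the clip ends.
fade_out_start() {
    awk -v len="$1" -v dur="$2" 'BEGIN { printf "%.2f\n", len - dur }'
}

fade_out_start 20 0.5    # first 20 seconds of video1: prints 19.50
fade_out_start 10 0.5    # the 10 second video2:       prints 9.50

# If a clip's length is not known, ffprobe can report it in seconds:
#   ffprobe -v error -show_entries format=duration \
#           -of default=noprint_wrappers=1:nokey=1 video2.mp4
```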

venkatesanvp • Jul 24 '20

Thanks a lot, learnt various ways of joining videos , well explained. One question , I have 6 separate mp4 videos for each topics. Now wanted to joins all these videos in one mp4 with text and number displayed till end of that video, for example, first videos with text "Topic 1" till end of first video, once second videos start's text changes to "Topic 2" and ... so on.

Donald Feury • Jul 24 '20

Check out the drawtext filter that ffmpeg has, that should do the trick.
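One possible shape for this, as a sketch only: apply drawtext to each input's video stream before handing the labeled results to concat. The helper name topic_concat_filter, the label text, and the font size and position are all assumptions. It also assumes every clip has one video and one audio stream at the same resolution, and that the ffmpeg build includes drawtext with a usable default font (otherwise a fontfile= option is needed).

```shell
#!/bin/sh
# topic_concat_filter (hypothetical helper): prints a filtergraph that
# draws "Topic <k>" on the k-th input and then concatenates all N inputs.
topic_concat_filter() {
    n=$1
    chains=""
    pads=""
    i=0
    while [ "$i" -lt "$n" ]; do
        label=$((i + 1))
        # Overlay the caption on this input's video and label the result [v<i>].
        chains="${chains}[${i}:v]drawtext=text='Topic ${label}':fontsize=36:fontcolor=white:x=20:y=20[v${i}];"
        pads="${pads}[v${i}][${i}:a]"
        i=$((i + 1))
    done
    printf '%s%sconcat=n=%s:v=1:a=1[outv][outa]\n' "$chains" "$pads" "$n"
}

topic_concat_filter 2

# Possible usage with six clips (file names are illustrative):
#   ffmpeg -i t1.mp4 -i t2.mp4 -i t3.mp4 -i t4.mp4 -i t5.mp4 -i t6.mp4 \
#     -filter_complex "$(topic_concat_filter 6)" \
#     -map "[outv]" -map "[outa]" out.mp4
```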

Mystique Rheordan • Oct 9 '20

Input link in1:v0 parameters (size 480x640, SAR 1:1) do not match the corresponding output link in0:v0 parameters (640x360, SAR 1:1)

I get this error when I try to concatenate two webm files. How do I resolve this?

Donald Feury • Oct 9 '20

The videos aren't the same resolutions. You'll have to scale one of them to be the same resolution as the other.
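A rough sketch of one way to do that inside the same filtergraph: fit both inputs into a common frame size before the concat filter. The helper name scale_concat_filter is hypothetical, and 640x360 is taken from the error message as an assumed target size.

```shell
#!/bin/sh
# scale_concat_filter (hypothetical helper): builds a filtergraph that fits
# both inputs into WIDTHxHEIGHT before concatenating them.
# scale with force_original_aspect_ratio=decrease keeps the aspect ratio,
# pad centers the picture and fills the rest with black bars,
# setsar=1 gives both streams the same sample aspect ratio.
scale_concat_filter() {
    w=$1
    h=$2
    fit="scale=${w}:${h}:force_original_aspect_ratio=decrease,pad=${w}:${h}:(ow-iw)/2:(oh-ih)/2,setsar=1"
    printf '[0:v]%s[v0];[1:v]%s[v1];[v0][0:a][v1][1:a]concat=n=2:v=1:a=1[outv][outa]\n' "$fit" "$fit"
}

scale_concat_filter 640 360

# Possible usage (file names are illustrative):
#   ffmpeg -i video1.webm -i video2.webm \
#     -filter_complex "$(scale_concat_filter 640 360)" \
#     -map "[outv]" -map "[outa]" out.webm
```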

Ivy White • Nov 5 '20 • Edited

Thanks. I tried FFmpeg to merge split videos. I think it is hard for me. I am now using an easy alternative - Joyoshare Video Joiner to do video merging, which is recommended by my friend. She said that the merging software is easy to use and it merges files of the same format and codec in original quality. That's proven true after I try.

qiforra • Sep 20 '20 • Edited

Hello I tried following your method
ffmpeg -i intro.mp4 -i video.mkv -filter_complex "[0:v][0:a][1:v][1:a] concat=n=2:v=1:a=1 [outv] [outa]" -map "[outv]" -map "[outa]" out.mp4
And this is the result
Input link in0:v0 parameters (size 960x1280, SAR 16:9) do not match the corresponding output link in0:v0 parameters (720x720, SAR 16:9)

The first video is made with app intro maker, the second video is made with joining audio and 1 image (using ffmpeg too
ffmpeg -loop 1 -i image.jpg -i audio.ogg -preset ultrafast -c:v libx264 -tune stillimage -c:a aac -b:a 192k -pix_fmt yuv440p -shortest mentahan.mp4),
Before I concat videos using this
ffmpeg -f concat -i part3.txt -c copy out.mp4
But the uploaded result cant be played from telegram, but can be played with android video player
can you help me?

Joseph • Oct 1 '20

Awesome Tut man!!
So inside list.txt if you put n number of files, you can concat the files regardless of the format. I had to work with .wav and worked like a charm.
Thanks once again!!