Introduction
I have been playing around with a YouTube clone I call FooTube. I had set up video uploads to be sent from the browser straight to an AWS S3 bucket, so the video file never touched my Node backend. This made server-side video processing a non-starter, which put me in a dilemma: I wanted to generate 3 thumbnails for each video upload like the real YouTube does. I started thinking about creating a video player off-screen and using canvas to stream things around. While that might be possible, it didn't sound like fun, and that's not what I ended up doing.
The research began.
I discovered that YouTube uses deep neural networks to pick out thumbnails that display a subject, a face, or something else that draws attention. They also capture a thumbnail for every second of video and use an algorithm to rank each one. This interesting article written by the YouTube Creator team in 2015 explains further. At this point I decided that just getting 3 thumbnail images would be enough of a challenge for me - since I still had no clue what I was doing. 🤦♂️
Companion Video
Disclaimer
Please keep in mind this code is NOT meant to be a production-ready solution; it is more of an exploration or proof of concept. There are a lot of moving parts, and while I have managed to get this working in my local environment, I simply cannot guarantee it will work anywhere else! Sorry.
Lambda Functions
The first thing I found out was that I could use AWS Lambda to sort of outsource computations that might normally take place on a server. As a bonus, since I was already using S3, I could attach what amounts to an event listener to trigger my Lambda function when I uploaded a video file.
Creating a new Lambda function is straightforward. When prompted, choose to create a function from scratch and come up with a decent name; createThumbnail worked for me. Also, select the Node.js 8.10 runtime.
IAM Role Permissions
I had to create a new IAM role to execute this function. This can be done through a simple workflow in the IAM console. Name the role whatever you want, but give it the AWSLambdaExecute permission. This will allow for PUT and GET access to S3 and full access to CloudWatch Logs. These are all the permissions we need to execute and monitor our createThumbnail Lambda function. I had to add the ARN for this role to my bucket policy:
{
    "Sid": "Stmt**************",
    "Effect": "Allow",
    "Principal": {
        "AWS": [
            "arn:aws:iam::**********:role/LambdaRole"
        ]
    },
    "Action": [
        "s3:GetObject",
        "s3:PutObject"
    ],
    "Resource": "arn:aws:s3:::bucket/*"
}
Triggers
Next we need to configure the trigger for our function. We want to listen to the bucket we are uploading videos to and watch for the PUT method, since that is the method used to send the video. Optionally, you can set a prefix and/or suffix to narrow down the trigger. My function saves the thumbnails to this same bucket. In this case you might use a suffix of mp4 or webm (video formats). My videos were going to the user folder, so I set a prefix of user/ since this would be at the beginning of any key.
Once your function is created and its trigger configured, these settings will show up in the S3 bucket referenced by said trigger. In fact, they can be set from either the S3 or Lambda console. Click the Properties tab then the Events box in the S3 console to view events associated with a bucket.
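For reference, here is a rough sketch of what that same trigger looks like if you set it up with the AWS SDK instead of clicking through the console. The bucket name and function ARN below are placeholders, and note that this call replaces the bucket's existing notification configuration:
const AWS = require('aws-sdk')
const s3 = new AWS.S3()

// Assumed values - swap in your own bucket name and Lambda function ARN.
s3.putBucketNotificationConfiguration({
  Bucket: 'footube',
  NotificationConfiguration: {
    LambdaFunctionConfigurations: [
      {
        LambdaFunctionArn: 'arn:aws:lambda:us-west-1:**********:function:createThumbnail',
        Events: ['s3:ObjectCreated:Put'],
        Filter: {
          Key: {
            FilterRules: [
              { Name: 'prefix', Value: 'user/' },
              { Name: 'suffix', Value: 'mp4' }
            ]
          }
        }
      }
    ]
  }
}, (err) => {
  if (err) console.log(err)
})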
Getting Code to Lambda
There are a few ways to get code into our Lambda function. AWS provides an online code editor if your package size is less than 3MB. You can also upload a package in the form of a zip file directly to Lambda, or upload a zip file to S3 and then link that to your function. This zip format allows multiple files to be included in your bundle, including typical node_modules dependencies as well as executable files.
In fact, we are going to utilize a couple of executable files to help process our video. ffmpeg is a command line tool to convert multimedia files and ffprobe is a stream analyzer. You might have these tools installed locally, but we need to use static builds on Lambda. Download choices can be found here. I chose https://johnvansickle.com/ffmpeg/releases/ffmpeg-release-amd64-static.tar.xz. To unpack the compressed contents I used 7-Zip. Once unpacked, we want to isolate the files ffmpeg and ffprobe, go figure.
Note that user, group, and global all have read/execute permissions. I am on Windows and had a problem keeping these permissions. Lambda permissions are a little tricky, and global read is important for all files. On Windows the problem arose when I attempted the next step.
To get our executable files to Lambda we could put them into a directory with our index.js (the actual function script), then zip and upload that. There are a couple of downsides to this. On Windows, zipping the executable files in Windows Explorer stripped the permissions and caused errors when my function attempted to invoke the executables. Also, every time I made a change in my script I had to re-upload a 40MB file. This is horribly slow and consumes data transfer credit. Not ideal for development, and data transfer can cost 💲. The first part of the solution to this problem is to use a Lambda Layer.
Lambda Layers
A Lambda Layer can hold additional code in the form of libraries, custom runtimes, or other dependencies. Once we establish a Layer it can be used in multiple functions and can be edited and saved in multiple versions. Very flexible.
First, we need to place our ffmpeg and ffprobe files into a folder called nodejs - the name is important. I ended up using Windows Subsystem for Linux and the zip command to compress the nodejs folder. This was the easiest way I found to preserve the proper permissions.
From the parent directory of our nodejs folder, I run:
zip -r ./layer.zip nodejs
The -r is to recursively zip the contents of nodejs into a new file called layer.zip.
From the Lambda console, click the Layers tab and create a new layer. When you create your Layer, make sure to set Node.js 8.10 as a compatible runtime. Now you can go back to the function configuration and add our new Layer to createThumbnail.
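As a quick sanity check that the Layer is wired up and the permissions survived the zip step, you can verify the binaries are executable at runtime. This is just a debugging sketch using Node's fs module, not part of the final function:
const { accessSync, constants } = require('fs')

try {
  // X_OK checks execute permission; if the zip stripped permissions this throws EACCES
  accessSync('/opt/nodejs/ffmpeg', constants.X_OK)
  accessSync('/opt/nodejs/ffprobe', constants.X_OK)
  console.log('layer binaries are executable')
} catch (err) {
  console.log('layer binaries are missing or not executable', err)
}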
Finally, we get to the code. 😲
Disclaimer
If someone sees anything that could be better here please comment and let me know. It took me a while to cobble all these ideas together from various corners of the net and this is the first time I have used Lambda. What I'm saying is I am no expert, but finding an article like this when I started would have been helpful.
Code
Since we took the time to set up a Layer and our code has no other dependencies, we can type our code directly into the inline editor. I made my local copy in VSCode just to have my preferred editor settings, then copied and pasted.
First we need to require the stuff we need. The aws-sdk is available in the environment. child_process and fs are Node modules.
const AWS = require('aws-sdk')
const { spawnSync, spawn } = require('child_process')
const { createReadStream, createWriteStream } = require('fs')
spawn and spawnSync will allow us to run our executable files from within the Node environment as child processes.
The Lambda environment provides a /tmp directory to use as we wish. We will stream our image data from ffmpeg into /tmp and then read from there when we upload our thumbnails.
Now we can define some variables we will use later.
const s3 = new AWS.S3()
const ffprobePath = '/opt/nodejs/ffprobe'
const ffmpegPath = '/opt/nodejs/ffmpeg'
const allowedTypes = ['mov', 'mpg', 'mpeg', 'mp4', 'wmv', 'avi', 'webm']
const width = process.env.WIDTH
const height = process.env.HEIGHT
We create our S3 instance to interact with our bucket. Since we are using a Layer, the paths to our executable files are located in the /opt/nodejs directory. We define an array of allowed types. Settings for width and height can be set as environment variables from the Lambda console. I used 200x112.
Our actual function is written in standard Node format and must be called handler. A custom name can be set in the console.
module.exports.handler = async (event, context) => {
  const srcKey = decodeURIComponent(event.Records[0].s3.object.key).replace(/\+/g, ' ')
  const bucket = event.Records[0].s3.bucket.name
  const target = s3.getSignedUrl('getObject', { Bucket: bucket, Key: srcKey, Expires: 1000 })
  let fileType = srcKey.match(/\.\w+$/)
  if (!fileType) {
    throw new Error(`invalid file type found for key: ${srcKey}`)
  }
  fileType = fileType[0].slice(1)
  if (allowedTypes.indexOf(fileType) === -1) {
    throw new Error(`filetype: ${fileType} is not an allowed type`)
  }
  // to be continued
}
We will make our function async so we can compose our asynchronous code in a way that appears synchronous. First we parse the srcKey from the event passed in from Lambda. This is the filename of our video without the bucket URL. We also grab the bucket name. We can save our images to the same bucket as our video if we set our event listener up such that our function won't fire when they are uploaded. We then isolate the file extension and run some checks to make sure it is valid before continuing.
// inside handler function
const ffprobe = spawnSync(ffprobePath, [
  '-v',
  'error',
  '-show_entries',
  'format=duration',
  '-of',
  'default=nw=1:nk=1',
  target
])
const duration = Math.ceil(ffprobe.stdout.toString())
Here we use spawnSync to run ffprobe and get the duration of the video from stdout. We use toString because the output is a Buffer. Having the duration lets us capture our thumbnails in a targeted way throughout the video. I thought taking a thumbnail at 25%, 50% and 75% was a reasonable way to go about getting 3. Of course, with the following functions you can take as many thumbnails as needed. ffprobe can also report much more data than duration, but that is all we are concerned with here.
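If you wanted a variable number of thumbnails instead of a hard-coded three, a small helper along these lines could compute evenly spaced seek points from the duration. This is just a sketch; seekPoints is not part of the original code:
// Hypothetical helper: evenly spaced seek points for any number of thumbnails.
function seekPoints(duration, count) {
  return Array.from({ length: count }, (_, i) =>
    Math.round((duration * (i + 1)) / (count + 1))
  )
}

// seekPoints(100, 3) -> [25, 50, 75]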
function createImage(seek) {
  return new Promise((resolve, reject) => {
    let tmpFile = createWriteStream(`/tmp/screenshot.jpg`)
    const ffmpeg = spawn(ffmpegPath, [
      '-ss',                               // seek this many seconds into the video
      seek,
      '-i',                                // input: the signed S3 URL for the video
      target,
      '-vf',                               // video filter: pick a representative frame and scale it
      `thumbnail,scale=${width}:${height}`,
      '-qscale:v',                         // JPEG quality (2 is high quality)
      '2',
      '-frames:v',                         // output a single frame
      '1',
      '-f',                                // output container format
      'image2',
      '-c:v',                              // encode the frame as JPEG
      'mjpeg',
      'pipe:1'                             // write output to stdout so we can pipe it
    ])
    ffmpeg.stdout.pipe(tmpFile)
    ffmpeg.on('close', function(code) {
      tmpFile.end()
      resolve()
    })
    ffmpeg.on('error', function(err) {
      console.log(err)
      reject()
    })
  })
}
There is a lot going on here. The function takes a seek parameter. With this in place we can pass in Math.round(duration * .25), for example. The -ss flag followed by a time in seconds will seek the video to this spot before taking our thumbnail. We reference target, which is our video file. We specify the dimensions we want to use, the quality, frames and format, then finally we pipe the output into a writeStream that is writing to the /tmp directory. All of this is wrapped in a Promise that resolves when this child_process closes.
Understanding exactly what each ffmpeg input does is mad confusing, but the ffmpeg documentation is decent and there are a lot of forum posts out there as well. The bottom line is we have a reusable function that lets us take a thumbnail whenever we want. It also works well in our async/await flow.
function uploadToS3(x) {
  return new Promise((resolve, reject) => {
    let tmpFile = createReadStream(`/tmp/screenshot.jpg`)
    let dstKey = srcKey.replace(/\.\w+$/, `-${x}.jpg`).replace('/videos/', '/thumbnails/')
    var params = {
      Bucket: bucket,
      Key: dstKey,
      Body: tmpFile,
      ContentType: `image/jpg`
    }
    s3.upload(params, function(err, data) {
      if (err) {
        console.log(err)
        return reject(err)
      }
      console.log(`successful upload to ${bucket}/${dstKey}`)
      resolve()
    })
  })
}
Now we write a reusable function that will upload thumbnail images to an S3 bucket. Since I used prefix and suffix filters, and I am uploading video files to /user/videos, I can just replace videos with thumbnails and my function won't be triggered. You can put in any dstKey and bucket that you want. Again, we are wrapping our function in a Promise to help with our async flow.
So our final code might look something like this:
process.env.PATH = process.env.PATH + ':' + process.env['LAMBDA_TASK_ROOT']

const AWS = require('aws-sdk')
const { spawn, spawnSync } = require('child_process')
const { createReadStream, createWriteStream } = require('fs')

const s3 = new AWS.S3()
const ffprobePath = '/opt/nodejs/ffprobe'
const ffmpegPath = '/opt/nodejs/ffmpeg'
const allowedTypes = ['mov', 'mpg', 'mpeg', 'mp4', 'wmv', 'avi', 'webm']
const width = process.env.WIDTH
const height = process.env.HEIGHT
module.exports.handler = async (event, context) => {
  const srcKey = decodeURIComponent(event.Records[0].s3.object.key).replace(/\+/g, ' ')
  const bucket = event.Records[0].s3.bucket.name
  const target = s3.getSignedUrl('getObject', { Bucket: bucket, Key: srcKey, Expires: 1000 })
  let fileType = srcKey.match(/\.\w+$/)
  if (!fileType) {
    throw new Error(`invalid file type found for key: ${srcKey}`)
  }
  fileType = fileType[0].slice(1)
  if (allowedTypes.indexOf(fileType) === -1) {
    throw new Error(`filetype: ${fileType} is not an allowed type`)
  }

  function createImage(seek) {
    return new Promise((resolve, reject) => {
      let tmpFile = createWriteStream(`/tmp/screenshot.jpg`)
      const ffmpeg = spawn(ffmpegPath, [
        '-ss',
        seek,
        '-i',
        target,
        '-vf',
        `thumbnail,scale=${width}:${height}`,
        '-qscale:v',
        '2',
        '-frames:v',
        '1',
        '-f',
        'image2',
        '-c:v',
        'mjpeg',
        'pipe:1'
      ])
      ffmpeg.stdout.pipe(tmpFile)
      ffmpeg.on('close', function(code) {
        tmpFile.end()
        resolve()
      })
      ffmpeg.on('error', function(err) {
        console.log(err)
        reject()
      })
    })
  }

  function uploadToS3(x) {
    return new Promise((resolve, reject) => {
      let tmpFile = createReadStream(`/tmp/screenshot.jpg`)
      let dstKey = srcKey.replace(/\.\w+$/, `-${x}.jpg`).replace('/videos/', '/thumbnails/')
      var params = {
        Bucket: bucket,
        Key: dstKey,
        Body: tmpFile,
        ContentType: `image/jpg`
      }
      s3.upload(params, function(err, data) {
        if (err) {
          console.log(err)
          return reject(err)
        }
        console.log(`successful upload to ${bucket}/${dstKey}`)
        resolve()
      })
    })
  }

  const ffprobe = spawnSync(ffprobePath, [
    '-v',
    'error',
    '-show_entries',
    'format=duration',
    '-of',
    'default=nw=1:nk=1',
    target
  ])
  const duration = Math.ceil(ffprobe.stdout.toString())

  await createImage(duration * 0.25)
  await uploadToS3(1)
  await createImage(duration * 0.5)
  await uploadToS3(2)
  await createImage(duration * 0.75)
  await uploadToS3(3)

  return console.log(`processed ${bucket}/${srcKey} successfully`)
}
Tips
Lambda allows you to allocate a set amount of memory to your function. I am using 512MB and everything seems to be running well. My function is doing a couple more things than described here and uses around 400MB per invocation.
Utilize the CloudWatch logs and the monitoring graphs provided by AWS. My function averages about 12 seconds per invocation. Note that I have a ton of errors on this graph as I attempted to refactor things (all the green dots at the bottom).
- This version of the code has no contact with the application from which the original video is uploaded. One solution is to send a POST request from the Lambda function to your backend when the processing is complete. Another option I found is that adding a 20 second delay to my video upload gives ample time for the thumbnails to be created. When uploading the video we know where it's going, so we know the URL it will eventually have. Since we are building our thumbnail keys based on the original video key, we know what those URLs will be as well (see the first sketch after this list).
const videoUrl = 'https://s3-us-west-1.amazonaws.com/footube/user/videos/example.mp4'
const imageUrl = 'https://s3-us-west-1.amazonaws.com/footube/user/thumbnails/example-1.jpg'
Notice that I allow an extra 20 seconds for processing before I show the thumbnails.
- ffmpeg can do much more. It can convert formats. It can even generate a preview GIF like what you see on YouTube when you hover a video thumbnail (see the second sketch after this list).
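Here is a minimal sketch of the delay approach from the first tip. thumbnailUrls and showThumbnails are hypothetical client-side helpers, not code from FooTube:
// Hypothetical client-side helper: derive the predictable thumbnail URLs from the
// uploaded video's URL, mirroring the key rewrite done in the Lambda function.
function thumbnailUrls(videoUrl, count = 3) {
  return Array.from({ length: count }, (_, i) =>
    videoUrl.replace('/videos/', '/thumbnails/').replace(/\.\w+$/, `-${i + 1}.jpg`)
  )
}

const videoUrl = 'https://s3-us-west-1.amazonaws.com/footube/user/videos/example.mp4'

// Give the Lambda function roughly 20 seconds to finish before displaying the thumbnails.
setTimeout(() => {
  showThumbnails(thumbnailUrls(videoUrl)) // showThumbnails is a hypothetical UI function
}, 20000)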
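And here is a rough sketch of the preview GIF idea from the second tip, reusing the same spawn pattern as createImage. The filter chain is the standard palettegen/paletteuse approach (the same one that shows up in the comments below); the exact fps, scale, and clip length are just assumptions:
// Hedged sketch: grab ~3 seconds starting at `seek` and write an animated GIF to /tmp.
// ffmpegPath, seek, and target are the same values used in createImage.
const gif = spawn(ffmpegPath, [
  '-ss', seek,
  '-t', '3',
  '-i', target,
  '-vf', 'fps=10,scale=320:-1:flags=lanczos,split[a][b];[a]palettegen[p];[b][p]paletteuse',
  '-f', 'gif',
  'pipe:1'
])
gif.stdout.pipe(createWriteStream('/tmp/preview.gif'))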
Resources
Articles I found helpful.
- Creating video thumbnails with AWS Lambda in your s3 Bucket
- ffprobe tips
- NodeJS Runtime Environment with AWS Lambda Layers
- AWS Lambda Documentation
Conclusion
This article ended up way longer than I thought it would. I wanted to give a comprehensive view of how to set this thing up. If I left something out or got something wrong, please let me know.
Top comments (15)
Hey Ben,
It's nice to see someone refactored my code, especially with the newish layer system, but there's a couple of things you can do that will fit your use case better.
First, in my original I use jpg and -ss because I only needed one screenshot and it had to be jpg because of the file system we were using at the time.
You would be better to change the output codec to png - you can just remove the vcodec (there's also another line in there I left in which doesn't do anything for your use case).
For permissions you can move the ffmpeg/ffprobe to /tmp and then run chmod there, but you need to update your ffmpeg path.
Actually, in your use case it's better not to use any of this; it would be better to use something like AWS Elemental MediaConvert to generate thumbnails/gifs etc.
Though if you wanted to use this function, it's better to use SNS to push to your application when it's completed. So send to SNS after completing the upload - SNS pushes to your application/sends email w/e.
You could also be crazy and use AWS Rekognition to autogenerate tags for the videos.
With your current setup you may run into memory problems seeking 75% on larger files; what's your current largest test file? I assume in your current setup you also use ffprobe to output JSON to get the metadata, right?
It's good to see a refactored version of my code for a fake YouTube. Never thought of using it like that.
Regarding the memory problem:
I wanted to implement this for one of my use cases and was worried about the memory issues too.
But this article here explains how FFmpeg works with URLs and seeking. wistia.com/learn/marketing/faster-...
And it looks like we are good?
Thanks for your article, it was helpful and the best out there on this topic. I added a Resources section to give credit where credit's due.
I did some testing with PNG and I was surprised the PNGs were larger. I've been under the general impression that PNG is for graphics, JPG for photos.
JPG ~ 12KB
PNG ~ 61KB
Thanks for the tip on SNS, it looks very useful. I would have to do more testing and use some larger files; I think the largest video I tested with so far was only 15MB.
I figured trying to recreate what I could of YouTube would be a good way to learn stuff. So far it has. Obviously, their code is complex, but it's fun to come up with a function to determine what video to play next, for example.
Hey Derek:
For the output codec, why do you suggest using png?
Hey Ben
Nice guide - really useful for what I am coding now.
I pretty much copy-pasted the code after fitting the variables to my needs.
I have an issue though. When I test the function, it reports all well and green lights all the way, and it does generate the images, but the images are 0 bytes in size and empty.
Do you have any idea what could be wrong?
Cheers
hey, did you ever figure out the solution to this? it's uploading an empty file for me and not showing any error :/
Hi Ben,
it was a great article.
I reproduced it using Docker and S3. No Lambda.
I will use your post to write an article on how to do this in a highly available manner.
Depending on the number of images requested, I generate the requested screenshots.
If the duration of the video = 19 seconds and you request 3 images, that makes 19 / 3 ≈ 7 (rounded up):
image 1 = 7,
image 2 = 7 + 7 = 14,
image 3 = 7 + 7 + 7 = 21
Image 3 is at 21 seconds, which is longer than the duration of the video, so it will be ignored and therefore we only generate image 1 and image 2.
To avoid name conflicts between the images generated on disk, I generate my own file names.
const fileNameOnDisk = Math.floor(1000000000000 + Math.random() * 9000000000000);
When the image is generated and uploaded to AWS, I delete the files from disk asynchronously.
I added time markers to detect excessively long executions. We have three important stages:
1- Read the file with ffprobe to get the details of the file, mainly the duration of the video. (maximum 30 seconds)
2- Use ffmpeg to generate the screenshots (maximum 3 seconds per image)
3- Upload the files to s3 (maximum 5 seconds per image)
You should put markers on these three steps and therefore trigger the email log when it exceeds your maximum time.
This allows you to be proactive on issues.
Hey Ben,
Thanks for the post. I get 3 jpg files in my bucket's 'cache' folder which are black frames with the following message striped across the top.
The image “myapp.s3-us-west-1.amazonaws.com/c... cannot be displayed because it contains errors."
No thumbnail folder is created.
Any thoughts?
Thanks!
Thx!
I'm not sure. I guess the first thing I might try is logging the srcKey and dstKey variables, since the dstKey should be creating the thumbnails folder when an image is uploaded. Put the console.log statements in and then you can check them in the CloudWatch reports. I copied my exact code to a gist just in case there was some kind of typo or something.
Thanks for the response. I ended up using my web server & ffmpeg to create the thumbs and upload them. You have inspired me to learn more about Lambda, however! Thx
Hi Ben,
Thank you for the article.
I keep facing the same issue with the permission for the binary file.
Instead of a Linux distribution I used Windows zip to zip the files for creating the layer.
START RequestId: f1840de8-2d36-4453-98ff-d0d6b66b88db Version: $LATEST
2020-04-07T13:56:38.652Z f1840de8-2d36-4453-98ff-d0d6b66b88db INFO ffmpegPath /opt/nodejs/ffmpeg
2020-04-07T13:56:38.773Z f1840de8-2d36-4453-98ff-d0d6b66b88db INFO Error: spawn /opt/nodejs/ffmpeg EACCES
    at Process.ChildProcess._handle.onexit (internal/child_process.js:267:19)
    at onErrorNT (internal/child_process.js:469:16)
    at processTicksAndRejections (internal/process/task_queues.js:84:21) {
  errno: 'EACCES',
  code: 'EACCES',
  syscall: 'spawn /opt/nodejs/ffmpeg',
  path: '/opt/nodejs/ffmpeg',
  spawnargs: [
    '-y',
    '-ss',
    5,
    '-t',
    3,
    '-i',
    'thumbnail-test-zuku.s3.amazonaws.c...
    'fps=10,scale=320:-1:flags=lanczos,split [o1] [o2];[o1] palettegen [p]; [o2] fifo [o3];[o3] [p] paletteuse',
    '-f',
    'gif',
    'pipe:1'
  ]
}
I had similar problems using the Windows zip. I ended up using the Windows Subsystem For Linux environment to zip any files. There is a command line zip utility. This resolved the errors.
I was also having the problem of the images being written coming out at 0 bytes. Also, regarding creating the layer, you don't need to package up ffmpeg yourself as you can just deploy a pre-made layer from the AWS SAR.
My solution ended up being quite different from that outlined here but your article did get me started; for others still having issues I have written a tutorial on my own blog which documents how I got it working.
Can you give me your AWS Lambda ffmpeg ARN number??