Introduction
I have been playing around with a YouTube clone I call FooTube. I had set up video uploads to be sent from the browser straight to an AWS S3 bucket, so the video file never touched my Node backend. This made server-side video processing a non-starter, which put me in a dilemma: I wanted to generate 3 thumbnails for each video upload like the real YouTube does. I started thinking about creating a video player off-screen and using canvas to stream things around. While that might be possible, it didn't sound like fun, and that's not what I ended up doing.
The research began.
I discovered that YouTube uses deep neural networks to pick out thumbnails that display a subject, a face, or something else that draws attention. They also capture a thumbnail for every second of video and use an algorithm to rank each one. This interesting article written by the YouTube Creator team in 2015 explains further. At this point I decided that just getting 3 thumbnail images would be enough of a challenge for me - since I still had no clue what I was doing. 🤦♂️
Companion Video
Disclaimer
Please keep in mind this code is NOT meant to be a production-ready solution; it is more of an exploration or proof of concept. There are a lot of moving parts, and while I have managed to get this working in my local environment, I simply cannot guarantee it will work anywhere else! Sorry.
Lambda Functions
The first thing I found out was that I could use AWS Lambda to sort of outsource computations that might normally take place on a server. As a bonus, since I was already using S3, I could attach what amounts to an event listener to trigger my Lambda function when I uploaded a video file.
Creating a new Lambda function is straightforward. When prompted, choose to create a function from scratch and come up with a decent name; createThumbnail worked for me. Also, select the Node.js 8.10 runtime.
IAM Role Permissions
I had to create a new IAM role to execute this function. This can be done through a simple workflow in the IAM console. Name the role whatever you want, but give it the AWSLambdaExecute permission. This will allow for PUT and GET access to S3 and full access to CloudWatch Logs. These are all the permissions we need to execute and monitor our createThumbnail Lambda function. I had to add the ARN for this role to my bucket policy:
{
    "Sid": "Stmt**************",
    "Effect": "Allow",
    "Principal": {
        "AWS": [
            "arn:aws:iam::**********:role/LambdaRole"
        ]
    },
    "Action": [
        "s3:GetObject",
        "s3:PutObject"
    ],
    "Resource": "arn:aws:s3:::bucket/*"
}
Triggers
Next we need to configure the trigger for our function. We want to listen to the bucket we are uploading videos to and watch for the PUT method, since that is the method used to send the video. Optionally, you can set a prefix and/or suffix to narrow down the trigger. My function saves the thumbnails to this same bucket. In this case you might use a suffix of mp4 or webm (video formats). My videos were going to the user folder, so I set a prefix of user/ since this would be at the beginning of any key.
Once your function is created and its trigger configured, these settings will show up in the S3 bucket referenced by said trigger. In fact, they can be set from either the S3 or Lambda console. Click the Properties tab then the Events box in the S3 console to view events associated with a bucket.
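For reference, here is a rough sketch of what that same trigger looks like if you set it up with the AWS SDK instead of clicking through the console. The bucket name and function ARN below are placeholders, and note that this call replaces the bucket's existing notification configuration:
const AWS = require('aws-sdk')
const s3 = new AWS.S3()

// Assumed values - swap in your own bucket name and Lambda function ARN.
s3.putBucketNotificationConfiguration({
  Bucket: 'footube',
  NotificationConfiguration: {
    LambdaFunctionConfigurations: [
      {
        LambdaFunctionArn: 'arn:aws:lambda:us-west-1:**********:function:createThumbnail',
        Events: ['s3:ObjectCreated:Put'],
        Filter: {
          Key: {
            FilterRules: [
              { Name: 'prefix', Value: 'user/' },
              { Name: 'suffix', Value: 'mp4' }
            ]
          }
        }
      }
    ]
  }
}, (err) => {
  if (err) console.log(err)
})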
Getting Code to Lambda
There are a few ways to get code into our Lambda function. AWS provides an online code editor if your package size is less than 3MB. You can also upload a package in the form of a zip file directly to Lambda, or upload a zip file to S3 and then link that to your function. This zip format allows multiple files to be included in your bundle, including typical node_modules dependencies as well as executable files.
In fact, we are going to utilize a couple of executable files to help process our video. ffmpeg is a command line tool to convert multimedia files and ffprobe is a stream analyzer. You might have these tools installed locally, but we need to use static builds on Lambda. Download choices can be found here. I chose https://johnvansickle.com/ffmpeg/releases/ffmpeg-release-amd64-static.tar.xz. To unpack the compressed contents I used 7-Zip. Once unpacked, we want to isolate the files ffmpeg and ffprobe, go figure.
Note that user, group, and global all have read/execute permissions. I am on Windows and had a problem keeping these permissions. Lambda permissions are a little tricky, and global read is important for all files. On Windows the problem arose when I attempted the next step.
To get our executable files to Lambda we could put them into a directory with our index.js (the actual function script), then zip and upload that. There are a couple of downsides to this. On Windows, zipping the executable files in Windows Explorer stripped the permissions and caused errors when my function attempted to invoke the executables. Also, every time I made a change in my script I had to re-upload a 40MB file. This is horribly slow and consumes data transfer credit. Not ideal for development, and data transfer can cost 💲. The first part of the solution to this problem is to use a Lambda Layer.
Lambda Layers
A Lambda Layer can hold additional code in the form of libraries, custom runtimes, or other dependencies. Once we establish a Layer it can be used in multiple functions and can be edited and saved in multiple versions. Very flexible.
First, we need to place our ffmpeg and ffprobe files into a folder called nodejs - the name is important. I ended up using Windows Subsystem for Linux and the zip command to compress the nodejs folder. This was the easiest way I found to preserve the proper permissions.
From the parent directory of our nodejs folder, I run:
zip -r ./layer.zip nodejs
The -r is to recursively zip the contents of nodejs into a new file called layer.zip.
From the Lambda console, click the Layers tab and create a new layer. When you create your Layer, make sure to set Node.js 8.10 as a compatible runtime. Now you can go back to the function configuration and add our new Layer to createThumbnail.
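As a quick sanity check that the Layer is wired up and the permissions survived the zip step, you can verify the binaries are executable at runtime. This is just a debugging sketch using Node's fs module, not part of the final function:
const { accessSync, constants } = require('fs')

try {
  // X_OK checks execute permission; if the zip stripped permissions this throws EACCES
  accessSync('/opt/nodejs/ffmpeg', constants.X_OK)
  accessSync('/opt/nodejs/ffprobe', constants.X_OK)
  console.log('layer binaries are executable')
} catch (err) {
  console.log('layer binaries are missing or not executable', err)
}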
Finally, we get to the code. 😲
Disclaimer
If someone sees anything that could be better here please comment and let me know. It took me a while to cobble all these ideas together from various corners of the net and this is the first time I have used Lambda. What I'm saying is I am no expert, but finding an article like this when I started would have been helpful.
Code
Since we took the time to set up a Layer and our code has no other dependencies, we can type our code directly into the inline editor. I made my local copy in VSCode just to have my preferred editor settings, then copied and pasted.
First we need to require the stuff we need. The aws-sdk is available in the environment. child_process and fs are Node modules.
const AWS = require('aws-sdk')
const { spawnSync, spawn } = require('child_process')
const { createReadStream, createWriteStream } = require('fs')
spawn and spawnSync will allow us to run our executable files from within the Node environment as child processes.
The Lambda environment provides a /tmp directory to use as we wish. We will stream our image data from ffmpeg into /tmp and then read from there when we upload our thumbnails.
Now we can define some variables we will use later.
const s3 = new AWS.S3()
const ffprobePath = '/opt/nodejs/ffprobe'
const ffmpegPath = '/opt/nodejs/ffmpeg'
const allowedTypes = ['mov', 'mpg', 'mpeg', 'mp4', 'wmv', 'avi', 'webm']
const width = process.env.WIDTH
const height = process.env.HEIGHT
We create our S3 instance to interact with our bucket. Since we are using a Layer, the paths to our executable files are located in the /opt/nodejs directory. We define an array of allowed types. Settings for width and height can be set as environment variables from the Lambda console. I used 200x112.
Our actual function is written in standard Node format and must be called handler. A custom name can be set in the console.
module.exports.handler = async (event, context) => {
  const srcKey = decodeURIComponent(event.Records[0].s3.object.key).replace(/\+/g, ' ')
  const bucket = event.Records[0].s3.bucket.name
  const target = s3.getSignedUrl('getObject', { Bucket: bucket, Key: srcKey, Expires: 1000 })
  let fileType = srcKey.match(/\.\w+$/)
  if (!fileType) {
    throw new Error(`invalid file type found for key: ${srcKey}`)
  }
  fileType = fileType[0].slice(1)
  if (allowedTypes.indexOf(fileType) === -1) {
    throw new Error(`filetype: ${fileType} is not an allowed type`)
  }
  // to be continued
}
We will make our function async so we can compose our asynchronous code in a way that appears synchronous. First we parse the srcKey from the event passed in from Lambda. This is the filename of our video without the bucket URL. We also grab the bucket name. We can save our images to the same bucket as our video if we set our event listener up such that our function won't fire when they are uploaded. We then isolate the file extension and run some checks to make sure it is valid before continuing.
// inside handler function
const ffprobe = spawnSync(ffprobePath, [
  '-v',
  'error',
  '-show_entries',
  'format=duration',
  '-of',
  'default=nw=1:nk=1',
  target
])
const duration = Math.ceil(ffprobe.stdout.toString())
Here we use spawnSync to run ffprobe and get the duration of the video from stdout. We use toString because the output is a Buffer. Having the duration lets us capture our thumbnails in a targeted way throughout the video. I thought taking a thumbnail at 25%, 50% and 75% was a reasonable way to go about getting 3. Of course, with the following functions you can take as many thumbnails as needed. ffprobe can also report much more data than duration, but that is all we are concerned with here.
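If you wanted a variable number of thumbnails instead of a hard-coded three, a small helper along these lines could compute evenly spaced seek points from the duration. This is just a sketch; seekPoints is not part of the original code:
// Hypothetical helper: evenly spaced seek points for any number of thumbnails.
function seekPoints(duration, count) {
  return Array.from({ length: count }, (_, i) =>
    Math.round((duration * (i + 1)) / (count + 1))
  )
}

// seekPoints(100, 3) -> [25, 50, 75]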
function createImage(seek) {
  return new Promise((resolve, reject) => {
    let tmpFile = createWriteStream(`/tmp/screenshot.jpg`)
    const ffmpeg = spawn(ffmpegPath, [
      '-ss',                               // seek this many seconds into the video
      seek,
      '-i',                                // input: the signed S3 URL for the video
      target,
      '-vf',                               // video filter: pick a representative frame and scale it
      `thumbnail,scale=${width}:${height}`,
      '-qscale:v',                         // JPEG quality (2 is high quality)
      '2',
      '-frames:v',                         // output a single frame
      '1',
      '-f',                                // output container format
      'image2',
      '-c:v',                              // encode the frame as JPEG
      'mjpeg',
      'pipe:1'                             // write output to stdout so we can pipe it
    ])
    ffmpeg.stdout.pipe(tmpFile)
    ffmpeg.on('close', function(code) {
      tmpFile.end()
      resolve()
    })
    ffmpeg.on('error', function(err) {
      console.log(err)
      reject()
    })
  })
}
There is a lot going on here. The function takes a seek parameter. With this in place we can pass in Math.round(duration * .25), for example. The -ss flag followed by a time in seconds will seek the video to this spot before taking our thumbnail. We reference target, which is our video file. We specify the dimensions we want to use, the quality, frames and format, then finally we pipe the output into a writeStream that is writing to the /tmp directory. All of this is wrapped in a Promise that resolves when this child_process closes.
Understanding exactly what each ffmpeg input does is mad confusing, but the ffmpeg documentation is decent and there are a lot of forum posts out there as well. The bottom line is we have a reusable function that lets us take a thumbnail whenever we want. It also works well in our async/await flow.
function uploadToS3(x) {
  return new Promise((resolve, reject) => {
    let tmpFile = createReadStream(`/tmp/screenshot.jpg`)
    let dstKey = srcKey.replace(/\.\w+$/, `-${x}.jpg`).replace('/videos/', '/thumbnails/')
    var params = {
      Bucket: bucket,
      Key: dstKey,
      Body: tmpFile,
      ContentType: `image/jpg`
    }
    s3.upload(params, function(err, data) {
      if (err) {
        console.log(err)
        return reject(err)
      }
      console.log(`successful upload to ${bucket}/${dstKey}`)
      resolve()
    })
  })
}
Now we write a reusable function that will upload thumbnail images to an S3 bucket. Since I used prefix and suffix filters, and I am uploading video files to /user/videos, I can just replace videos with thumbnails and my function won't be triggered. You can put in any dstKey and bucket that you want. Again, we are wrapping our function in a Promise to help with our async flow.
So our final code might look something like this:
process.env.PATH = process.env.PATH + ':' + process.env['LAMBDA_TASK_ROOT']

const AWS = require('aws-sdk')
const { spawn, spawnSync } = require('child_process')
const { createReadStream, createWriteStream } = require('fs')

const s3 = new AWS.S3()
const ffprobePath = '/opt/nodejs/ffprobe'
const ffmpegPath = '/opt/nodejs/ffmpeg'
const allowedTypes = ['mov', 'mpg', 'mpeg', 'mp4', 'wmv', 'avi', 'webm']
const width = process.env.WIDTH
const height = process.env.HEIGHT
module.exports.handler = async (event, context) => {
  const srcKey = decodeURIComponent(event.Records[0].s3.object.key).replace(/\+/g, ' ')
  const bucket = event.Records[0].s3.bucket.name
  const target = s3.getSignedUrl('getObject', { Bucket: bucket, Key: srcKey, Expires: 1000 })
  let fileType = srcKey.match(/\.\w+$/)
  if (!fileType) {
    throw new Error(`invalid file type found for key: ${srcKey}`)
  }
  fileType = fileType[0].slice(1)
  if (allowedTypes.indexOf(fileType) === -1) {
    throw new Error(`filetype: ${fileType} is not an allowed type`)
  }

  function createImage(seek) {
    return new Promise((resolve, reject) => {
      let tmpFile = createWriteStream(`/tmp/screenshot.jpg`)
      const ffmpeg = spawn(ffmpegPath, [
        '-ss',
        seek,
        '-i',
        target,
        '-vf',
        `thumbnail,scale=${width}:${height}`,
        '-qscale:v',
        '2',
        '-frames:v',
        '1',
        '-f',
        'image2',
        '-c:v',
        'mjpeg',
        'pipe:1'
      ])
      ffmpeg.stdout.pipe(tmpFile)
      ffmpeg.on('close', function(code) {
        tmpFile.end()
        resolve()
      })
      ffmpeg.on('error', function(err) {
        console.log(err)
        reject()
      })
    })
  }

  function uploadToS3(x) {
    return new Promise((resolve, reject) => {
      let tmpFile = createReadStream(`/tmp/screenshot.jpg`)
      let dstKey = srcKey.replace(/\.\w+$/, `-${x}.jpg`).replace('/videos/', '/thumbnails/')
      var params = {
        Bucket: bucket,
        Key: dstKey,
        Body: tmpFile,
        ContentType: `image/jpg`
      }
      s3.upload(params, function(err, data) {
        if (err) {
          console.log(err)
          return reject(err)
        }
        console.log(`successful upload to ${bucket}/${dstKey}`)
        resolve()
      })
    })
  }

  const ffprobe = spawnSync(ffprobePath, [
    '-v',
    'error',
    '-show_entries',
    'format=duration',
    '-of',
    'default=nw=1:nk=1',
    target
  ])
  const duration = Math.ceil(ffprobe.stdout.toString())

  await createImage(duration * 0.25)
  await uploadToS3(1)
  await createImage(duration * 0.5)
  await uploadToS3(2)
  await createImage(duration * 0.75)
  await uploadToS3(3)

  return console.log(`processed ${bucket}/${srcKey} successfully`)
}
Tips
Lambda allows you to allocate a set amount of memory to your function. I am using 512MB and everything seems to be running well. My function is doing a couple more things than described here and uses around 400MB per invocation.
Utilize the CloudWatch logs and the monitoring graphs provided by AWS. My function averages about 12 seconds per invocation. Note that I have a ton of errors on this graph as I attempted to refactor things (all the green dots at the bottom).
- This version of the code has no contact with the application from which the original video is uploaded. One solution is to send a POST request from the Lambda function to your backend when the processing is complete. Another option I found is that adding a 20 second delay to my video upload gives ample time for the thumbnails to be created. When uploading the video we know where it's going, so we know the URL it will eventually have. Since we are building our thumbnail keys based on the original video key, we know what those URLs will be as well (see the first sketch after this list).
const videoUrl = 'https://s3-us-west-1.amazonaws.com/footube/user/videos/example.mp4'
const imageUrl = 'https://s3-us-west-1.amazonaws.com/footube/user/thumbnails/example-1.jpg'
Notice that I allow an extra 20 seconds for processing before I show the thumbnails.
- ffmpeg can do much more. It can convert formats. It can even generate a preview GIF like what you see on YouTube when you hover a video thumbnail (see the second sketch after this list).
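Here is a minimal sketch of the delay approach from the first tip. thumbnailUrls and showThumbnails are hypothetical client-side helpers, not code from FooTube:
// Hypothetical client-side helper: derive the predictable thumbnail URLs from the
// uploaded video's URL, mirroring the key rewrite done in the Lambda function.
function thumbnailUrls(videoUrl, count = 3) {
  return Array.from({ length: count }, (_, i) =>
    videoUrl.replace('/videos/', '/thumbnails/').replace(/\.\w+$/, `-${i + 1}.jpg`)
  )
}

const videoUrl = 'https://s3-us-west-1.amazonaws.com/footube/user/videos/example.mp4'

// Give the Lambda function roughly 20 seconds to finish before displaying the thumbnails.
setTimeout(() => {
  showThumbnails(thumbnailUrls(videoUrl)) // showThumbnails is a hypothetical UI function
}, 20000)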
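And here is a rough sketch of the preview GIF idea from the second tip, reusing the same spawn pattern as createImage. The filter chain is the standard palettegen/paletteuse approach (the same one that shows up in the comments below); the exact fps, scale, and clip length are just assumptions:
// Hedged sketch: grab ~3 seconds starting at `seek` and write an animated GIF to /tmp.
// ffmpegPath, seek, and target are the same values used in createImage.
const gif = spawn(ffmpegPath, [
  '-ss', seek,
  '-t', '3',
  '-i', target,
  '-vf', 'fps=10,scale=320:-1:flags=lanczos,split[a][b];[a]palettegen[p];[b][p]paletteuse',
  '-f', 'gif',
  'pipe:1'
])
gif.stdout.pipe(createWriteStream('/tmp/preview.gif'))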
Resources
Articles I found helpful.
- Creating video thumbnails with AWS Lambda in your s3 Bucket
- ffprobe tips
- NodeJS Runtime Environment with AWS Lambda Layers
- AWS Lambda Documentation
Conclusion
This article ended up way longer than I thought it would. I wanted to give a comprehensive view of how to set this thing up. If I left something out or got something wrong, please let me know.
Top comments (15)
Hey Ben,
It's nice to see someone refactored my code, especially with the newish layer system, but there's a couple of things you can do that will fit your use case better.
First, in my original I use jpg and -ss because I only needed one screenshot and it had to be jpg because of the file system we were using at the time.
You would be better to change the output codec to png - you can just remove the vcodec (there's also another line in there I left in which doesn't do anything for your use case).
For permissions you can move the ffmpeg/ffprobe to /tmp and then run chmod there, but you need to update your ffmpeg path.
Actually, in your use case it's better not to use any of this; it would be better to use something like AWS Elemental MediaConvert to generate thumbnails/gifs etc.
Though if you wanted to use this function, it's better to use SNS to push to your application when it's completed. So send to SNS after completing the upload - SNS pushes to your application/sends email w/e.
You could also be crazy and use AWS Rekognition to autogenerate tags for the videos.
With your current setup you may run into memory problems seeking 75% on larger files; what's your current largest test file? I assume in your current setup you also use ffprobe to output JSON to get the metadata, right?
It's good to see a refactored version of my code for a fake YouTube. Never thought of using it like that.
Regarding the memory problem:
I wanted to implement this for one of my use cases and was worried about the memory issues too.
But this article here explains how FFmpeg works with URLs and seeking. wistia.com/learn/marketing/faster-...
And it looks like we are good?
Thanks for your article, it was helpful and the best out there on this topic. I added a Resources section to give credit where credit's due.
I did some testing with PNG and I was surprised the PNGs were larger. I've been under the general impression that PNG is for graphics, JPG for photos.
JPG ~ 12KB
PNG ~ 61KB
Thanks for the tip on SNS, it looks very useful. I would have to do more testing and use some larger files; I think the largest video I tested with so far was only 15MB.
I figured trying to recreate what I could of YouTube would be a good way to learn stuff. So far it has. Obviously, their code is complex, but it's fun to come up with a function to determine what video to play next, for example.
Hey Derek:
For the output codec, why do you suggest using png?
Hey Ben
Nice guide - really useful for what I am coding now.
I pretty much copy-pasted the code after fitting the variables to my needs.
I have an issue though. When I test the function, it reports all well and green lights all the way, and it does generate the images, but the images are 0 bytes in size and empty.
Do you have any idea what could be wrong?
Cheers
hey, did you ever figure out the solution to this? it's uploading an empty file for me and not showing any error :/
Hi Ben,
it was a great article.
I reproduced it using Docker and S3. No Lambda.
I will use your post to write an article on how to do this in a highly available manner.
Depending on the number of images requested, I generate the requested screenshots.
If the duration of the video = 19 seconds and you request 3 images, that makes 19 / 3 ≈ 7 (rounded up):
image 1 = 7,
image 2 = 7 + 7 = 14,
image 3 = 7 + 7 + 7 = 21
Image 3 is at 21 seconds, which is longer than the duration of the video, so it will be ignored and therefore we only generate image 1 and image 2.
To avoid name conflicts between the images generated on disk, I generate my own file names.
const fileNameOnDisk = Math.floor(1000000000000 + Math.random() * 9000000000000);
When the image is generated and uploaded to AWS, I delete the files from disk asynchronously.
I added time markers to detect excessively long executions. We have three important stages:
1- Read the file with ffprobe to get the details of the file, mainly the duration of the video. (maximum 30 seconds)
2- Use ffmpeg to generate the screenshots (maximum 3 seconds per image)
3- Upload the files to s3 (maximum 5 seconds per image)
You should put markers on these three steps and therefore trigger the email log when it exceeds your maximum time.
This allows you to be proactive on issues.
Hey Ben,
Thanks for the post. I get 3 jpg files in my bucket's 'cache' folder which are black frames with the following message striped across the top.
The image “myapp.s3-us-west-1.amazonaws.com/c... cannot be displayed because it contains errors."
No thumbnail folder is created.
Any thoughts?
Thanks!
Thx!
I'm not sure. I guess the first thing I might try is logging the srcKey and dstKey variables, since the dstKey should be creating the thumbnails folder when an image is uploaded. Put the console.log statements in and then you can check them in the CloudWatch reports. I copied my exact code to a gist just in case there was some kind of typo or something.
Thanks for the response. I ended up using my web server & ffmpeg to create the thumbs and upload them. You have inspired me to learn more about Lambda, however! Thx
Hi Ben,
Thank you for the article.
I keep facing the same issue with the permission for the binary file.
Instead of a Linux distribution I used Windows zip to zip the files for creating the layer.
START RequestId: f1840de8-2d36-4453-98ff-d0d6b66b88db Version: $LATEST
2020-04-07T13:56:38.652Z f1840de8-2d36-4453-98ff-d0d6b66b88db INFO ffmpegPath /opt/nodejs/ffmpeg
2020-04-07T13:56:38.773Z f1840de8-2d36-4453-98ff-d0d6b66b88db INFO Error: spawn /opt/nodejs/ffmpeg EACCES
    at Process.ChildProcess._handle.onexit (internal/child_process.js:267:19)
    at onErrorNT (internal/child_process.js:469:16)
    at processTicksAndRejections (internal/process/task_queues.js:84:21) {
  errno: 'EACCES',
  code: 'EACCES',
  syscall: 'spawn /opt/nodejs/ffmpeg',
  path: '/opt/nodejs/ffmpeg',
  spawnargs: [
    '-y',
    '-ss',
    5,
    '-t',
    3,
    '-i',
    'thumbnail-test-zuku.s3.amazonaws.c...
    'fps=10,scale=320:-1:flags=lanczos,split [o1] [o2];[o1] palettegen [p]; [o2] fifo [o3];[o3] [p] paletteuse',
    '-f',
    'gif',
    'pipe:1'
  ]
}
I had similar problems using the Windows zip. I ended up using the Windows Subsystem For Linux environment to zip any files. There is a command line zip utility. This resolved the errors.
I was also having the problem of the images being written coming out at 0 bytes. Also, regarding creating the layer, you don't need to package up ffmpeg yourself as you can just deploy a pre-made layer from the AWS SAR.
My solution ended up being quite different from that outlined here but your article did get me started; for others still having issues I have written a tutorial on my own blog which documents how I got it working.
Can you give me your AWS Lambda ffmpeg ARN number??