
Jimmy Guerrero for Voxel51

Originally published at voxel51.com

Generating Videos from Images with Stable Video Diffusion and FiftyOne

Author: Dan Gural (Machine Learning Engineer at Voxel51)

Bring images to life in seconds

Stability AI just released Stable Video Diffusion, opening up a whole new way to experience GenAI! By passing in an image, you can now breathe life into it, generating a short video clip. Now you can use Stable Video Diffusion on your FiftyOne datasets with my new image-to-video plugin!


Installation

To get started, you will need two things: FiftyOne installed and a Replicate account to run the model through its API. Once you have an account set up and FiftyOne installed, run the following in your command terminal:

fiftyone plugins download https://github.com/danielgural/img_to_video_plugin

This will install the plugin into your FiftyOne App. Afterwards, set your Replicate API token as an environment variable as well:

export REPLICATE_API_TOKEN=<your token>
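If you prefer to stay inside Python (for example, in a notebook), you can set the same variable before launching the App. This is a minimal sketch; the token value below is a placeholder for your own key:

```python
import os

# Set the Replicate token for the current Python process so the
# plugin can authenticate API calls (placeholder value shown).
os.environ["REPLICATE_API_TOKEN"] = "r8_your_token_here"
```

Note that this only affects the current process, so it must run before the App is launched.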

Execution

From there, load up any FiftyOne dataset (quickstart, for example) and select the sample or samples that you want to turn into videos using Stable Video Diffusion. We will be using this Replicate endpoint for Stable Video Diffusion.

import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart")

session = fo.launch_app(dataset)

While in the App, press the backtick (`) key on your keyboard to open the operator menu, then type in the `img2video` operator. From there you will get several options to configure the model run. The different parameters can be found here.


After executing, the results will be saved in a new dataset called `image2video`. The mp4 files will be saved alongside the original data. Explore different inputs and play with different images. You never know how it might turn out! Here are some of my results:
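Since the clip is written next to the source image, you can predict where to look for it. As a rough sketch (the exact naming convention here is my assumption, not documented plugin behavior), the output path might be derived like this:

```python
from pathlib import Path

def video_output_path(image_path: str) -> str:
    """Guess the generated clip's location: an .mp4 with the same
    stem, in the same directory as the source image (assumed naming)."""
    return str(Path(image_path).with_suffix(".mp4"))

print(video_output_path("/data/images/cat.jpg"))
```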

(Three input/output example pairs are shown as images in the original post.)

Conclusion

No matter what image you put in, you should be set for a good laugh. I found that using the large 24-frame model typically leads to the output going off the rails, as seen in the last example. I have to say that playing with this plugin is addicting, and it is so much fun to plug more and more pictures in. I can't wait to see what cool ideas and works come from Stable Video Diffusion!

As always, be sure to check out all of our FiftyOne Plugins for more GenAI plugins for computer vision, plus much, much more! If you are interested in building your own plugins, hop into the community Slack to join other developers and get access to tons of resources!
