Jimmy Guerrero for Voxel51

Computer Vision Meetup: Towards Resource Efficient Robust Text-to-Image Generative Models

Text-to-image (T2I) diffusion models (such as Stable Diffusion XL and DALL-E 3) achieve state-of-the-art (SOTA) performance on various compositional T2I benchmarks, at the cost of significant computational resources. For instance, the unCLIP (i.e., DALL-E 2) stack comprises a T2I prior and a diffusion image decoder. The T2I prior model alone adds a billion parameters, increasing both the compute and the amount of high-quality data required. To address these issues, Maitreya proposes ECLIPSE, a novel contrastive learning method that is both parameter- and data-efficient.

Speaker: Maitreya Patel is a PhD student at Arizona State University focusing on model performance and efficiency. Whether in model training or inference, Maitreya works on optimizations that make AI more accessible and powerful.

Not a Meetup member? Sign up to attend the next event:

https://voxel51.com/computer-vision-events/

Recorded on April 18, 2024, at the AI, Machine Learning and Data Science Meetup.
