Jimmy Guerrero for Voxel51

Posted on Jun 27 • Originally published at voxel51.com

Recapping the AI, Machine Learning and Computer Vision Meetup - June 27, 2024

#computervision #machinelearning #datascience #ai

We just wrapped up the June '24 AI, Machine Learning and Data Science Meetup, and if you missed it or want to revisit it, here's a recap!
In this blog post you'll find the playback recordings, highlights from the presentations and Q&A, as well as the upcoming Meetup schedule so that you can join us at a future event.

First, Thanks for Voting for Your Favorite Charity!

In lieu of swag, we gave Meetup attendees the opportunity to help guide a $200 donation to charitable causes. The charity that received the highest number of votes this month was Heart to Heart International, an organization that ensures quality care is provided equitably in medically under-resourced communities and in disaster situations. We are sending this event's charitable donation of $200 to Heart to Heart International on behalf of the Meetup members!

Missed the Meetup? No problem. Here are playbacks and talk abstracts from the event.

Leveraging Pre-trained Text2Image Diffusion Models for Zero-Shot Video Editing

Text-to-image diffusion models demonstrate remarkable editing capabilities in the image domain, especially after Latent Diffusion Models made diffusion models more scalable. By contrast, video editing still has much room for improvement, particularly given the relative scarcity of video datasets compared to image datasets. We will therefore discuss whether pre-trained text-to-image diffusion models can be used for zero-shot video editing without any fine-tuning stage. Finally, we will explore possible future work and interesting research ideas in the field.

Speaker: Bariscan Kurtkaya is a KUIS AI Fellow and a graduate student in the Department of Computer Science at Koc University. His research interests lie in exploring and leveraging the capabilities of generative models in the realm of 2D and 3D data, encompassing scientific observations from space telescopes.

Resource Links

Q&A

Are Lineart and Softedge edge filters?

Improved Visual Grounding through Self-Consistent Explanations

Vision-and-language models that are trained to associate images with text have been shown to be effective for many tasks, including object detection and image segmentation. In this talk, we will discuss how to enhance vision-and-language models’ ability to localize objects in images by fine-tuning them for self-consistent visual explanations. We propose a method that augments text-image datasets with paraphrases using a large language model and employs SelfEQ, a weakly-supervised strategy that promotes self-consistency in visual explanation maps. This approach broadens the model’s working vocabulary and improves object localization accuracy, as demonstrated by performance gains on competitive benchmarks.
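To make the idea of self-consistency concrete, here is a minimal sketch of one way to penalize disagreement between two explanation maps: one produced for a caption and one for its paraphrase. This is a simplified stand-in, not the SelfEQ objective from the paper. The function name `map_consistency_loss`, the toy 2x2 heatmaps, and the choice of mean squared difference are all assumptions made for illustration.

```python
def map_consistency_loss(map_a, map_b):
    """Mean squared difference between two same-shaped 2D heatmaps.

    Hypothetical helper for illustration only: a low value means the two
    explanation maps highlight roughly the same image regions.
    """
    same_rows = len(map_a) == len(map_b)
    if not same_rows or any(len(ra) != len(rb) for ra, rb in zip(map_a, map_b)):
        raise ValueError("heatmaps must have the same shape")

    total = 0.0
    count = 0
    for row_a, row_b in zip(map_a, map_b):
        for a, b in zip(row_a, row_b):
            total += (a - b) ** 2
            count += 1

    if count == 0:
        raise ValueError("heatmaps must not be empty")
    return total / count


# Toy explanation maps (values in [0, 1]) for a caption and its paraphrase.
caption_map = [[0.9, 0.1], [0.2, 0.0]]
paraphrase_map = [[0.7, 0.1], [0.2, 0.2]]

loss = map_consistency_loss(caption_map, paraphrase_map)
print(f"consistency loss: {loss:.4f}")
```

In a real training setup the maps would come from the model itself and the penalty would be minimized during fine-tuning, so that a phrase and its paraphrase are pushed toward the same image region.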

Speakers: Dr. Paola Cascante-Bonilla received her Ph.D. in Computer Science at Rice University in 2024, advised by Professor Vicente Ordóñez Román, working on Computer Vision, Natural Language Processing, and Machine Learning. She received a Master of Computer Science at the University of Virginia and a B.S. in Engineering at the Tecnológico de Costa Rica. Paola will join Stony Brook University (SUNY) as an Assistant Professor in the Department of Computer Science.

Ruozhen (Catherine) He is a first-year Computer Science PhD student at Rice University, advised by Prof. Vicente Ordóñez, focusing on efficient algorithms in computer vision with limited or multimodal supervision. She aims to leverage insights from neuroscience and cognitive psychology to develop interpretable algorithms that achieve human-level intelligence across versatile tasks.

Resource links

Paper: Improved Visual Grounding through Self-Consistent Explanations
Deep dive discussion with the authors and Prof Jason Corso and Harpreet Sahota

Q&A

Could this be applied to few-shot or zero-shot learning? In particular, could paraphrasing the object description be used by the model to detect objects not present in the training dataset?

Combining Hugging Face Transformer Models and Image Data with FiftyOne

Datasets and models are the two pillars of modern machine learning, but connecting the two can be cumbersome and time-consuming. In this lightning talk, you will learn how the integration between Hugging Face and FiftyOne simplifies this complexity, enabling more effective data-model co-development. By the end of the talk, you will be able to download and visualize datasets from the Hugging Face Hub with FiftyOne, apply state-of-the-art transformer models directly to your data, and share your datasets with others.
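The workflow described in the abstract can be sketched roughly as follows. This is not code from the talk: the wrapper function `hub_round_trip`, the example repository `Voxel51/VisDrone2019-DET`, the CLIP checkpoint, the class names, and the `clip_predictions` field are all choices made for illustration. It assumes `fiftyone`, `transformers`, `torch`, and `huggingface_hub` are installed, and that you are logged in to Hugging Face if you want to push.

```python
def hub_round_trip(
    repo_id="Voxel51/VisDrone2019-DET",          # example Hub dataset (assumption)
    classes=("car", "pedestrian", "bicycle"),    # illustrative labels (assumption)
    max_samples=25,
    push_to=None,
):
    """Load a dataset from the Hugging Face Hub, label it, optionally re-share it."""
    # Imported lazily so that defining this function does not require the
    # heavy dependencies to be installed.
    import fiftyone.zoo as foz
    from fiftyone.utils.huggingface import load_from_hub, push_to_hub

    # 1. Download a small slice of the dataset from the Hub.
    dataset = load_from_hub(repo_id, max_samples=max_samples)

    # 2. Wrap a Hugging Face transformer model as a zero-shot classifier.
    model = foz.load_zoo_model(
        "zero-shot-classification-transformer-torch",
        name_or_path="openai/clip-vit-base-patch32",
        classes=list(classes),
    )

    # 3. Run the model on every sample and store the predictions.
    dataset.apply_model(model, label_field="clip_predictions")

    # 4. Optionally share the labeled dataset back to the Hub.
    if push_to is not None:
        push_to_hub(dataset, push_to)

    return dataset
```

Calling `hub_round_trip()` needs network access and downloads model weights on first use. The returned dataset can then be explored visually with `fiftyone.launch_app(dataset)`.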

Speaker: Jacob Marks, PhD is a Machine Learning Engineer and Developer Evangelist at Voxel51, where he leads open source efforts in vector search, semantic search, and generative AI for the FiftyOne data-centric AI toolkit. Prior to joining Voxel51, Jacob worked at Google X, Samsung Research, and Wolfram Research.

Resource links

Join the AI, Machine Learning and Data Science Meetup!

The combined membership of the Computer Vision and AI, Machine Learning and Data Science Meetups has grown to over 20,000 members! The goal of the Meetups is to bring together communities of data scientists, machine learning engineers, and open source enthusiasts who want to share and expand their knowledge of AI and complementary technologies.

Join whichever of the 12 Meetup locations is closest to your time zone.

What’s Next?

Up next on July 3rd, 2024 at 2:00 PM BST and 6:30 PM IST, we have three great speakers lined up!

Performance Optimisation for Multimodal LLMs - Neha Sharma, Technical PM at Ori Industries
5 Handy Ways to Use Embeddings, the Swiss Army Knife of AI - Harpreet Sahota, Hacker-in-residence at Voxel51
Deep Dive: Responsible and Unbiased GenAI for Computer Vision - Daniel Gural, ML Engineer at Voxel51

Get Involved!

There are a lot of ways to get involved in the Computer Vision Meetups. Reach out if you identify with any of these:

You’d like to speak at an upcoming Meetup
You have a physical meeting space in one of the Meetup locations and would like to make it available for a Meetup
You’d like to co-organize a Meetup
You’d like to co-sponsor a Meetup

Reach out to me, Meetup co-organizer Jimmy Guerrero, on Meetup.com or ping me on LinkedIn to discuss how to get you plugged in.

These Meetups are sponsored by Voxel51, the company behind the open source FiftyOne computer vision toolset. FiftyOne enables data science teams to improve the performance of their computer vision models by helping them curate high quality datasets, evaluate models, find mistakes, visualize embeddings, and get to production faster. You can get started in just a few minutes.
