Jimmy Guerrero for Voxel51

Posted on Nov 21, 2024 • Originally published at voxel51.com

Recapping ECCV 2024 Redux: Day 3

#computervision #machinelearning #ai #datascience

We just wrapped up Day 3 of ECCV 2024 Redux. If you missed it or want to revisit it, here’s a recap!

In this blog post you’ll find the playback recordings, highlights from the presentations and Q&A, as well as the upcoming Meetup schedule so that you can join us at a future event.

Closing the Gap Between Satellite and Street-View Imagery Using Generative Models

With the growing availability of satellite imagery (e.g., Google Earth), nearly every part of the world can be mapped, though street-view images remain limited. Creating street views from satellite data is crucial for applications like virtual model generation, media content enhancement, 3D gaming, and simulations. This task, known as satellite-to-ground cross-view synthesis, is tackled by our geometry-aware framework, which maintains geometric precision and relative geographical positioning using satellite information.

ECCV 2024 Paper: Geospecific View Generation — Geometry-Context Aware High-resolution Ground View Inference from Satellite Views

Speaker: Ningli Xu is a Ph.D. student at The Ohio State University, specializing in generative AI and computer vision, with a focus on addressing image and video generation challenges in the geospatial domain.

Q&A

How does your method handle overhanging structures, where areas are shaded or hidden from the overhead satellite view?
Can your system integrate satellite views taken from different viewing angles?
How does Google Earth create its 3D models?
Besides DFC2019, what other data sources are used for the ground-level imagery in Google Street View?
Does your method only work on specific fixed viewpoints?

High-Efficiency 3D Scene Compression Using Self-Organizing Gaussians

In just over a year, 3D Gaussian Splatting (3DGS) has made waves in computer vision for its remarkable speed, simplicity, and visual quality. Yet, even scenes of a single room can exceed a gigabyte in size, making it difficult to scale up to larger environments, like city blocks. In this talk, we’ll explore compression techniques to reduce the 3DGS memory footprint. We’ll dive deeply into our novel approach, Self-Organizing Gaussians, which proposes to map splatting attributes into a 2D grid, using a high-performance parallel linear assignment sorting developed to reorganize the splats on the fly. This grid assignment allows us to leverage traditional 2D image compression techniques like JPEG to efficiently store 3D data. Our method is quick and easy to decompress and provides a surprisingly competitive compression ratio. The drastically reduced memory requirements make this method perfect for efficiently streaming 3D scenes at large scales, which is especially useful for AR, VR and gaming applications.
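To make the grid idea above concrete, here is a minimal toy sketch of why organizing attributes into a smooth 2D grid helps compression. It is not the authors' method: a plain value sort stands in for the parallel linear assignment sorting described in the talk, `zlib` stands in for an image codec such as JPEG, and the single random `attribute` array is a hypothetical stand-in for real splat attributes.

```python
import zlib

import numpy as np


def neighbor_roughness(grid: np.ndarray) -> float:
    """Mean absolute difference between horizontally and vertically adjacent cells."""
    dx = np.abs(np.diff(grid, axis=1)).mean()
    dy = np.abs(np.diff(grid, axis=0)).mean()
    return float(dx + dy)


def compressed_size(grid: np.ndarray) -> int:
    """Quantize to 8 bits and compress losslessly (zlib is a stand-in for an image codec)."""
    quantized = np.round(grid * 255).astype(np.uint8)
    return len(zlib.compress(quantized.tobytes()))


rng = np.random.default_rng(0)
side = 64

# Hypothetical per-splat attribute in [0, 1), one value per splat.
attribute = rng.random(side * side)

# Unorganized layout: splats land in the grid in arbitrary order.
unsorted_grid = attribute.reshape(side, side)

# Toy stand-in for the sorting step: order by value and fill the grid row by row,
# so that neighboring cells hold similar values.
sorted_grid = np.sort(attribute).reshape(side, side)

rough_unsorted = neighbor_roughness(unsorted_grid)
rough_sorted = neighbor_roughness(sorted_grid)
size_unsorted = compressed_size(unsorted_grid)
size_sorted = compressed_size(sorted_grid)

print(f"roughness: unsorted={rough_unsorted:.4f}, sorted={rough_sorted:.4f}")
print(f"compressed bytes: unsorted={size_unsorted}, sorted={size_sorted}")
```

The organized grid has much smaller differences between neighboring cells, which is the property 2D codecs exploit. Real splats carry many attributes at once (position, color, scale, and so on), so a one-dimensional sort like this would not be enough in practice.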

ECCV 2024 Paper: Compact 3D Scene Representation via Self-Organizing Gaussian Grids

Speaker: Wieland Morgenstern is a Research Associate at the Computer Vision & Graphics group at Fraunhofer HHI and is pursuing a PhD at Humboldt University Berlin. His research focuses on representing 3D scenes and virtual humans.

Q&A

What is the source of the floaters?
Why does your technique eliminate some of the floaters?
How do your results compare with Niantic’s .spz compression?
In the early slides with the animated Gaussian reduction demos (with the bikes), is the view rendered from one of the source image perspectives or from a novel viewpoint? Also, does the reduction of Gaussians without quality loss hold true for many viewpoints?

Skeleton Recall Loss for Connectivity Conserving and Resource Efficient Segmentation of Thin Tubular Structures

We present Skeleton Recall Loss, a novel loss function for topologically accurate and efficient segmentation of thin, tubular structures, such as roads, nerves, or vessels. By circumventing expensive GPU-based operations, we reduce computational overheads by up to 90% compared to the current state-of-the-art, while achieving overall superior performance in segmentation accuracy and connectivity preservation. Additionally, it is the first multi-class capable loss function for thin structure segmentation.
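As a rough illustration of the idea, the sketch below assumes the loss is one minus a soft recall of the predicted probabilities on a skeleton of the ground truth that was precomputed outside the training loop. That reading is an assumption drawn from the name of the loss, and the paper's exact formulation may differ. The function `skeleton_recall_loss` and the hand-written skeleton are hypothetical and for illustration only.

```python
import numpy as np


def skeleton_recall_loss(pred_prob: np.ndarray, skeleton: np.ndarray, eps: float = 1e-8) -> float:
    """1 - soft recall of the prediction on a precomputed ground-truth skeleton.

    Assumed formulation for illustration; not taken from the paper.
    """
    recall = (pred_prob * skeleton).sum() / (skeleton.sum() + eps)
    return float(1.0 - recall)


# Toy 5x5 example: a horizontal "vessel" three pixels thick.
# Its skeleton is written by hand here as the center row.
skeleton = np.zeros((5, 5))
skeleton[2, :] = 1.0

# Prediction that covers the whole vessel.
connected = np.zeros((5, 5))
connected[1:4, :] = 1.0

# Same prediction with a one-pixel-wide gap that breaks connectivity.
broken = connected.copy()
broken[:, 2] = 0.0

loss_connected = skeleton_recall_loss(connected, skeleton)
loss_broken = skeleton_recall_loss(broken, skeleton)

print(f"connected: {loss_connected:.3f}, broken: {loss_broken:.3f}")
```

In this toy case the gap removes only 3 of 15 vessel pixels but 1 of 5 skeleton pixels, so a break in the structure is penalized in proportion to the centerline it interrupts rather than the area it removes.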

ECCV 2024 Paper: Skeleton Recall Loss for Connectivity Conserving and Resource Efficient Segmentation of Thin Tubular Structures

GitHub Repo: Link

Speakers: Maximilian Rokuss holds an M.Sc. in Physics from Heidelberg University and is now a PhD student in Medical Image Computing at the German Cancer Research Center (DKFZ) and Heidelberg University. Yannick Kirchhoff holds an M.Sc. in Physics from Heidelberg University and is now a PhD student in Medical Image Computing at the German Cancer Research Center (DKFZ) and the Helmholtz Information and Data Science School for Health.

Q&A

Can you explain the use of the term “differentiable” in this case?
How repeatable are your segmentations from different viewpoints, and in what situations does that repeatability break down?

What’s Next at ECCV 2024 Redux?

Missed Day 1 and Day 3 at ECCV 2024 Redux? Register for the Zoom here for Day 4 events.

𝗗𝗮𝘆 𝟰: 𝗡𝗼𝘃𝗲𝗺𝗯𝗲𝗿 𝟮𝟮

Zero-shot Video Anomaly Detection: Leveraging Large Language Models for Rule-Based Reasoning by Yuchen Yang from Johns Hopkins Whiting School of Engineering
Open-Vocabulary 3D Semantic Segmentation with Text-to-Image Diffusion Models by Xiaoyu Zhu from Carnegie Mellon University

Dive into the groundbreaking research, all from the comfort of your own space.

You can find a complete schedule of upcoming Meetups on the Voxel51 Events page.

Get Involved!

There are a lot of ways to get involved in the Computer Vision Meetups. Reach out if you identify with any of these:

You’d like to speak at an upcoming Meetup
You have a physical meeting space in one of the Meetup locations and would like to make it available for a Meetup
You’d like to co-organize a Meetup
You’d like to co-sponsor a Meetup

Reach out to Meetup co-organizer Jimmy Guerrero on Meetup.com or ping me over LinkedIn to discuss how to get you plugged in.

—

These Meetups are sponsored by Voxel51, the company behind the open source FiftyOne computer vision toolset. FiftyOne enables data science teams to improve the performance of their computer vision models by helping them curate high quality datasets, evaluate models, find mistakes, visualize embeddings, and get to production faster. It’s easy to get started in just a few minutes.
