DEV Community

Rerun


How to Visualize Structure-from-Motion

Try it in browser · Source Code · Explore Other Examples

COLMAP is a general-purpose Structure-from-Motion (SfM) and Multi-View Stereo (MVS) pipeline. In this example, a short video clip has been processed offline using the COLMAP pipeline. The processed data was then visualized with Rerun, showing the individual camera frames, the estimated camera poses, and the growing point cloud over time. Combining COLMAP with Rerun yields a highly detailed, inspectable reconstruction of the scene depicted in the video.


Logging and visualizing with Rerun

The visualizations in this example were created with the following Rerun code:


Timelines

All data logged using Rerun in the following sections is associated with a specific frame. Calling set_time_sequence sets the current frame id on the "frame" timeline, and every subsequent log call is stamped with that frame id.



rr.set_time_sequence("frame", frame_idx)



Images

The images are logged with the Image archetype to the camera/image entity.



rr.log("camera/image", rr.Image(rgb).compress(jpeg_quality=75))



Cameras

The images stem from pinhole cameras located in the 3D world. To visualize the images in 3D, both the pinhole projection and the camera pose have to be logged (these are often referred to as the intrinsics and extrinsics of the camera, respectively).

The Pinhole archetype is logged to the camera/image entity and defines the intrinsics of the camera, i.e. how to project from the 3D camera frame onto the 2D image plane. The extrinsics are logged as a Transform3D to the camera entity.



rr.log("camera", rr.Transform3D(translation=image.tvec, rotation=rr.Quaternion(xyzw=quat_xyzw), from_parent=True))

rr.log(
    "camera/image",
    rr.Pinhole(
        resolution=[camera.width, camera.height],
        focal_length=camera.params[:2],
        principal_point=camera.params[2:],
    ),
)


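The quat_xyzw value in the snippet above has to account for component ordering: COLMAP stores rotations as (qw, qx, qy, qz), while Rerun's Quaternion expects (x, y, z, w). A minimal sketch of the reordering, with a hypothetical helper name:

```python
import numpy as np

def colmap_qvec_to_xyzw(qvec: np.ndarray) -> np.ndarray:
    """Reorder a COLMAP quaternion (qw, qx, qy, qz) into the
    (x, y, z, w) layout that Rerun's Quaternion expects."""
    return np.array([qvec[1], qvec[2], qvec[3], qvec[0]])
```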

2D points

The 2D image points that are used to triangulate the 3D points are visualized by logging them as Points2D to the camera/image/keypoints entity. Note that these keypoints are a child of the camera/image entity, since the points should be shown in the image plane.




rr.log("camera/image/keypoints", rr.Points2D(visible_xys, colors=[34, 138, 167]))


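The visible_xys array above could, for example, be obtained by keeping only the keypoints that COLMAP actually triangulated. A minimal sketch, assuming per-image data in the format of COLMAP's read_write_model.py, where image.xys is an (N, 2) array of keypoint coordinates and image.point3D_ids marks untriangulated keypoints with -1:

```python
import numpy as np

def filter_visible_keypoints(xys: np.ndarray, point3D_ids: np.ndarray) -> np.ndarray:
    """Keep only the 2D keypoints that have an associated 3D point
    (a point3D_id of -1 means the keypoint was never triangulated)."""
    return xys[point3D_ids != -1]
```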

3D points

The colored 3D points were added to the visualization by logging the Points3D archetype to the points entity.




rr.log("points", rr.Points3D(points, colors=point_colors), rr.AnyValues(error=point_errors))


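The points, point_colors, and point_errors arrays above can be assembled from COLMAP's reconstruction output. A minimal sketch, assuming a points3D dict mapping point ids to objects with .xyz, .rgb, and .error fields, as produced by COLMAP's read_write_model.py:

```python
import numpy as np

def collect_point_cloud(points3D: dict):
    """Stack a COLMAP points3D dict into flat arrays of positions,
    colors, and per-point reprojection errors."""
    ids = sorted(points3D)
    points = np.array([points3D[i].xyz for i in ids])
    colors = np.array([points3D[i].rgb for i in ids], dtype=np.uint8)
    errors = np.array([points3D[i].error for i in ids])
    return points, colors, errors
```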

Reprojection error

For each image, a Scalar archetype containing the average reprojection error of the keypoints is logged to the plot/avg_reproj_err entity.




rr.log("plot/avg_reproj_err", rr.Scalar(np.mean(point_errors)))



Join us on GitHub

rerun-io / rerun

Visualize streams of multimodal data. Fast, easy to use, and simple to integrate. Built in Rust using egui.

Build time aware visualizations of multimodal data

Use the Rerun SDK (available for C++, Python and Rust) to log data like images, tensors, point clouds, and text. Logs are streamed to the Rerun Viewer for live visualization or to file for later use.

A short taste

import rerun as rr  # pip install rerun-sdk
rr.init("rerun_example_app")

rr.connect()  # Connect to a remote viewer
# rr.spawn()  # Spawn a child process with a viewer and connect
# rr.save("recording.rrd")  # Stream all logs to disk

# Associate subsequent data with 42 on the “frame” timeline
rr.set_time_sequence("frame", 42)

# Log colored 3D points to the entity at `path/to/points`
rr.log("path/to/points", rr.Points3D(positions, colors=colors))
