This post is a guide for visualizing a 3D indoor scene captured using Apple's ARKit technology with the open-source visualization tool Rerun.
ARKitScenes Dataset
The ARKitScenes dataset, captured using Apple's ARKit technology, encompasses a diverse array of indoor scenes.
Every 3D indoor scene contains:
- Colour and Depth Images
- Reconstructed 3D Meshes
- Labelled Bounding Boxes Around Objects

If you want to learn more about the scene structure, the data organization of the scenes is explained here.
Logging and Visualising with Rerun
Entities and Components
Rerun uses an Entity Component System architecture pattern in which entities represent generic objects while components describe data associated with those entities.
In our example, we have these entities:
- world entity: includes the 3D mesh (world/mesh), the pinhole camera (world/camera_lowres), and the annotations (world/annotations)
- video entity: includes the RGB images (video/rgb) and the depth images (video/depth)
You can learn more on the Entities and Components page.
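To make the hierarchy concrete, the entity paths used in this example can be written out as a small tree. The paths are taken from the example's logging calls; the exact children under world/annotations depend on the scene's annotations:

```python
# Entity paths form a tree: a transform logged on a parent entity
# (e.g. world/camera_lowres) applies to everything logged beneath it.
entity_paths = [
    "world/mesh",           # Mesh3D
    "world/camera_lowres",  # Pinhole + Transform3D
    "world/annotations",    # Boxes3D, one child entity per box
    "video/rgb",            # Image
    "video/depth",          # DepthImage
]
```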
Log a moving RGB-D camera
To log a moving RGB-D camera, we log four pieces of data: the camera's intrinsics via a pinhole camera model, its pose (extrinsics), and the color and depth images. The RGB and depth images are logged as child entities, capturing the visual and depth aspects of the scene, respectively.
# Log Pinhole Camera and its transforms
rr.log("world/camera_lowres", rr.Transform3D(transform=camera_from_world))
rr.log("world/camera_lowres", rr.Pinhole(image_from_camera=intrinsic, resolution=[w, h]))
# Log RGB Image
rr.log("video/rgb", rr.Image(rgb).compress(jpeg_quality=95))
# Log Depth Image
rr.log("video/depth", rr.DepthImage(depth, meter=1000))
Here's a breakdown of the steps:
- The pinhole camera is logged with the Pinhole and Transform3D archetypes, which give the 3D view its camera perspective.
- The RGB images are logged with the Image archetype.
- The depth images are logged with the DepthImage archetype.
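To make the two matrices in the pinhole step concrete, here is a minimal sketch of how they are typically assembled. The focal lengths, principal point, and pose values below are illustrative placeholders, not values from the dataset:

```python
import numpy as np

# Illustrative intrinsics for a hypothetical 256x192 low-res frame.
w, h = 256, 192
fx, fy = 211.9, 211.9  # focal lengths in pixels
cx, cy = 127.5, 95.5   # principal point in pixels

# 3x3 intrinsic matrix, as passed to Pinhole(image_from_camera=...).
intrinsic = np.array([
    [fx, 0.0, cx],
    [0.0, fy, cy],
    [0.0, 0.0, 1.0],
])

# The extrinsics are a 4x4 camera-from-world matrix: a 3x3 rotation R
# and a translation t packed into homogeneous form, as passed to
# Transform3D(transform=...).
R = np.eye(3)                 # placeholder rotation
t = np.array([0.0, 0.0, 1.5]) # placeholder translation
camera_from_world = np.eye(4)
camera_from_world[:3, :3] = R
camera_from_world[:3, 3] = t
```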
Log 3D Mesh
The mesh is composed of mesh vertices, indices (i.e., which vertices belong to the same face), and vertex colors.
# ... load mesh data from dataset ...
rr.log(
    "world/mesh",
    rr.Mesh3D(
        vertex_positions=mesh.vertices,
        vertex_colors=mesh.visual.vertex_colors,
        indices=mesh.faces,
    ),
    timeless=True,
)
Here, the mesh is logged to the world/mesh entity using the Mesh3D archetype and is marked as timeless, since it does not change over the course of this visualization.
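As a sketch of the array shapes Mesh3D expects, here is a tiny stand-in for the trimesh object loaded from the dataset (a single triangle rather than a real scan):

```python
import numpy as np

# (N, 3) vertex positions, (M, 3) face indices, (N, 4) RGBA colors:
# the same shapes that mesh.vertices, mesh.faces and
# mesh.visual.vertex_colors provide in the example.
vertices = np.array([
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
])
faces = np.array([[0, 1, 2]])  # each row indexes three vertices
vertex_colors = np.array([
    [255, 0, 0, 255],
    [0, 255, 0, 255],
    [0, 0, 255, 255],
], dtype=np.uint8)
```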
3D Bounding Boxes
Here we loop through the data and add bounding boxes to all the items found.
# ... load annotation data from dataset ...
for i, label_info in enumerate(annotation["data"]):
    # uid, label, half_size, centroid and rot are extracted from label_info
    rr.log(
        f"world/annotations/box-{uid}-{label}",
        rr.Boxes3D(
            half_sizes=half_size,
            centers=centroid,
            rotations=rr.Quaternion(xyzw=rot.as_quat()),
            labels=label,
            colors=colors[i],
        ),
        timeless=True,
    )
The bounding boxes are logged with the Boxes3D archetype.
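Note that Boxes3D takes half-extents, not full edge lengths. As a sketch, if a box were given as axis-aligned min/max corners (hypothetical values; the dataset's annotations store centroid, axis lengths and rotation directly), the conversion would be:

```python
import numpy as np

# Hypothetical axis-aligned extent of one object, as min/max corners.
box_min = np.array([0.2, 0.0, 1.0])
box_max = np.array([1.0, 0.8, 2.2])

# Center and half-extents in the form Boxes3D expects.
centroid = (box_min + box_max) / 2.0
half_size = (box_max - box_min) / 2.0
```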
Join us on GitHub

rerun-io/rerun
Visualize streams of multimodal data. Fast, easy to use, and simple to integrate. Built in Rust using egui.
Build time aware visualizations of multimodal data
Use the Rerun SDK (available for C++, Python and Rust) to log data like images, tensors, point clouds, and text. Logs are streamed to the Rerun Viewer for live visualization or to file for later use.
A short taste
import rerun as rr  # pip install rerun-sdk

rr.init("rerun_example_app")
rr.connect()  # Connect to a remote viewer
# rr.spawn()  # Spawn a child process with a viewer and connect
# rr.save("recording.rrd")  # Stream all logs to disk

# Associate subsequent data with 42 on the "frame" timeline
rr.set_time_sequence("frame", 42)

# Log colored 3D points to the entity at `path/to/points`
rr.log("path/to/points", rr.Points3D(positions, colors=colors))
…