Hugging Face's ControlNet lets you condition Stable Diffusion on various modalities. In this example we condition on edges detected by the Canny edge detector, which keeps the shapes of the input intact while Stable Diffusion generates a new image. The result follows the text prompt but retains the structure of the control image. We visualize the whole generation process with Rerun.
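To make the idea of an edge-based control image concrete, here is a minimal pure-Python sketch that turns a tiny grayscale image into a binary edge map. It is not the Canny detector, which additionally applies Gaussian smoothing, non-maximum suppression, and hysteresis thresholding; this sketch only thresholds a central-difference gradient magnitude. The function name `edge_map` and the threshold value are illustrative assumptions, not part of the example's code.

```python
from typing import List

Image = List[List[float]]


def edge_map(gray: Image, threshold: float = 100.0) -> List[List[int]]:
    """Return a binary edge map (255 = edge, 0 = no edge).

    Simplified stand-in for Canny: central-difference gradients plus a
    single threshold. Border pixels are left at 0.
    """
    height, width = len(gray), len(gray[0])
    edges = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gray[y][x + 1] - gray[y][x - 1]  # horizontal intensity change
            gy = gray[y + 1][x] - gray[y - 1][x]  # vertical intensity change
            if (gx * gx + gy * gy) ** 0.5 >= threshold:
                edges[y][x] = 255
    return edges


# A 5x6 image: dark left half, bright right half, so one vertical edge.
toy_image = [[0, 0, 0, 255, 255, 255] for _ in range(5)]
toy_edges = edge_map(toy_image)
```

An edge map like this keeps only the outlines of the input, which is why it works as a structural constraint: the text prompt is free to change colors and textures, but not where the shapes are.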
Logging and visualizing with Rerun
The visualizations in this example were created with the following Rerun code.
Images
rr.log("input/raw", rr.Image(image), timeless=True)
rr.log("input/canny", rr.Image(canny_image), timeless=True)
The input image and the control canny_image are logged to Rerun and marked as timeless.
Timeless entities belong to all timelines (existing ones, and ones not yet created) and are shown leftmost in the time panel in the viewer. This is useful for entities that aren't part of the normal data capture but set the scene for how it is shown.
Marking the inputs as timeless keeps them visible at every point on every timeline, so they can be compared against the output of any generation step.
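The behavior described above can be pictured with a small toy model. This is a conceptual sketch in plain Python, not Rerun's implementation or API: `ToyStore`, `log`, and `query` are hypothetical names, and the model assumes that a timeless entry is simply returned by any query that finds no time-stamped data, whatever the timeline or time.

```python
from typing import Any, Dict, List, Optional, Tuple


class ToyStore:
    """Toy store: temporal entries live on one timeline, timeless ones on all."""

    def __init__(self) -> None:
        self._timeless: Dict[str, Any] = {}
        # entity path -> list of (timeline, time, value)
        self._temporal: Dict[str, List[Tuple[str, int, Any]]] = {}

    def log(self, path: str, value: Any, timeline: Optional[str] = None, time: int = 0) -> None:
        if timeline is None:
            self._timeless[path] = value  # no timeline given: visible everywhere
        else:
            self._temporal.setdefault(path, []).append((timeline, time, value))

    def query(self, path: str, timeline: str, time: int) -> Optional[Any]:
        """Latest value for `path` at or before `time` on `timeline`."""
        candidates = [
            (t, v) for (tl, t, v) in self._temporal.get(path, []) if tl == timeline and t <= time
        ]
        if candidates:
            return max(candidates, key=lambda tv: tv[0])[1]
        return self._timeless.get(path)  # fall back to timeless data, if any


store = ToyStore()
store.log("input/raw", "raw image")  # timeless
store.log("output", "step 0", timeline="iteration", time=0)
store.log("output", "step 5", timeline="iteration", time=5)

raw_mid_run = store.query("input/raw", "iteration", 3)      # found: timeless
raw_other_tl = store.query("input/raw", "timestep", 999)    # found on an unused timeline too
output_at_3 = store.query("output", "iteration", 3)         # latest at or before 3
output_other_tl = store.query("output", "timestep", 3)      # nothing logged on this timeline
```

In this model the input is found no matter which timeline or time is queried, while the per-step output only exists on the timeline it was logged to.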
Prompts
rr.log("positive_prompt", rr.TextDocument(prompt), timeless=True)
rr.log("negative_prompt", rr.TextDocument(negative_prompt), timeless=True)
The positive and negative prompts used for generation are logged to Rerun as timeless text documents.
Custom diffusion step callback
We use a custom callback function for ControlNet that logs the output and the latent values at each timestep, which makes it possible for us to view all timesteps of the generation in Rerun.
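def controlnet_callback(
    iteration: int, timestep: float, latents: torch.Tensor, pipeline: StableDiffusionXLControlNetPipeline
) -> None:
    # Place everything logged below at this step on both timelines.
    rr.set_time_sequence("iteration", iteration)
    rr.set_time_seconds("timestep", timestep)
    # Assumed step, not shown in the original snippet: decode the current latents
    # into an image with the pipeline's VAE so that `image` is defined.
    image = pipeline.vae.decode(latents / pipeline.vae.config.scaling_factor, return_dict=False)[0]
    image = pipeline.image_processor.postprocess(image, output_type="np").squeeze()
    rr.log("output", rr.Image(image))
    rr.log("latent", rr.Tensor(latents.squeeze(), dim_names=["channel", "height", "width"]))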
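The callback pattern itself can be sketched without diffusers or Rerun. In the toy loop below, `toy_denoise` stands in for the denoising loop and `record_step` for the logging callback; both names are hypothetical, the halving update is an arbitrary placeholder rather than a real scheduler step, and the sketch assumes the surrounding loop only supplies the first three arguments, so a lambda binds the fourth.

```python
from typing import Callable, Dict, List, Tuple

StepCallback = Callable[[int, float, List[float]], None]


def toy_denoise(latents: List[float], timesteps: List[float], callback: StepCallback) -> List[float]:
    """Stand-in for a denoising loop: shrinks the latents once per timestep."""
    for iteration, timestep in enumerate(timesteps):
        latents = [0.5 * value for value in latents]  # placeholder update
        callback(iteration, timestep, latents)  # hand the intermediate state to the caller
    return latents


def record_step(
    iteration: int,
    timestep: float,
    latents: List[float],
    history: Dict[int, Tuple[float, List[float]]],
) -> None:
    """Four-argument callback, analogous to one that also receives the pipeline."""
    history[iteration] = (timestep, list(latents))  # copy so later steps cannot alter it


history: Dict[int, Tuple[float, List[float]]] = {}
final = toy_denoise(
    [8.0, -4.0],
    timesteps=[999.0, 500.0, 1.0],
    # The loop passes three arguments; the lambda supplies the extra one.
    callback=lambda i, t, latents: record_step(i, t, latents, history),
)
```

Because every step is recorded under its iteration index, the intermediate states can be inspected afterwards, which is the same reason the real callback sets the "iteration" and "timestep" timelines before logging.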
Output image
Finally, we log the output image generated by ControlNet.
rr.log("output", rr.Image(images))
Join us on GitHub
rerun-io / rerun
Visualize streams of multimodal data. Fast, easy to use, and simple to integrate. Built in Rust using egui.
Build time aware visualizations of multimodal data
Use the Rerun SDK (available for C++, Python and Rust) to log data like images, tensors, point clouds, and text. Logs are streamed to the Rerun Viewer for live visualization or to file for later use.
A short taste
import rerun as rr # pip install rerun-sdk
rr.init("rerun_example_app")
rr.connect() # Connect to a remote viewer
# rr.spawn() # Spawn a child process with a viewer and connect
# rr.save("recording.rrd") # Stream all logs to disk
# Associate subsequent data with 42 on the “frame” timeline
rr.set_time_sequence("frame", 42)
# Log colored 3D points to the entity at `path/to/points`
rr.log("path/to/points", rr.Points3D(positions, colors=colors
…