Create account

DEV Community

Rerun

Posted on May 7

Use Hugging Face's ControlNet

#ai #machinelearning #python #opensource

Source Code Explore Other Examples

Hugging Face's ControlNet allows to condition Stable Diffusion on various modalities. In this example we condition on edges detected by the Canny edge detector to keep our shape intact while generating an image with Stable Diffusion. This generates a new image based on the text prompt, but that still retains the structure of the control image. We visualize the whole generation process with Rerun.

Logging and visualizing with Rerun

The visualizations in this example were created with the following Rerun code.

Images

rr.log("input/raw", rr.Image(image), timeless=True)
rr.log("input/canny", rr.Image(canny_image), timeless=True)

The input image and control canny_image are marked as timeless and logged in rerun.

Timeless entities belong to all timelines (existing ones, and ones not yet created) and are shown leftmost in the time panel in the viewer. This is useful for entities that aren't part of normal data capture, but set the scene for how they are shown.

This designation ensures their constant availability across all timelines in Rerun, aiding in consistent comparison and documentation.

Prompts

rr.log("positive_prompt", rr.TextDocument(prompt), timeless=True)
rr.log("negative_prompt", rr.TextDocument(negative_prompt), timeless=True)

The positive and negative prompt used for generation is logged to Rerun.

Custom diffusion step callback

We use a custom callback function for ControlNet that logs the output and the latent values at each timestep, which makes it possible for us to view all timesteps of the generation in Rerun.

def controlnet_callback(
    iteration: int, timestep: float, latents: torch.Tensor, pipeline: StableDiffusionXLControlNetPipeline
) -> None:
    rr.set_time_sequence("iteration", iteration)
    rr.set_time_seconds("timestep", timestep)
    rr.log("output", rr.Image(image))
    rr.log("latent", rr.Tensor(latents.squeeze(), dim_names=["channel", "height", "width"]))

Output image

Finally, we log the output image generated by ControlNet.

rr.log("output", rr.Image(images))

Join us on Github

rerun-io / rerun

Visualize streams of multimodal data. Fast, easy to use, and simple to integrate. Built in Rust using egui.

Build time aware visualizations of multimodal data

Use the Rerun SDK (available for C++, Python and Rust) to log data like images, tensors, point clouds, and text. Logs are streamed to the Rerun Viewer for live visualization or to file for later use.

A short taste

import rerun as rr  # pip install rerun-sdk
rr.init("rerun_example_app")

rr.connect()  # Connect to a remote viewer
# rr.spawn()  # Spawn a child process with a viewer and connect
# rr.save("recording.rrd")  # Stream all logs to disk

# Associate subsequent data with 42 on the “frame” timeline
rr.set_time_sequence("frame", 42)

# Log colored 3D points to the entity at `path/to/points`
rr.log("path/to/points", rr.Points3D(positions, colors=colors

…

View on GitHub

DEV Community

Use Hugging Face's ControlNet

Logging and visualizing with Rerun

Images

Prompts

Custom diffusion step callback

Output image

Join us on Github

rerun-io / rerun

Visualize streams of multimodal data. Fast, easy to use, and simple to integrate. Built in Rust using egui.

Build time aware visualizations of multimodal data

A short taste

Top comments (0)

Read next

Understanding OpenAI’s ChatGPT History Policy: What It Means for Users

A Beginner’s Guide to Component Testing with Playwright

Introducing Visual Copilot 2.0: Design to Interactive

AI Web Developer - GitHub Spark