David Haley

Optimizing QuPath intensity measurements: 12.5 hr to 2min

Spatial biology analyzes tissue sample images to derive patterns and data. A key first step is identifying cells on the image and gathering quantitative measurements about those cells.

In our ongoing work scaling DeepCell on GCP Batch, we'd previously gotten pretty efficient at the first part: segmenting the image into cells. But we hit a major performance roadblock for the next step: generating quantitative measurements.

The measurements are fairly straightforward:

  • size of each cell (convert pixels in each detected cell to physical dimensions, assuming some number of microns per pixel)
  • pixel intensity of each cell
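To make the two measurements concrete, here is a minimal sketch that computes them from a label mask. This is not how QuPath implements it: `CellMeasurement`, `measureCells`, and the array-based inputs are hypothetical names chosen for illustration, and the sketch assumes a single channel.

```kotlin
// Hypothetical per-cell summary; illustrative only, not QuPath's API.
data class CellMeasurement(
    val pixelCount: Int,
    val areaMicrons2: Double,
    val meanIntensity: Double
)

/**
 * labels[y][x] is the id of the cell covering that pixel (0 = background).
 * intensity[y][x] is the pixel value for one channel.
 */
fun measureCells(
    labels: Array<IntArray>,
    intensity: Array<DoubleArray>,
    micronsPerPixel: Double
): Map<Int, CellMeasurement> {
    val counts = HashMap<Int, Int>()
    val sums = HashMap<Int, Double>()
    // One pass over the image: count pixels and sum intensities per cell.
    for (y in labels.indices) {
        for (x in labels[y].indices) {
            val id = labels[y][x]
            if (id == 0) continue
            counts[id] = (counts[id] ?: 0) + 1
            sums[id] = (sums[id] ?: 0.0) + intensity[y][x]
        }
    }
    // Each pixel covers micronsPerPixel^2 square microns.
    val pixelArea = micronsPerPixel * micronsPerPixel
    return counts.mapValues { (id, n) ->
        CellMeasurement(n, n * pixelArea, sums.getValue(id) / n)
    }
}
```

The point of the sketch is that both measurements need only a single pass over the pixels, which is why a multi-hour runtime is so surprising.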

For a ~140M pixel image, measuring the detected cells took about 12.5 hours (‼️). That's … not great 😩 What the heck?? We're just counting pixels and reading pixel values. An HD image is ~2M pixels, and computers (and TVs & phones) render >30 of those per second, so 140M pixels is only about 70 HD frames: a bit over two seconds of video.

Profiling to the rescue. The great thing about JVM code is that it's extremely easy to profile. In IntelliJ, just click "profile" instead of "run".

Screenshot of profiler button

Here's the resulting flamegraph.

IntelliJ flamegraph adding cell measurements

Of note, reading the image accounts for 99.9% of the time spent adding intensity measurements, which is 84% of the total run time.

Screenshot of time spent in readRegion: 84.25% of all, 99.88% of parent, amounting to 79.5 seconds

OK, so we need to stop reading the image repeatedly. In our case, the entire image can (for now) fit into RAM. If only we could prefetch the image once, then read regions out of that in-memory copy.
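Whether an image fits into RAM is easy to estimate up front. The helpers below are a rough sketch, not part of the project's code: `uncompressedBytes` and `fitsInHeap` are hypothetical names, and the channel count and sample size in the usage note are made-up values, since the post doesn't state them.

```kotlin
/** Bytes needed to hold an uncompressed image in memory. */
fun uncompressedBytes(pixels: Long, channels: Int, bytesPerSample: Int): Long =
    pixels * channels * bytesPerSample

/**
 * True if `bytes` fits within the given fraction of the JVM's maximum heap.
 * The default fraction of 0.5 is an arbitrary safety margin.
 */
fun fitsInHeap(bytes: Long, fraction: Double = 0.5): Boolean =
    bytes <= Runtime.getRuntime().maxMemory() * fraction
```

For example, `uncompressedBytes(140_000_000L, 4, 2)` gives 1,120,000,000 bytes, so a 140M pixel image with 4 channels of 16-bit samples would need roughly 1.1 GB uncompressed.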

Sounds like a great use case for the Proxy pattern. We need an ImageServer that behaves just like the original image server, except that it reads from an in-memory image instead of from disk (or wherever the wrapped server reads).

The resulting code is quite simple. Here's the pull request. We override the abstract ImageServer, wrapping another ImageServer and forwarding all methods to the original.

UPDATE 2024-09-10: Thanks to Adrián Szegedi (GitHub HawkSK) the code is even simpler (PR#42): no need to explicitly forward methods. Instead we use Kotlin's delegation syntax which implicitly forwards non-overridden methods. This removes 100 lines of boilerplate 💪🏻

The one non-forwarded method is the core operation: reading a region.

That one turns into extracting the region from the entire (prefetched) image:

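  // Reads the whole image from the wrapped server once; later calls return immediately.
  private fun readFullImage() {
    if (prefetchedImage != null)
      return

    logger.info("Prefetching full image at path: ${wrappedImageServer.path}")

    // Full resolution (downsample 1.0), covering the entire image.
    val wholeImageRequest = RegionRequest.createInstance(
      wrappedImageServer.path,
      1.0,
      0,
      0,
      wrappedImageServer.width,
      wrappedImageServer.height
    )
    prefetchedImage = wrappedImageServer.readRegion(wholeImageRequest)
  }

  override fun readRegion(request: RegionRequest?): BufferedImage {
    // A null request is rejected here too, since null != 0.
    if (request?.z != 0 || request.t != 0)
      throw IllegalArgumentException("PrefetchedImageServer only supports z=0 and t=0")

    readFullImage()
    // getSubimage returns a view that shares the prefetched image's pixel data (no copy).
    // Note: request.downsample is not used here; regions come back at full resolution.
    return prefetchedImage!!.getSubimage(request!!.x, request.y, request.width, request.height)
  }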

This way, we only read the image once, and fetch all subregions from the in-memory image.
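The same idea can be shown end to end with a toy interface. The sketch below is self-contained and does not use QuPath: `RegionSource`, `CountingSource`, and `PrefetchedSource` are hypothetical stand-ins, with `RegionSource` playing the role of the much larger ImageServer interface. It combines Kotlin's `by` delegation with a single overridden read method, and uses `by lazy` for the one-time prefetch as an alternative to a manual null check.

```kotlin
import java.awt.image.BufferedImage

// Hypothetical stand-in for a much larger interface such as an image server.
interface RegionSource {
    val width: Int
    val height: Int
    fun readRegion(x: Int, y: Int, w: Int, h: Int): BufferedImage
}

// Fake "slow" source that counts how often it is actually read.
class CountingSource(override val width: Int, override val height: Int) : RegionSource {
    var reads = 0
        private set

    override fun readRegion(x: Int, y: Int, w: Int, h: Int): BufferedImage {
        reads++
        return BufferedImage(w, h, BufferedImage.TYPE_BYTE_GRAY)
    }
}

// `by wrapped` forwards every member we don't override (here: width and height).
class PrefetchedSource(private val wrapped: RegionSource) : RegionSource by wrapped {
    // Read the whole image from the wrapped source once, on first use.
    private val fullImage: BufferedImage by lazy {
        wrapped.readRegion(0, 0, wrapped.width, wrapped.height)
    }

    // Serve every region as a view into the in-memory image.
    override fun readRegion(x: Int, y: Int, w: Int, h: Int): BufferedImage =
        fullImage.getSubimage(x, y, w, h)
}
```

Wrapping a `CountingSource(100, 80)` in a `PrefetchedSource` and reading two different regions leaves the wrapped source's `reads` counter at 1. One design note: `by lazy` is synchronized by default, so the prefetch runs only once even if several threads request regions at the same time.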

Here's the speed-up in the real world (Google Batch):

Google Batch jobs showing new runtime 2min 14s, and old runtime 12hr 25min

| Before (min) | After (min) | Delta             |
|--------------|-------------|-------------------|
| 745          | 2           | -743 min (-99.7%) |

In the words of the great Tina Turner: Boom, Shaka Laka.
