In our ongoing work running DeepCell on Google Batch, we noticed that it takes ~9 s to load the model into memory, whereas prediction (the part we actually care about) takes only ~3 s for a 512x512 image.
The ideal runtime environment is serverless, so we don't have long-lived processes that could load the model once and then predict multiple samples across multiple jobs. Instead, each task instance must load the model before doing any work. It hurts when loading the model takes 3x as long as the actual work, and it makes scaling horizontally with one short-lived compute node per prediction inefficient.
My local machine (a MacBook Pro with an M3 Max) took ~12 s to load the model, the slowest step in the entire preprocess → predict → postprocess pipeline.
I was curious why it took so long to load the model into memory. It's "only" ~100 MB on disk.
I came across *TensorFlow Performance: Loading Models* by Libor Vanek, which compares load times across model formats. Here's the punchline:
I was intrigued 🤞🏻: could we get similar speed-ups just by changing the format?
Yes:
Environment | SavedModel | HDF5 | Diff |
---|---|---|---|
MacBook Pro (M3 Max) | 12.3 s | 0.84 s | -11.46 s (-93%) |
n1-standard-8 w/ 1 T4 GPU | 8.99 s | 2.68 s | -6.31 s (-70%) |
n1-standard-32 w/ 1 T4 GPU | 8.21 s | 2.72 s | -5.49 s (-67%) |
Of note: loading the model into memory used to take ~3x as long as prediction. Now the two are roughly the same.
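For reference, load times like these can be measured with a simple `time.perf_counter` harness. This is a minimal sketch using a tiny stand-in model rather than the real DeepCell weights; with the actual model you would time `load_model` on its real path instead:

```python
import os
import tempfile
import time

import tensorflow as tf

# Build and save a tiny stand-in model in HDF5 format. With the real
# DeepCell model, skip this and point load_model at the actual file.
model = tf.keras.Sequential([tf.keras.Input(shape=(8,)), tf.keras.layers.Dense(4)])
path = os.path.join(tempfile.mkdtemp(), "model.h5")
model.save(path)

# Time only the load, not the save.
t0 = time.perf_counter()
reloaded = tf.keras.models.load_model(path)
elapsed = time.perf_counter() - t0
print(f"HDF5 load took {elapsed:.3f} s")
```

The same harness, pointed at a SavedModel directory versus an `.h5` file, produced the numbers in the table above.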
Converting the model was easy:
```python
import tensorflow as tf

# Load the SavedModel version
model = tf.keras.models.load_model(
    "/Users/davidhaley/.keras/models/MultiplexSegmentation"
)

# Save as HDF5 (the .h5 extension selects the HDF5 format)
model.save("MultiplexSegmentation-resaved-20240710.h5")
```
We needed to adjust one thing: the `load_model` call needs an additional parameter so it can locate custom objects:
```python
import tensorflow as tf
from deepcell.layers.location import Location2D

# [...]

model = tf.keras.models.load_model(
    model_path,
    custom_objects={"Location2D": Location2D},
)
```
We learned this by loading the HDF5 file without `custom_objects` and getting an error that `Location2D` couldn't be found.
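To illustrate the mechanism in a self-contained way, here's a toy custom layer (a stand-in for DeepCell's `Location2D`, not its actual implementation) being saved to HDF5 and reloaded via `custom_objects`:

```python
import os
import tempfile

import tensorflow as tf

# Minimal custom layer: scales its input by a constant factor.
# get_config() is what lets Keras serialize the layer into the HDF5 file.
class Scale(tf.keras.layers.Layer):
    def __init__(self, factor=2.0, **kwargs):
        super().__init__(**kwargs)
        self.factor = factor

    def call(self, inputs):
        return inputs * self.factor

    def get_config(self):
        return {**super().get_config(), "factor": self.factor}

model = tf.keras.Sequential([tf.keras.Input(shape=(3,)), Scale(factor=3.0)])
path = os.path.join(tempfile.mkdtemp(), "custom.h5")
model.save(path)

# Without custom_objects, load_model raises an "Unknown layer"-style error;
# with it, Keras knows how to reconstruct Scale from its saved config.
reloaded = tf.keras.models.load_model(path, custom_objects={"Scale": Scale})
```

The mapping key (`"Scale"` here, `"Location2D"` in our case) must match the class name recorded in the saved config.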
This is the only caveat we've found with the HDF5 format: you need to tell `load_model` where to find the custom objects. The prediction results appear to be the same.
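One way to sanity-check that the resaved model behaves identically is to compare predictions with `np.allclose`. Sketched here with a stand-in model; with the real DeepCell model you would load the SavedModel and HDF5 copies (passing `custom_objects` as above) and compare those:

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

# Stand-in for the original model; substitute the real SavedModel here.
original = tf.keras.Sequential([tf.keras.Input(shape=(8,)), tf.keras.layers.Dense(4)])
path = os.path.join(tempfile.mkdtemp(), "resaved.h5")
original.save(path)
resaved = tf.keras.models.load_model(path)

# Same input through both models; weights round-tripped through HDF5
# should give (numerically) identical outputs.
x = np.random.rand(2, 8).astype("float32")
same = np.allclose(original.predict(x, verbose=0), resaved.predict(x, verbose=0))
print("identical predictions:", same)
```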
A 70% reduction in load time, just by using a different file format!