David Haley

Improve TensorFlow model load time by ~70% using HDF5 instead of SavedModel

In our ongoing work running DeepCell on Google Batch, we noticed that loading the model into memory takes ~9 s, whereas prediction (the reason we load the model in the first place) takes ~3 s for a 512x512 image.

The ideal runtime environment is serverless, so we don't have long-lived processes that load the model once and then predict many samples across multiple jobs. Instead, each task instance needs to load the model before doing any work. So it hurts when loading the model takes 3x as long as the actual work. It makes scaling horizontally, with one short-lived compute node per prediction, particularly inefficient.

My local machine (a MacBook Pro with an M3 Max) took ~12 s to load the model, which made loading the slowest part of the entire preprocess → predict → postprocess pipeline.

I was curious why it took so long to load the model into memory. It's "only" ~100 MB on disk.

I came across TensorFlow Performance: Loading Models by Libor Vanek. It compares the load times for different formats. Here's the punchline:

[Chart: load times for SavedModel vs HDF5, showing a drop from ~10 s to ~2 s]

I was intrigued 🤞🏻: could we get similar speed-ups just by changing the format?

Yes:

| Environment | SavedModel | HDF5 | Diff |
| --- | --- | --- | --- |
| MacBook Pro M3 Max | 12.3 s | 0.84 s | -11.46 s (-93%) |
| n1-standard-8 w/ 1 T4 GPU | 8.99 s | 2.68 s | -6.31 s (-70%) |
| n1-standard-32 w/ 1 T4 GPU | 8.21 s | 2.72 s | -5.49 s (-67%) |

Of note, loading the model into memory used to take ~3x the time of prediction. Now, it's roughly the same.
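
For anyone who wants to take similar measurements, here is a minimal sketch of one way to time a model load and compute the relative change. The helper names (`timed`, `percent_change`, `time_model_load`) are made up for illustration, and this is not necessarily how the numbers in the table were collected. TensorFlow is imported before the timer starts so that import time isn't counted as load time.

```python
import time
from typing import Callable, Optional, Tuple, TypeVar

T = TypeVar("T")


def timed(fn: Callable[[], T]) -> Tuple[T, float]:
    """Run fn once and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


def percent_change(before_s: float, after_s: float) -> float:
    """Relative change from before_s to after_s in percent (negative = faster)."""
    return (after_s - before_s) / before_s * 100.0


def time_model_load(path: str, custom_objects: Optional[dict] = None) -> float:
    """Seconds spent inside tf.keras.models.load_model for the given path."""
    # Hypothetical helper: TensorFlow is imported here, outside the timed
    # region, so the helpers above stay usable without TensorFlow installed.
    import tensorflow as tf

    _, elapsed = timed(
        lambda: tf.keras.models.load_model(path, custom_objects=custom_objects)
    )
    return elapsed
```

With the T4 numbers from the table, `percent_change(8.99, 2.68)` comes out at roughly -70. A single run is noisy (disk cache, GPU initialization), so repeating the measurement a few times in fresh processes is probably a good idea.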

Converting the model was easy:

import tensorflow as tf

# Load the SavedModel version
model = tf.keras.models.load_model("/Users/davidhaley/.keras/models/MultiplexSegmentation")
# Save as HDF5 (the .h5 extension selects the HDF5 format)
model.save("MultiplexSegmentation-resaved-20240710.h5")

We needed to adjust one thing: the load_model call needs an additional parameter, custom_objects, so that it can locate the model's custom objects (here, DeepCell's Location2D layer):

import tensorflow as tf
from deepcell.layers.location import Location2D

# [...]

model = tf.keras.models.load_model(
    model_path,
    custom_objects={"Location2D": Location2D},
)

We learned this by loading the HDF5 file without custom_objects and getting an error that Location2D wasn't found.
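
If a model ends up needing more than one custom class, the mapping can be built from the classes themselves. This is only a sketch: `custom_objects_by_name` and `load_h5_model` are hypothetical helpers, and it assumes each class was saved under its own class name, as Location2D was here.

```python
from typing import Any, Dict


def custom_objects_by_name(*classes: type) -> Dict[str, type]:
    """Map each class to its __name__, the shape load_model's custom_objects expects."""
    return {cls.__name__: cls for cls in classes}


def load_h5_model(model_path: str, *custom_classes: type) -> Any:
    """Load a Keras HDF5 model, registering the given custom classes by name."""
    # Hypothetical wrapper; TensorFlow is imported lazily so the mapping
    # helper above can be used (and tested) without TensorFlow installed.
    import tensorflow as tf

    return tf.keras.models.load_model(
        model_path,
        custom_objects=custom_objects_by_name(*custom_classes),
    )
```

Usage would then look like `model = load_h5_model(model_path, Location2D)`.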

This is the only caveat we've found with the HDF5 format: needing to tell it where to find the custom objects. The prediction results appear to be the same.
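
To make "appear to be the same" more concrete, the outputs of the two loaded models can be compared numerically on the same input. Below is a rough sketch, assuming NumPy is available; `outputs_match` is a made-up helper and the tolerances are arbitrary starting points, not values we validated.

```python
import numpy as np


def as_list(outputs):
    """Keras predict() may return one array or a list of arrays; normalize to a list."""
    return list(outputs) if isinstance(outputs, (list, tuple)) else [outputs]


def outputs_match(reference, candidate, rtol=1e-5, atol=1e-6):
    """True if both predictions have the same structure and near-equal values."""
    ref, cand = as_list(reference), as_list(candidate)
    if len(ref) != len(cand):
        return False
    # Check shapes first so np.allclose never tries to broadcast mismatched outputs.
    return all(
        np.shape(r) == np.shape(c) and np.allclose(r, c, rtol=rtol, atol=atol)
        for r, c in zip(ref, cand)
    )
```

For example, with both versions loaded: `outputs_match(saved_model.predict(batch), h5_model.predict(batch))`, where `saved_model`, `h5_model`, and `batch` are placeholder names for the two loaded models and a shared input array.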

A ~70% faster model load, just by using a different file format!
