In our ongoing work running DeepCell on Google Batch, we noticed that it takes ~9 s to load the model into memory, whereas prediction (the part we actually care about) takes only ~3 s for a 512x512 image.
The ideal runtime environment is serverless, so we don't have long-lived processes that could load the model once and then predict multiple samples across multiple jobs. Instead, each task instance must load the model before doing any work. It hurts when loading the model takes 3x as long as the actual work, and it makes scaling horizontally with one short-lived compute node per prediction inefficient.
My local machine (a MacBook Pro with an M3 Max) took ~12 s to load the model, the slowest step in the entire preprocess → predict → postprocess pipeline.
I was curious why it took so long to load the model into memory. It's "only" ~100 MB on disk.
I came across *TensorFlow Performance: Loading Models* by Libor Vanek, which compares load times across model formats. Here's the punchline:
I was intrigued 🤞🏻: could we get similar speed-ups just by changing the format?
Yes:
Environment | SavedModel | HDF5 | Diff |
---|---|---|---|
MacBook Pro (M3 Max) | 12.3 s | 0.84 s | -11.46 s (-93%) |
n1-standard-8 w/ 1 T4 GPU | 8.99 s | 2.68 s | -6.31 s (-70%) |
n1-standard-32 w/ 1 T4 GPU | 8.21 s | 2.72 s | -5.49 s (-67%) |
Of note: loading the model into memory used to take ~3x as long as prediction. Now the two are roughly the same.
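For reference, load times like these can be measured with a simple `time.perf_counter` harness. This is a minimal sketch using a tiny stand-in model rather than the real DeepCell weights; with the actual model you would time `load_model` on its real path instead:

```python
import os
import tempfile
import time

import tensorflow as tf

# Build and save a tiny stand-in model in HDF5 format. With the real
# DeepCell model, skip this and point load_model at the actual file.
model = tf.keras.Sequential([tf.keras.Input(shape=(8,)), tf.keras.layers.Dense(4)])
path = os.path.join(tempfile.mkdtemp(), "model.h5")
model.save(path)

# Time only the load, not the save.
t0 = time.perf_counter()
reloaded = tf.keras.models.load_model(path)
elapsed = time.perf_counter() - t0
print(f"HDF5 load took {elapsed:.3f} s")
```

The same harness, pointed at a SavedModel directory versus an `.h5` file, produced the numbers in the table above.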
Converting the model was easy:
```python
import tensorflow as tf

# Load the SavedModel version
model = tf.keras.models.load_model(
    "/Users/davidhaley/.keras/models/MultiplexSegmentation"
)

# Save as HDF5 (the .h5 extension selects the HDF5 format)
model.save("MultiplexSegmentation-resaved-20240710.h5")
```
We needed to adjust one thing: the `load_model` call needs an additional parameter so it can locate custom objects:
```python
import tensorflow as tf
from deepcell.layers.location import Location2D

# [...]

model = tf.keras.models.load_model(
    model_path,
    custom_objects={"Location2D": Location2D},
)
```
We learned this by loading the HDF5 file without `custom_objects` and getting an error that `Location2D` couldn't be found.
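To illustrate the mechanism in a self-contained way, here's a toy custom layer (a stand-in for DeepCell's `Location2D`, not its actual implementation) being saved to HDF5 and reloaded via `custom_objects`:

```python
import os
import tempfile

import tensorflow as tf

# Minimal custom layer: scales its input by a constant factor.
# get_config() is what lets Keras serialize the layer into the HDF5 file.
class Scale(tf.keras.layers.Layer):
    def __init__(self, factor=2.0, **kwargs):
        super().__init__(**kwargs)
        self.factor = factor

    def call(self, inputs):
        return inputs * self.factor

    def get_config(self):
        return {**super().get_config(), "factor": self.factor}

model = tf.keras.Sequential([tf.keras.Input(shape=(3,)), Scale(factor=3.0)])
path = os.path.join(tempfile.mkdtemp(), "custom.h5")
model.save(path)

# Without custom_objects, load_model raises an "Unknown layer"-style error;
# with it, Keras knows how to reconstruct Scale from its saved config.
reloaded = tf.keras.models.load_model(path, custom_objects={"Scale": Scale})
```

The mapping key (`"Scale"` here, `"Location2D"` in our case) must match the class name recorded in the saved config.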
This is the only caveat we've found with the HDF5 format: you need to tell `load_model` where to find the custom objects. The prediction results appear to be the same.
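One way to sanity-check that the resaved model behaves identically is to compare predictions with `np.allclose`. Sketched here with a stand-in model; with the real DeepCell model you would load the SavedModel and HDF5 copies (passing `custom_objects` as above) and compare those:

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

# Stand-in for the original model; substitute the real SavedModel here.
original = tf.keras.Sequential([tf.keras.Input(shape=(8,)), tf.keras.layers.Dense(4)])
path = os.path.join(tempfile.mkdtemp(), "resaved.h5")
original.save(path)
resaved = tf.keras.models.load_model(path)

# Same input through both models; weights round-tripped through HDF5
# should give (numerically) identical outputs.
x = np.random.rand(2, 8).astype("float32")
same = np.allclose(original.predict(x, verbose=0), resaved.predict(x, verbose=0))
print("identical predictions:", same)
```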
A 70% reduction in load time, just by using a different file format!