ONNX is an open format from the Linux Foundation for representing machine learning models. It is widely adopted by the machine learning community and supported by most machine learning frameworks, including PyTorch and TensorFlow. Converting a model from any of those frameworks to ONNX is straightforward and, in most cases, takes a single command.
Pipeless, on the other hand, is an open-source computer vision framework that lets you analyze and modify video streams in real time with minimal effort. The experience is similar to building a serverless web application: you worry only about your specific function and leave the rest to the framework.
Until now, running inference in Pipeless meant importing the model's framework (PyTorch, for example), loading the model with it, and calling the appropriate run function on the model. That not only produced heavy applications; you also had to implement the logic and configuration to run inference on GPUs (if you have ever fought with that, you will probably agree it is not a pleasant job).
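For context, the framework-based approach looked roughly like the sketch below (with a stand-in PyTorch model; a real application would also have to ship the full framework and its CUDA dependencies):

```python
import torch

# Device selection and placement were the application's responsibility.
device = "cuda" if torch.cuda.is_available() else "cpu"

model = torch.nn.Linear(4, 2)  # stand-in for a real detection model
model.to(device).eval()

# Stand-in for a pre-processed video frame tensor.
frame_features = torch.rand(1, 4, device=device)

with torch.no_grad():
    output = model(frame_features)
```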
Adopting ONNX as the standard model representation format for Pipeless is therefore very convenient: by embedding the ONNX Runtime within Pipeless, running inference becomes trivial.
With the latest release, the Pipeless worker ships with the ONNX Runtime built in. You just provide a model file, or a URL where the model file is hosted, and Pipeless automatically loads it into the ONNX Runtime and runs inference for every frame of the video stream out of the box.

But there is more! By inspecting the model's input format, Pipeless also handles the most common pre-processing tasks automatically, such as reshaping images or transposing image dimensions. Thanks to that, you can write less pre-processing code, or none at all. And since it is common practice to pre-process data with a second model whose output is then chained into the actual inference model, Pipeless also lets you specify a second model for pre-processing and handles the model chaining automatically, including migrating model versions if required. Finally, if those options are still not enough for your use case, you can implement additional pre- and post-processing logic in Python, or combine all of the options above.
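To make the automatic pre-processing concrete, here is a rough NumPy sketch of the kind of transformation derived from a model's input shape. The function name and the `(1, 3, 640, 640)` shape are illustrative, and resizing is omitted for brevity:

```python
import numpy as np

def preprocess(frame: np.ndarray,
               input_shape=(1, 3, 640, 640)) -> np.ndarray:
    """Illustrative sketch: HWC uint8 frame -> NCHW float32 tensor.

    A real implementation would also resize the frame to match
    the height and width declared in input_shape.
    """
    tensor = frame.astype(np.float32) / 255.0  # scale pixels to [0, 1]
    tensor = np.transpose(tensor, (2, 0, 1))   # HWC -> CHW
    return np.expand_dims(tensor, 0)           # add batch dim -> NCHW

frame = np.zeros((640, 640, 3), dtype=np.uint8)  # stand-in video frame
model_input = preprocess(frame)
```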
I created an example application for object detection that makes use of everything described above, and you can run it in just a minute. To keep this post short, you can find the instructions to run it here. Hope you enjoy it!