Marius Muntean

Operationalize TensorFlow Models With ML.NET

Let’s have a look at how to use a pre-trained TensorFlow model with ML.NET to make landmark predictions.

Get the Model

First we need to pick a pre-trained model. There are multiple good sources for pre-trained models, like Hugging Face and tfhub. We’re going to use a model from tfhub that predicts North American landmarks from images.

Transform and Optimize the Model

In theory we could start using the model as it is, but we can do better. The TensorFlow model isn’t equally easy to run on every platform (Linux, macOS, Windows) and on every CPU architecture (ARM64, for example).

ONNX is a format for representing machine learning models in a portable way. Additionally, ONNX models can be easily optimized and thus become smaller and faster.

The easiest way to convert the downloaded TensorFlow model to an ONNX model is to use the tf2onnx tool.

Follow the instructions to install it (use a dev container with Python 3.10 to keep your machine clean) and then run this command:

python -m tf2onnx.convert --opset 16 --tflite lite-model_on_device_vision_classifier_landmarks_classifier_north_america_V1_1.tflite --output lite-model_on_device_vision_classifier_landmarks_classifier_north_america_V1_1.onnx

You should now have a file called lite-model_on_device_vision_classifier_landmarks_classifier_north_america_V1_1.onnx that contains the optimized ONNX model. The file size shrank from 50.9MB to 42.7MB. Nice!

If you happen to start with an ONNX model that you still want to optimize, you can use the official ONNX optimizer tool.

Make the ONNX Model Available to ML.NET

In this step we’re telling ML.NET what the inputs and outputs of the model are and we’re packaging the model in a way that ML.NET can work with it.

Inputs and Outputs

The documentation already tells us that the “Inputs are expected to be 3-channel RGB color images of size 321 x 321, scaled to [0, 1]”, and that the output is “A vector of 99424 similarity scores”.

We need to find out the exact input and output tensor names. A tool like Netron makes this super easy. Open the original .tflite and/or the ONNX model in Netron and click the Model Properties button in the lower left corner.

Netron shows the model inputs and outputs

For our model the input is called uint8_image_input and the output is called transpose_1. We’ll make a note of those.

Package Model for ML.NET

Create a new console application, add these package references and restore them:

<PackageReference Include="Microsoft.ML" Version="2.0.1" />
<PackageReference Include="Microsoft.ML.ImageAnalytics" Version="2.0.1" />
<PackageReference Include="Microsoft.ML.OnnxRuntime" Version="1.15.1" />
<PackageReference Include="Microsoft.ML.OnnxTransformer" Version="2.0.1" />

Next, to make our life easier and the code tidier, let’s define a few constants and types for the model input and output.

using Microsoft.ML.Data;
using Microsoft.ML.Transforms.Image;

public static class LandmarkModelSettings
{
    public const string OnnxModelName = "lite-model_on_device_vision_classifier_landmarks_classifier_north_america_V1_1.onnx";
    public const string Input = "uint8_image_input";
    public const string Output = "transpose_1";

    public const string MlNetModelFileName = "";
    public const string LabelFileName = "landmarks_classifier_north_america_V1_label_map.csv";
}

public class LandmarkInput
{
    public const int ImageWidth = 321;
    public const int ImageHeight = 321;

    public LandmarkInput(Stream imagesStream)
    {
        Image = MLImage.CreateFromStream(imagesStream);
    }

    [ImageType(width: ImageWidth, height: ImageHeight)]
    public MLImage Image { get; }
}

public class LandmarkOutput
{
    // Bind the property to the model's output column.
    [ColumnName(LandmarkModelSettings.Output)]
    public float[] Prediction { get; set; }
}

The code is relatively self-explanatory. What I’d like to point out is that the tfhub page we downloaded the model from also offers a .csv file with the labels for each prediction result. I saved it locally under the name from the constant LabelFileName.

Now we’re ready to describe the inputs and outputs to ML.NET. We’re also loading the ONNX model and saving it in ML.NET’s own format:

// Configure ML model
var mlCtx = new MLContext();

var pipeline = mlCtx
    // Adjust the image to the required model input size
    .Transforms.ResizeImages(
        inputColumnName: nameof(LandmarkInput.Image),
        imageWidth: LandmarkInput.ImageWidth,
        imageHeight: LandmarkInput.ImageHeight,
        outputColumnName: "resized")
    // Extract the pixels from the image as a 1D byte array, keeping them in the same order as they appear in the image.
    .Append(mlCtx.Transforms.ExtractPixels(
        inputColumnName: "resized",
        interleavePixelColors: true,
        outputAsFloatArray: false,
        outputColumnName: LandmarkModelSettings.Input))
    // Perform the estimation
    .Append(mlCtx.Transforms.ApplyOnnxModel(
        modelFile: "./" + LandmarkModelSettings.OnnxModelName,
        inputColumnName: LandmarkModelSettings.Input,
        outputColumnName: LandmarkModelSettings.Output));

// Save ml model (fitting on an empty data set is enough because nothing is trained here)
var transformer = pipeline.Fit(mlCtx.Data.LoadFromEnumerable(new List<LandmarkInput>()));

mlCtx.Model.Save(transformer, null, LandmarkModelSettings.MlNetModelFileName);

Let’s walk through the code.

First we’re telling ML.NET to resize any image it receives to the size that the downloaded model expects, in this case 321x321 pixels. The resized image is placed in the “resized” column. From the “resized” column we’re extracting the image pixels into a 1D byte array (outputAsFloatArray is false) and writing that data into the column uint8_image_input, because that’s the input name the model expects. In the last step we’re invoking the model to make the prediction, which ends up in the column transpose_1.
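To make the pixel extraction step a bit more concrete, here is a minimal sketch of what “interleaved” means. It doesn’t call ML.NET at all: the Interleave and Planar helpers and the tiny two-pixel image are made up for illustration, and the assumption is that interleaving keeps the R, G and B values of each pixel together, while the planar form stores all R values, then all G values, then all B values.

```csharp
using System;
using System.Linq;

// Hypothetical helper: flattens pixels into R0, G0, B0, R1, G1, B1, ...
byte[] Interleave((byte R, byte G, byte B)[] pixels) =>
    pixels.SelectMany(p => new[] { p.R, p.G, p.B }).ToArray();

// Hypothetical helper: the planar alternative, R0, R1, ..., G0, G1, ..., B0, B1, ...
byte[] Planar((byte R, byte G, byte B)[] pixels) =>
    pixels.Select(p => p.R)
        .Concat(pixels.Select(p => p.G))
        .Concat(pixels.Select(p => p.B))
        .ToArray();

// A made-up image with two pixels.
var image = new (byte R, byte G, byte B)[] { (1, 2, 3), (4, 5, 6) };

Console.WriteLine(string.Join(",", Interleave(image))); // 1,2,3,4,5,6
Console.WriteLine(string.Join(",", Planar(image)));     // 1,4,2,5,3,6
```

With interleavePixelColors set to true, the pipeline hands the model the first kind of layout.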

Finally, the model is saved under the name from the constant MlNetModelFileName. The file shrank even more, to 39.6MB.

Load the ML.NET Model and Make a Prediction

Before we continue, we should make sure that the ML.NET model actually works as expected. For this we’re loading the model from the saved ML.NET model file and feeding it a .jpg file with the Statue of Liberty. The prediction should contain multiple entries, but the one with the highest probability should be our Statue of Liberty.

// Load ml model
var mlCtx2 = new MLContext();
var loadedModel = mlCtx2.Model.Load(LandmarkModelSettings.MlNetModelFileName, out var _);
var predictionEngine = mlCtx2.Model.CreatePredictionEngine<LandmarkInput, LandmarkOutput>(loadedModel);

// Predict (requires using System.Diagnostics for Stopwatch)
var sw = Stopwatch.StartNew();
await using var imagesStream = File.Open("Landmarks/Statue_of_Liberty_7.jpg", FileMode.Open);
var prediction = predictionEngine.Predict(new LandmarkInput(imagesStream));
Console.WriteLine($"Prediction took: {sw.ElapsedMilliseconds}ms");

// Labels start from the second line and each contains the 0-based index, a comma and a name.
var labels = (await File.ReadAllLinesAsync(LandmarkModelSettings.LabelFileName))
    .Skip(1)
    .Select(line => line.Split(",").Last())
    .ToArray();

// Merge the prediction array with the labels. Produce tuples of landmark name and its probability.
var predictions = prediction.Prediction
    .Select((val, index) => (index, probability: val))
    .Where(pair => pair.probability > 0.55f)
    .Select(pair => (name: labels[pair.index], pair.probability))
    // The output contains duplicates, so keep only the highest probability per landmark.
    .GroupBy(pair => pair.name)
    .Select(group => (name: group.Key, probability: group.Max(p => p.probability)))
    .OrderByDescending(pair => pair.probability)
    .ToList();

// Output
var predictionsString = string.Join(Environment.NewLine, predictions.Select(pair => $"name: {pair.name}, probability: {pair.probability}"));
Console.WriteLine(predictionsString);

As you might have noticed, I had a picture named Statue_of_Liberty_7.jpg in my Landmarks folder.

What’s particular about this model is that the prediction contains duplicates, i.e. the output array of floats contains multiple entries for the same landmark. The tfhub documentation page says to use, for each landmark, the entry with the highest probability. Depending on the model that you chose, you might not need to do this, and simply assigning a label to each position of the output might be enough.
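Here is a small, self-contained sketch of that de-duplication step on its own. The scores and labels are made up, and the real model emits far more entries, but the grouping logic is the same idea: pair each score with the label at the same position, then keep the best score per label.

```csharp
using System;
using System.Linq;

// Made-up model output and labels; the label at index i belongs to the score at index i.
float[] scores = { 0.10f, 0.92f, 0.80f, 0.61f };
string[] labels = { "Liberty Island", "Liberty Island", "New York Harbor", "Liberty Island" };

// Pair scores with labels, then keep only the highest score for each landmark.
var best = scores
    .Zip(labels, (score, label) => (label, score))
    .GroupBy(pair => pair.label)
    .Select(group => (name: group.Key, probability: group.Max(pair => pair.score)))
    .OrderByDescending(pair => pair.probability)
    .ToList();

foreach (var (name, probability) in best)
{
    Console.WriteLine($"{name}: {probability}");
}
```

The three “Liberty Island” entries collapse into a single one, so the list ends up with one line per landmark, best match first.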

On my M1 Pro MacBook Pro the output looks like this:

Prediction took: 108ms
name: Liberty Island, probability: 0,9176943
name: New York Harbor, probability: 0,798547
name: Liberty State Park, probability: 0,7981717
name: The Terminal Tower Residences, probability: 0,6972256

Congrats 🎉, you’re ready to use the model! Read on if you want to integrate it in an ASP.NET Core application.

Expose it as a Web API

Create a new ASP.NET Core Web API project and add the following package references:

<PackageReference Include="Microsoft.AspNetCore.OpenApi" Version="7.0.9"/>
<PackageReference Include="Microsoft.Extensions.ML" Version="2.0.1" />
<PackageReference Include="Microsoft.ML" Version="2.0.1" />
<PackageReference Include="Microsoft.ML.OnnxRuntime" Version="1.15.1" />
<PackageReference Include="Microsoft.ML.OnnxTransformer" Version="2.0.1" />

The last two packages (Microsoft.ML.OnnxRuntime and Microsoft.ML.OnnxTransformer) are necessary only at runtime. If you skip them, you’ll get errors like these:

System.Reflection.TargetInvocationException: Exception has been thrown by the target of an invocation.
 ---> System.IO.FileNotFoundException: Could not load file or assembly 'Microsoft.ML.OnnxTransformer, Version=, Culture=neutral, PublicKeyToken=cc7b13ffcd2ddd51'. The system cannot find the file specified.
File name: 'Microsoft.ML.OnnxTransformer, Version=, Culture=neutral, PublicKeyToken=cc7b13ffcd2ddd51'

An unhandled exception has occurred while executing the request.
System.Reflection.TargetInvocationException: Exception has been thrown by the target of an invocation.
 ---> System.Reflection.TargetInvocationException: Exception has been thrown by the target of an invocation.
 ---> System.TypeInitializationException: The type initializer for 'Microsoft.ML.OnnxRuntime.NativeMethods' threw an exception.
 ---> System.DllNotFoundException: Unable to load shared library 'onnxruntime' or one of its dependencies. In order to help diagnose loading problems, consider setting the DYLD_PRINT_LIBRARIES environment variable:

You might expect that loading the model looks exactly like it did before, when we made our first prediction, but Microsoft recommends something different. Since the PredictionEngine isn’t thread-safe and is expensive to create, we should use a PredictionEnginePool (see the official docs).
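If the idea of pooling is new to you, the following sketch shows the general pattern in plain C#. It is not how PredictionEnginePool is implemented, just an illustration under simple assumptions: a StringBuilder stands in for the expensive, non-thread-safe object, and the Rent, Return and Handle helpers are hypothetical names.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

// StringBuilder stands in for an expensive, non-thread-safe object such as a PredictionEngine.
var created = 0;
var pool = new ConcurrentBag<StringBuilder>();

// Take an idle instance from the pool, or create one if none is available.
StringBuilder Rent()
{
    if (pool.TryTake(out var item))
    {
        return item;
    }
    Interlocked.Increment(ref created);
    return new StringBuilder();
}

// Reset the instance and hand it back for the next caller.
void Return(StringBuilder item)
{
    item.Clear();
    pool.Add(item);
}

// Each "request" gets exclusive use of one instance for the duration of the call.
string Handle(int requestId)
{
    var builder = Rent();
    try
    {
        return builder.Append("request ").Append(requestId).ToString();
    }
    finally
    {
        Return(builder);
    }
}

var results = await Task.WhenAll(
    Enumerable.Range(0, 100).Select(id => Task.Run(() => Handle(id))));

Console.WriteLine($"Handled {results.Length} requests with {created} instances.");
```

No instance is ever shared between two concurrent callers, and instances are reused across requests instead of being created for every call.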

Make sure the constants and the input and output types are available, then add this to Program.cs:

builder.Services
    .AddPredictionEnginePool<LandmarkInput, LandmarkOutput>()
    .FromFile(LandmarkModelSettings.MlNetModelFileName);

This adds a PredictionEnginePool for our input and output types, backed by the ML.NET model from the specified path.

You can organize your code however suits your project best, but I added a dedicated singleton service that loads the labels and sorts them:

internal class NorthAmericanLabelProvider : INorthAmericanLabelProvider
{
    private Lazy<string[]>? _lazyLabels;

    public string[] GetLabels()
    {
        _lazyLabels ??= new Lazy<string[]>(() =>
        {
            // Assumption: the label file is copied to the application's output directory.
            var labelFilePath = Path.Combine(AppContext.BaseDirectory, LandmarkModelSettings.LabelFileName);
            var labelLines = File.ReadAllLines(labelFilePath);
            // Skip the header line, then order by index so positions match the model output.
            return labelLines.Skip(1)
                .Select(line => line.Split(","))
                .Select(lineTokens => (Index: int.Parse(lineTokens[0]), LandmarkName: lineTokens[1]))
                .OrderBy(tuple => tuple.Index)
                .Select(tuple => tuple.LandmarkName)
                .ToArray();
        });

        return _lazyLabels.Value;
    }
}

It reads the content of the file, then parses and sorts it only once, using Lazy<T>.
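To see the “only once” part in action, here is a minimal sketch that counts how often the parsing code runs. The CSV lines, including the header, are made up and stand in for the real label file.

```csharp
using System;
using System.Linq;
using System.Threading;

var parseCount = 0;

// Made-up CSV content: a header line, then "index,name" rows in arbitrary order.
string[] csvLines = { "id,name", "1,New York Harbor", "0,Liberty Island" };

var lazyLabels = new Lazy<string[]>(() =>
{
    // Count how many times the factory actually runs.
    Interlocked.Increment(ref parseCount);
    return csvLines.Skip(1)
        .Select(line => line.Split(','))
        .Select(tokens => (Index: int.Parse(tokens[0]), Name: tokens[1]))
        .OrderBy(tuple => tuple.Index)
        .Select(tuple => tuple.Name)
        .ToArray();
});

// Both reads return the same cached array; the factory runs on the first access only.
var first = lazyLabels.Value;
var second = lazyLabels.Value;

Console.WriteLine(string.Join(", ", first)); // Liberty Island, New York Harbor
Console.WriteLine(parseCount);               // 1
```

By default Lazy<T> also makes sure that the factory runs only once when several threads ask for the value at the same time, which is handy for a singleton service.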

Next, I created a service called NorthAmericanLandmarkPredictor that does the actual prediction. It makes use of the PredictionEnginePool that we registered earlier and of the INorthAmericanLabelProvider:

internal class NorthAmericanLandmarkPredictor : INorthAmericanLandmarkPredictor
{
    private readonly PredictionEnginePool<LandmarkInput, LandmarkOutput> _predictionEnginePool;
    private readonly INorthAmericanLabelProvider _northAmericanLabelProvider;

    public NorthAmericanLandmarkPredictor(PredictionEnginePool<LandmarkInput, LandmarkOutput> predictionEnginePool, INorthAmericanLabelProvider northAmericanLabelProvider)
    {
        _predictionEnginePool = predictionEnginePool;
        _northAmericanLabelProvider = northAmericanLabelProvider;
    }

    public List<LandmarkPrediction> PredictLandmark(Stream imageStream)
    {
        var labels = _northAmericanLabelProvider.GetLabels();

        // Make prediction
        var landmarkOutput = _predictionEnginePool.Predict(new LandmarkInput(imageStream));

        // Post process prediction - the output contains duplicates, so we should group by label and take the entry with the highest probability.
        return landmarkOutput.Prediction
            .Zip(labels, (probability, landmarkName) => (LandmarkName: landmarkName, Probability: probability))
            .GroupBy(tuple => tuple.LandmarkName)
            .Select(group => new LandmarkPrediction(
                group.Key,
                group.MaxBy(tuple => tuple.Probability).Probability))
            .OrderByDescending(prediction => prediction.Probability)
            .ToList();
    }
}

Finally, in a controller we can inject the INorthAmericanLandmarkPredictor and make predictions from uploaded images:

// Routing attributes on the controller and the action are not shown here.
public class LandmarkPredictionController : ControllerBase
{
    private readonly INorthAmericanLandmarkPredictor _northAmericanLandmarkPredictor;

    public LandmarkPredictionController(INorthAmericanLandmarkPredictor northAmericanLandmarkPredictor)
    {
        _northAmericanLandmarkPredictor = northAmericanLandmarkPredictor;
    }

    public List<LandmarkPrediction> Get(IFormFile image)
    {
        // Dispose the upload stream once the prediction is done.
        using var imageStream = image.OpenReadStream();
        return _northAmericanLandmarkPredictor.PredictLandmark(imageStream);
    }
}

In the built-in Swagger UI, the result looks like this:

Statue of Liberty Prediction Results

That’s all folks!

Now you’re ready to operationalize many different ML models with ML.NET and expose them in a nice Web API.

You can find the whole source code at

It is under the MIT license, so you’re free to use it to your heart’s content.

Hit me up on Twitter/X if you’d like to buy me a coffee 😁
