DEV Community

wjiuhe
Serve machine learning models with Pinferencia

You have perhaps already heard of, or even tried, TorchServe, Triton, Seldon Core, TF Serving, or KServe. They are good products. However, if your model is not very simple, or if the model is only one part of a larger codebase, integrating your code with them is not that easy.

Here you have an alternative: Pinferencia. (For more tutorials, visit https://pinferencia.underneathall.app/)

GitHub: Pinferencia. If you like it, give it a star.

Install

pip install "pinferencia[uvicorn]"

Quick Start

Serve Any Model

app.py

from pinferencia import Server


class MyModel:
    def predict(self, data):
        return sum(data)


model = MyModel()

service = Server()
service.register(
    model_name="mymodel",
    model=model,
    entrypoint="predict",
)

Just run:

uvicorn app:service --reload

Hooray, your service is alive. Go to http://127.0.0.1:8000/ and have fun.
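Once the server is alive, you can also call the model from code instead of the browser. Below is a minimal sketch using the requests library; the /v1/models/{name}/predict path and the {"data": ...} payload follow Pinferencia's default REST API, but check the generated API documentation page if your version differs.

```python
import requests


def predict(data, base_url="http://127.0.0.1:8000"):
    """Send data to the registered "mymodel" endpoint and return the prediction."""
    response = requests.post(
        f"{base_url}/v1/models/mymodel/predict",
        json={"data": data},
    )
    response.raise_for_status()
    # The JSON response carries the prediction under the "data" key.
    return response.json()["data"]
```

With the quick-start model above, predict([1, 2, 3]) would return 6, since MyModel.predict simply sums the list.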

You will get a full API documentation page to play with:

[Screenshot: the Swagger UI of the generated API]

You can test your model right there:

[Screenshot: the "Try it out" feature of the documentation page]

Any deep learning models? Just as easy. Simply train or load your model, register it with the service, and it goes live immediately.

PyTorch

import torch

from pinferencia import Server


# Train your model, or load an existing one using one of the options below.
model = "..."

# Option 1: load from a state_dict
model = TheModelClass(*args, **kwargs)
model.load_state_dict(torch.load(PATH))

# Option 2: load an entire pickled model
model = torch.load(PATH)

# Option 3: load a TorchScript model
model = torch.jit.load("model_scripted.pt")

model.eval()

service = Server()
service.register(
    model_name="mymodel",
    model=model,
)
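One practical note: the JSON payload arrives as plain Python lists, while a torch module expects tensors. A small wrapper class can do the conversion in its predict entrypoint. This is a sketch, not part of Pinferencia itself, and TorchWrapper is a hypothetical name:

```python
import torch


class TorchWrapper:
    """Adapt a torch module so its entrypoint accepts plain JSON lists."""

    def __init__(self, model):
        self.model = model
        self.model.eval()  # switch to inference mode

    def predict(self, data):
        with torch.no_grad():
            tensor = torch.tensor(data, dtype=torch.float32)
            # Convert the output back to a JSON-serializable list.
            return self.model(tensor).tolist()
```

You would then register the wrapper instead of the raw module, e.g. service.register(model_name="mymodel", model=TorchWrapper(model)).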

TensorFlow

import tensorflow as tf

from pinferencia import Server


# Train your model, or load an existing one using one of the options below.
model = "..."

# Option 1: load a SavedModel
model = tf.keras.models.load_model("saved_model/model")

# Option 2: load an HDF5 model
model = tf.keras.models.load_model("model.h5")

# Option 3: recreate the model and load its weights
model = create_model()
model.load_weights("./checkpoints/my_checkpoint")
loss, acc = model.evaluate(test_images, test_labels, verbose=2)

service = Server()
service.register(
    model_name="mymodel",
    model=model,
    entrypoint="predict",
)

Any model from any framework works the same way. Now run uvicorn app:service --reload and enjoy!
