Bravin Wasike

Posted on May 19, 2023 • Edited on May 26, 2023 • Originally published at section.io

How to Build a Machine Learning App with FastAPI: Dockerize and Deploy the FastAPI Application to Kubernetes

#devops #kubernetes #machinelearning #fastapi

Prerequisites
Building the machine learning model
Introduction to the FastAPI
Dockerizing the FastAPI application
Deploying the FastAPI application to Kubernetes cluster
Conclusion
References

Prerequisites

You must have a good understanding of Python.
You must have an excellent working knowledge of machine learning models.
You must have Docker installed on your machine.
You must have Kubernetes (for example, Minikube) installed on your machine.
Know how to use Google Colab or Jupyter Notebook. In this tutorial, we shall use Google Colab in building our model.

Note: For you to follow along easily, use Google Colab. It's an easy-to-use platform to get started quickly while building models.

Building the machine learning model

You will build a machine learning model that will predict the nationality of individuals using their names. This is a simple model that will explain the key concepts used in machine learning modeling.

Dataset to be used

The dataset contains common names of people and their nationalities. A sample of the data is shown below:

CSV File of data

Installing the Python packages

You will use the following packages when building the model:

Pandas

Pandas is a software library written for the Python programming language for data manipulation and analysis.

It's a tool for reading and writing data between in-memory data structures and different file formats.

NumPy

NumPy is the fundamental package for scientific computing in Python. NumPy arrays facilitate advanced mathematical and other types of operations on large amounts of data.

Scikit-learn

Scikit-learn is an open-source software machine learning library for the Python programming language. It consists of various classification, regression, and clustering algorithms, including support vector machines, random forests, gradient boosting, k-means, and linear regression.

Run the following commands to install the packages:



pip install pandas
pip install numpy
pip install scikit-learn

Loading the Exploratory Data Analysis (EDA) Packages

These packages are used for Exploratory Data Analysis (EDA), which summarises the main characteristics of the data, often visually.

EDA helps you determine how best to manipulate data sources to get the answers you need. It makes it easier to discover patterns, spot anomalies, test hypotheses, and check assumptions.

pandas handles data manipulation and analysis.
NumPy provides the arrays and numerical operations used for scientific computing.



import pandas as pd
import numpy as np

Loading from Scikit-learn package

Scikit-learn will be the package used for predictive analysis since it contains different tools for machine learning modeling and various algorithms for classification, regression, and clustering.



from sklearn.naive_bayes import MultinomialNB
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

In the above code, we have imported the following:

MultinomialNB

This is the classifier class from scikit-learn's Naive Bayes module. You will use MultinomialNB to build the model. It is based on Bayes' theorem, is easy to build, and scales well to large datasets. Despite its simplicity, Naive Bayes can perform competitively with more sophisticated classification methods on some text tasks.

Naive Bayes classifiers are highly scalable: the number of parameters they learn grows linearly with the number of features. In our case, you use MultinomialNB since it's suitable for classification with discrete features such as word counts, which is the case for our model.

To further read about the Naive Bayes algorithm and how it's useful in performing classification click here.
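To make the idea behind multinomial Naive Bayes more concrete, here is a minimal from-scratch sketch using only the standard library. It is not scikit-learn's implementation: the function names, the toy spam/ham data, and the smoothing value alpha=1.0 are assumptions made for illustration. It scores each class as log P(class) plus the sum of log P(token | class), and picks the class with the highest score.

```python
import math
from collections import Counter, defaultdict


def train_multinomial_nb(samples, labels, alpha=1.0):
    """Count how often each token appears in each class (hypothetical helper)."""
    class_counts = Counter(labels)
    token_counts = defaultdict(Counter)
    vocabulary = set()
    for tokens, label in zip(samples, labels):
        token_counts[label].update(tokens)
        vocabulary.update(tokens)
    return {
        "class_counts": class_counts,
        "token_counts": token_counts,
        "vocabulary": vocabulary,
        "alpha": alpha,
        "total": len(labels),
    }


def predict_multinomial_nb(model, tokens):
    """Return the class with the highest log posterior for the given tokens."""
    vocab_size = len(model["vocabulary"])
    best_label, best_score = None, -math.inf
    for label, count in model["class_counts"].items():
        # Start from the log prior, log P(class)
        score = math.log(count / model["total"])
        class_total = sum(model["token_counts"][label].values())
        for token in tokens:
            if token not in model["vocabulary"]:
                continue  # ignore tokens never seen during training
            # log P(token | class) with add-alpha (Laplace) smoothing
            numerator = model["token_counts"][label][token] + model["alpha"]
            denominator = class_total + model["alpha"] * vocab_size
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy data, made up for illustration: each sample is a list of tokens
samples = [["free", "prize", "win"], ["meeting", "agenda"],
           ["win", "cash"], ["project", "meeting"]]
labels = ["spam", "ham", "spam", "ham"]

model = train_multinomial_nb(samples, labels)
prediction = predict_multinomial_nb(model, ["win", "prize"])
print(prediction)
```

The smoothing term keeps a token that never appeared in a class from driving that class's probability to zero.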

CountVectorizer

CountVectorizer converts a collection of text documents into a matrix of token counts. It learns a vocabulary from the dataset and transforms each name into a numeric vector that the model can use as input during the training phase. In other words, it extracts features from the dataset. Features are the inputs for the model.

For more details about CountVectorizer click here.
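The counting step can be sketched in plain Python as follows. This is a simplified illustration rather than CountVectorizer itself: the helper names are hypothetical, and the only detail borrowed from scikit-learn is its default token pattern (words of two or more characters, lowercased).

```python
import re
from collections import Counter

# Same default token pattern that CountVectorizer uses: words of 2+ characters
TOKEN_PATTERN = re.compile(r"(?u)\b\w\w+\b")


def tokenize(text):
    """Lowercase the text and split it into tokens."""
    return TOKEN_PATTERN.findall(text.lower())


def fit_vocabulary(documents):
    """Map each distinct token to a column index, in alphabetical order."""
    tokens = sorted({token for doc in documents for token in tokenize(doc)})
    return {token: index for index, token in enumerate(tokens)}


def transform(documents, vocabulary):
    """Turn each document into a row of token counts, one column per vocabulary entry."""
    rows = []
    for doc in documents:
        counts = Counter(tokenize(doc))
        rows.append([counts.get(token, 0) for token in vocabulary])
    return rows


# Names taken from the sample rows of the dataset
names = ["Louane", "Lucien", "Yamazaki"]
vocabulary = fit_vocabulary(names)
vectors = transform(["Lucien", "Zalman"], vocabulary)
print(vocabulary)
print(vectors)
```

A name that was not in the training vocabulary, such as "Zalman" here, becomes an all-zero row, so the model has no feature to work with for it.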

train_test_split

This is what is used in splitting the dataset. The dataset will be split into train_set and test_set.

accuracy_score

It measures the fraction of predictions that are correct, which you can express as a percentage, and is used to gauge the model's performance on the test set.

You will use the Naive Bayes classifier for the modeling. In this tutorial, you will choose it over other classification algorithms for the following reasons:

It's simple and easy to implement.
It often gives good accuracy on small text datasets like this one.
It is fast to train compared to many other algorithms.
It has few parameters to fit, which makes it less prone to memorizing the training data than more flexible models.

Other common algorithms used are as follows:

Loading our dataset

You use the pandas package to import our nationality.csv dataset. You also use pandas for data manipulation and data analysis.



df = pd.read_csv("nationality.csv")

Nature of our data

You need to understand the nature of the dataset. For example, you need to know the number of names in the dataset, the columns, and the rows present in the data.



df.shape

The output is as shown. This shows the size of our dataset.



(3238, 3)



df.head()

This shows that our dataset has a names column and a nationality column, plus an Unnamed: 0 column that simply repeats the row index.



   Unnamed: 0     names nationality
0           0    Louane      french
1           1    Lucien      french
2           2  Yamazaki    japanese
3           3    Zalman     yiddish
4           4    Zindel     yiddish



df.columns

The output will show the available columns in the dataset.



Index(['Unnamed: 0', 'names', 'nationality'], dtype='object')

All the nationalities available in the data



df['nationality'].unique()

The output gives an array of all the nationalities available in the dataset, as shown below.



array(['yiddish', 'gaelic', 'african', 'irish', 'hungarian', 'german',
       'swedish', 'japanese', 'italian', 'american', 'hawaiian', 'greek',
       'polynesian', 'scandinavian', 'spanish', 'celtic', 'old-english',
       'korean', 'sanskrit', 'african-american', 'hebrew', 'norse',
       'chinese', 'finnish', 'persian', 'scottish', 'slavic', 'english',
       'old-norse', 'dutch', 'armenian', 'welsh', 'polish', 'teutonic',
       'russian', 'egyptian', 'arabic', 'swahili', 'native-american',
       'old-french', 'french', 'middle-english', 'latin', 'vietnamese',
       'danish', 'hindi', 'old-german', 'turkish', 'indian',
       'czechoslovakian'], dtype=object)

Checking if the data is balanced

This shows the number of names available for each nationality. Ideally, the nationalities should have almost the same number of names so that the model is trained evenly across classes.
In this dataset, 20 of the 50 nationalities have 100 names each, while some, such as danish and yiddish, have as few as 11.



df.groupby('nationality')['names'].size()

The output lists the number of names for each nationality:



nationality
african             100
african-american    100
american            100
arabic              100
armenian             17
celtic               62
chinese             100
czechoslovakian      38
danish               11
dutch                24
egyptian             30
english             100
finnish              13
french              100
gaelic               87
german              100
greek               100
hawaiian            100
hebrew              100
hindi               100
hungarian            64
indian               25
irish               100
italian             100
japanese            100
korean               16
latin               100
middle-english       45
native-american     100
norse                40
old-english         100
old-french           46
old-german           40
old-norse            28
persian              55
polish               48
polynesian           15
russian              85
sanskrit             28
scandinavian        100
scottish             74
slavic               79
spanish             100
swahili              16
swedish              14
teutonic             32
turkish              52
vietnamese           52
welsh                91
yiddish              11
Name: names, dtype: int64
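As a quick sanity check on the balance, the counts from the output above can be summarized in plain Python. This is only a sketch: the dictionary is copied by hand from the output, and the variable names are assumptions.

```python
# Counts copied from the groupby output above
name_counts = {
    "african": 100, "african-american": 100, "american": 100, "arabic": 100,
    "armenian": 17, "celtic": 62, "chinese": 100, "czechoslovakian": 38,
    "danish": 11, "dutch": 24, "egyptian": 30, "english": 100, "finnish": 13,
    "french": 100, "gaelic": 87, "german": 100, "greek": 100, "hawaiian": 100,
    "hebrew": 100, "hindi": 100, "hungarian": 64, "indian": 25, "irish": 100,
    "italian": 100, "japanese": 100, "korean": 16, "latin": 100,
    "middle-english": 45, "native-american": 100, "norse": 40,
    "old-english": 100, "old-french": 46, "old-german": 40, "old-norse": 28,
    "persian": 55, "polish": 48, "polynesian": 15, "russian": 85,
    "sanskrit": 28, "scandinavian": 100, "scottish": 74, "slavic": 79,
    "spanish": 100, "swahili": 16, "swedish": 14, "teutonic": 32,
    "turkish": 52, "vietnamese": 52, "welsh": 91, "yiddish": 11,
}

total_names = sum(name_counts.values())
full_classes = [name for name, count in name_counts.items() if count == 100]
smallest = min(name_counts.values())
largest = max(name_counts.values())

print(f"{len(name_counts)} nationalities, {total_names} names in total")
print(f"{len(full_classes)} nationalities have 100 names")
print(f"largest class is {largest / smallest:.1f}x the smallest")
```

The total matches the 3238 rows reported by df.shape, and the gap between the largest and smallest classes shows that the dataset is only partly balanced.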

Visualizing the data using the Matplotlib library

Matplotlib is a Python plotting library that makes it easy to visualize data as graphs.

In this tutorial, you use Google Colab. Run the code snippet below on Google Colab so that you can import Matplotlib.



import matplotlib.pyplot as plt
%matplotlib inline



df.groupby('nationality')['names'].size().plot(kind='bar',figsize=(20,15))

Our bar graph is as shown:

Checking the features

Xfeatures holds the independent variables (the names) that act as inputs to the model. The model uses these features to make predictions.
ylabels holds the target values (the nationalities) that the model learns to predict.



Xfeatures = df['names']
ylabels= df['nationality']

Vectorizing the features

You will use the CountVectorizer() method to transform the dataset into readable inputs to be used by the model. This method also extracts features from the dataset.



vec = CountVectorizer()
X = vec.fit_transform(Xfeatures)

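You can also call the get_feature_names_out() method to inspect the features (the vocabulary) that the vectorizer extracted. In scikit-learn versions before 1.0, this method is called get_feature_names().



vec.get_feature_names_out()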

Splitting the data

You need to split the dataset into a training set and a test set. You use 70% of the data to train the model and 30% for testing.



x_train,x_test,y_train,y_test = train_test_split(X,ylabels,test_size=0.30)
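Conceptually, the split shuffles the rows and holds out a fraction of them for testing. The sketch below shows that idea with the standard library only. It is not how train_test_split is implemented, and the function name, the seed, and the rounding rule are assumptions.

```python
import random


def simple_train_test_split(features, labels, test_size=0.30, seed=42):
    """Shuffle paired features and labels, then hold out a fraction for testing."""
    indices = list(range(len(features)))
    random.Random(seed).shuffle(indices)  # fixed seed makes the split repeatable
    n_test = round(len(indices) * test_size)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    x_train = [features[i] for i in train_idx]
    x_test = [features[i] for i in test_idx]
    y_train = [labels[i] for i in train_idx]
    y_test = [labels[i] for i in test_idx]
    return x_train, x_test, y_train, y_test


# Toy data: each label is simply twice its feature, so pairs are easy to verify
features = list(range(10))
labels = [value * 2 for value in features]
x_train, x_test, y_train, y_test = simple_train_test_split(features, labels)
print(len(x_train), len(x_test))
```

Each feature stays paired with its label after shuffling, which is the property that matters for training. With scikit-learn, passing random_state to train_test_split gives the same kind of repeatable split.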

Building the model

You fit the model to the dataset using the fit() method:



nb = MultinomialNB()
nb.fit(x_train,y_train)

Checking the accuracy of the model

You check the accuracy score of the model on the test set to see how well it generalizes. The higher the accuracy, the better the model performs on data it has not seen before.



nb.score(x_test,y_test)

Our accuracy score is:



0.85036482694119869

This is about 85.04% accuracy. Because train_test_split shuffles the data randomly, your score may differ slightly.
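The accuracy score is the number of correct predictions divided by the total number of predictions. A minimal sketch of that calculation follows. The function name and the toy labels are assumptions chosen for illustration.

```python
def accuracy(y_true, y_pred):
    """Return the fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy of an empty list")
    correct = sum(1 for truth, guess in zip(y_true, y_pred) if truth == guess)
    return correct / len(y_true)


# Toy labels, made up for illustration
y_true = ["french", "french", "japanese", "yiddish"]
y_pred = ["french", "german", "japanese", "yiddish"]
score = accuracy(y_true, y_pred)
print(f"{score:.2%}")
```

Keep in mind that on a dataset with uneven classes, accuracy can hide poor performance on the smaller classes.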

Making predictions

After training the model, you can now feed our model with new inputs to start making predictions. The model will make accurate predictions based on how well you trained it. Therefore, the higher the accuracy score, the better the model will be in making predictions.



sample1 = ["Yin","Bathsheba","Brittany","Vladmir"]
vector1 = vec.transform(sample1).toarray()
nb.predict(vector1)

Saving our model using joblib

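You will use joblib to save the trained model into a pickle file. Pickling the model makes it easier to use the model in the future without repeating the training process.
A pickle file is a byte stream of the model.

joblib used to be imported from sklearn.externals, but recent scikit-learn releases no longer bundle it, so you import the standalone joblib package directly (it is installed together with scikit-learn). The fitted vectorizer is saved alongside the classifier because new names must be transformed with the same vocabulary before prediction. Here is a detailed article that helps a reader fully grasp the use and functionalities of joblib.



import joblib

# Save the fitted vectorizer and classifier together in one file
nationality_predictor = open("naive_bayes.pkl","wb")
joblib.dump((vec, nb), nationality_predictor)
nationality_predictor.close()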

You will name the pickle file naive_bayes.pkl.
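The round trip from object to byte stream and back can be illustrated with Python's built-in pickle module, which works in a similar way to joblib's dump and load. In this sketch a plain dictionary stands in for a trained model. The dictionary contents and variable names are assumptions.

```python
import io
import pickle

# A plain dictionary standing in for a trained model (hypothetical contents)
model_state = {
    "classes": ["french", "japanese"],
    "lookup": {"louane": "french", "yamazaki": "japanese"},
}

# Serialize the object into a byte stream, as you would when writing a .pkl file
buffer = io.BytesIO()
pickle.dump(model_state, buffer)
serialized = buffer.getvalue()

# Deserialize the byte stream back into an equivalent Python object
restored = pickle.load(io.BytesIO(serialized))

print(type(serialized).__name__, len(serialized) > 0)
print(restored == model_state)
```

Only unpickle files that you created yourself or that come from a source you trust, since loading a pickle can execute arbitrary code.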

Introduction to the FastAPI

FastAPI is a modern, fast web framework for building APIs with Python 3.6+, based on standard Python type hints. The key features of FastAPI are as follows:

Fast to code: Increases the speed of developing new features.
Fewer bugs: Reduces developer-induced errors.
Intuitive: Has great editor support, completion everywhere, and less time debugging.
Easy: Designed to be easy to use and learn.
Short: Minimizes code duplication, with multiple features from each parameter declaration.
Robust: Produces production-ready code with automatic interactive documentation.
Standards-based: Based on the open standards for APIs.

This makes FastAPI powerful since it combines features found in tools such as Flask and Swagger.
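To give a feel for what "based on standard Python type hints" means, the sketch below reads a function's type hints and uses them to convert raw string inputs. It only illustrates the general idea and is not how FastAPI is implemented. The helper convert_arguments and the example function are hypothetical.

```python
import inspect
from typing import get_type_hints


def convert_arguments(func, raw_values):
    """Convert raw string inputs to the types declared in func's type hints."""
    hints = get_type_hints(func)
    converted = {}
    for name in inspect.signature(func).parameters:
        target_type = hints.get(name, str)  # fall back to str when no hint is given
        converted[name] = target_type(raw_values[name])
    return converted


def repeat_name(name: str, times: int) -> str:
    """Example function whose parameters carry type hints."""
    return " ".join([name] * times)


# Raw inputs arrive as strings, much like values taken from a URL
arguments = convert_arguments(repeat_name, {"name": "Louane", "times": "3"})
result = repeat_name(**arguments)
print(arguments)
print(result)
```

Because the types are declared once in the function signature, a framework can reuse them for conversion, validation, and documentation.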

Installing FastAPI

Use the following command to install FastAPI on your machine.



pip install fastapi

Let's install the server.

uvicorn is the server used to run FastAPI. Here you install the standard version of uvicorn. A plain pip install uvicorn installs minimal, pure-Python dependencies.
The uvicorn[standard] variant adds Cython-based dependencies (where possible) and other optional extras for better performance.



pip install "uvicorn[standard]"

Creating the API

First, create a new Python file and name it app.py. Then, add the pickle file naive_bayes.pkl to a new folder named model.

The folder structure:



├── app.py
└── model
    └── naive_bayes.pkl

Importing our FastAPI packages



import uvicorn
from fastapi import FastAPI, Query

Loading ML packages

You will use joblib to unpickle the previously pickled file, which converts the serialized model back to its original form.



import joblib

Unpickling our Naive Bayes classifier file

To use the saved model, you need to convert it back to the original object. This allows you to use the model in the original form you had created.



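# The pickle file is assumed to hold the (vectorizer, classifier) pair saved after training
nationality_naive_bayes = open("model/naive_bayes.pkl","rb")
nationality_cv, nationality_clf = joblib.load(nationality_naive_bayes)
nationality_naive_bayes.close()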

Initializing our app

You initialize the app using the FastAPI() class:



app = FastAPI()

Creating the routes

You will create a simple route that will run on localhost port 8000. To create the route, you will use asynchronous programming.

Asynchronous programming allows a program to start an operation and move on to other work without waiting for that operation to complete.
This is an important concept because it lets multiple operations make progress concurrently without blocking each other. Asynchronous programming is an advanced concept that has become very important in the Python language. For detailed guidance on this concept, this article is very helpful.
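A small asyncio sketch shows the effect. Two simulated requests that each wait 0.2 seconds finish in roughly 0.2 seconds in total, not 0.4, because the waits overlap. The coroutine names and delays are assumptions made for illustration.

```python
import asyncio
import time


async def fake_request(name, delay):
    """Simulate an I/O-bound operation, such as waiting on a network call."""
    await asyncio.sleep(delay)  # yields control so other coroutines can run
    return f"{name} done"


async def main():
    start = time.perf_counter()
    # Run both coroutines concurrently and collect their results in order
    results = await asyncio.gather(
        fake_request("first", 0.2),
        fake_request("second", 0.2),
    )
    elapsed = time.perf_counter() - start
    return results, elapsed


results, elapsed = asyncio.run(main())
print(results)
print(f"elapsed: {elapsed:.2f}s")
```

Note that this is concurrency on a single thread. The coroutines take turns while they wait; they do not execute Python code in parallel.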

You will use async functions when creating our FastAPI routes. This enables FastAPI to handle multiple requests concurrently.

To make the first route, you use the async def index() function to make the index route, which will run on localhost port 8000.



@app.get('/')
async def index():
  return {"text":"Our First route"}

if __name__ == '__main__':
  uvicorn.run(app,host="127.0.0.1",port=8000)

The route above shows how to make a simple index route using FastAPI. Now you will add more routes for our machine learning model.

Adding route for our machine learning logic

You will add a GET route for making nationality predictions. First, you define a predict_nationality() helper function that makes predictions about someone's nationality. It transforms the input names with the vectorizer, converts the result into an array using toarray(), and returns the predicted nationalities.



def predict_nationality(data):
  # data is a list of names, for example ["Louane"]
  vect = nationality_cv.transform(data).toarray()
  result = nationality_clf.predict(vect)
  return result

Adding a route to make predictions

You will use this route to get the nationality of a person based on the name input by the user. You send a GET request to the predict route with the name as a query parameter, for example /predict/?name=Louane. The predict() function validates the name, runs it through the vectorizer and the classifier, and returns the prediction result:



@app.get('/predict/')
async def predict(name: str = Query(..., min_length=2, max_length=12)):
  # Transform the name with the fitted vectorizer, then classify it
  data = [name]
  vect = nationality_cv.transform(data).toarray()
  result = nationality_clf.predict(vect)
  # Convert the NumPy array to a list so it can be returned as JSON
  return {"orig_name": name, "prediction": result.tolist()}
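The min_length=2 and max_length=12 constraints in the route above mean that names outside that range are rejected before the model is called. The plain-Python sketch below mimics that check. It is not FastAPI's validation code, and the function name validate_name is hypothetical.

```python
def validate_name(name, min_length=2, max_length=12):
    """Return the name unchanged if its length is within bounds, else raise ValueError."""
    if not isinstance(name, str):
        raise TypeError("name must be a string")
    if not (min_length <= len(name) <= max_length):
        raise ValueError(
            f"name must be between {min_length} and {max_length} characters long"
        )
    return name


for candidate in ["Louane", "A", "Averyveryverylongname"]:
    try:
        print(validate_name(candidate), "is valid")
    except ValueError as error:
        print(f"{candidate!r} rejected: {error}")
```

In FastAPI itself, a request that fails this kind of validation receives a 422 response describing the problem, so the route function never runs with invalid input.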

Make sure to include this in your file to specify the port that will serve your app. For example, this will enable your route to run on localhost port 8000.



if __name__ == '__main__':
  uvicorn.run(app,host="127.0.0.1",port=8000)

Our output is as shown:

Interactive API docs: http://127.0.0.1:8000/docs

The route to be used to make a prediction:

You have finally served your machine learning model as an API using FastAPI.

Dockerizing the FastAPI application

Dockerizing involves creating a Docker container for the FastAPI application. A Docker container is a standard unit of software that packages up code and all its dependencies, so the application runs quickly and reliably from one computing environment to another.

A Docker container image is a lightweight, standalone, executable package of software that includes everything needed to run an application: code, runtime, system tools, system libraries, and settings.

Container images become containers at runtime, that is, when they run on Docker Engine.
To create a Docker container, follow the steps below.

Create a Dockerfile

In your working directory, create a file named Dockerfile.

Your working directory is as shown below:



├── app.py
├── Dockerfile
└── model
    └── naive_bayes.pkl

Creating Docker Layers

Docker layers make up the file system of both Docker images and Docker containers. Each layer corresponds to an instruction in your Dockerfile. The instructions are shown below, from defining the base image to creating an entry point that executes the image.

If you follow these steps, you will end up with a Docker image. The steps are as follows.
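To see how a Dockerfile breaks down into instructions, the sketch below parses a small example into (instruction, arguments) pairs. The example Dockerfile, the function name, and the rule of thumb that RUN, COPY, and ADD are the instructions that add file system layers are assumptions for illustration. The parser is deliberately simple and does not handle line continuations.

```python
def parse_dockerfile(text):
    """Split Dockerfile text into (instruction, arguments) pairs, skipping comments."""
    instructions = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        keyword, _, arguments = line.partition(" ")
        instructions.append((keyword.upper(), arguments.strip()))
    return instructions


# Hypothetical Dockerfile used only for this example
EXAMPLE_DOCKERFILE = """
# Example only
FROM python:3.11-slim
WORKDIR /app
COPY . /app
RUN pip install fastapi
EXPOSE 8000
"""

# Rule of thumb: these instructions change the file system and so add layers
LAYER_INSTRUCTIONS = {"RUN", "COPY", "ADD"}

parsed = parse_dockerfile(EXAMPLE_DOCKERFILE)
layer_steps = [keyword for keyword, _ in parsed if keyword in LAYER_INSTRUCTIONS]

for step, (keyword, arguments) in enumerate(parsed, start=1):
    print(f"Step {step}/{len(parsed)} : {keyword} {arguments}")
print("Instructions that add layers:", layer_steps)
```

The numbered steps printed here mirror the "Step n/m" lines that docker build reports, one per instruction.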

Define base image

A base image is the image that your own image is built on top of. Here you will use an image that already bundles Python 3.7, Uvicorn, Gunicorn, and FastAPI as the base image.



FROM tiangolo/uvicorn-gunicorn-fastapi:python3.7

Create a working directory



WORKDIR /app

Copy the app into the new directory created



COPY . /app

Install the dependencies in the new working directory

The base image already includes FastAPI and Uvicorn. The application also needs scikit-learn and joblib to load the pickled model.



RUN pip install fastapi uvicorn scikit-learn joblib

Expose the port to serve your application

The application inside the container will listen on port 8000.



EXPOSE 8000

Create an entry point to be used to execute your image

ENTRYPOINT sets the executable, and CMD supplies its default arguments. The host is set to 0.0.0.0 so that the server is reachable from outside the container.



ENTRYPOINT ["uvicorn"]
CMD ["app:app", "--host", "0.0.0.0", "--port", "8000"]

Create Docker image

A Docker image contains application code, libraries, tools, dependencies, and other files needed to make an application run.



docker build -t fastapi-test-app:new .

The output is as shown

This output shows the process used when creating a Docker image.



Sending build context to Docker daemon  34.90kB
Step 1/7 : FROM tiangolo/uvicorn-gunicorn-fastapi:python3.7
 ---> db183g656y4h
Step 2/7 : WORKDIR /app
 ---> Using cache
 ---> 5df25yffdpbc
Step 3/7 : COPY . /app
 ---> Using cache
 ---> 25dffbfdjdf5
Step 4/7 : RUN pip install fastapi uvicorn scikit-learn joblib
 ---> Using cache
 ---> edf81dffcdf5
Step 5/7 : EXPOSE 8000
 ---> Using cache
 ---> afd99eb62d2
Step 6/7 : ENTRYPOINT ["uvicorn"]
 ---> 07taebte2egd
Removing intermediate container 4edte5ta382
 ---> 2de6fstf5uv09
Step 7/7 : CMD ["app:app", "--host", "0.0.0.0", "--port", "8000"]
Successfully built 2de6fstf5uv09
Successfully tagged fastapi-test-app:new

Listing all of our created images

To list all the Docker images you have created, use the following command.
Our latest image is fastapi-test-app. This is the image you have just created, with an ID of 2de6fstf5uv09.



docker image ls

Output:



REPOSITORY          TAG       IMAGE ID        CREATED          SIZE
fastapi-test-app    new       2de6fstf5uv09   3 minutes ago    1.34GB
testing             latest    d661f1t3e0b     2 weeks ago      994MB

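Creating a Docker container

Docker containers are the live, running instances of Docker images. Users can interact with them, and administrators can adjust their settings and conditions using Docker commands. The -d flag starts the container in the background and prints its ID, and -p 8000:8000 maps port 8000 of the container to port 8000 on your machine.



docker run -d -p 8000:8000 fastapi-test-app:new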

Result:



e0f1bd4gv1f7t3dti5e89fd1o29341a50ete9hgad8ed0ye0ff27dt81667fu16b

After Dockerizing our FastAPI application, we now need to deploy it to a Kubernetes cluster.

Deploying the FastAPI application to Kubernetes cluster

Kubernetes is a container orchestration system used to deploy containers such as those created with Docker. It is meant to efficiently manage and coordinate clusters and workloads at scale in a production environment.
It helps manage containerized services by automating their deployment.

You will create a new file called deployment.yaml in your working directory.
Your folder structure is as shown:



├── app.py
├── Dockerfile
├── deployment.yaml
└── model
    └── naive_bayes.pkl

The code snippet for the deployment.yaml file is as shown:



apiVersion: v1
kind: Service
metadata:
  name: fastapi-test-service
spec:
  selector:
    app: fastapi-test-app
  ports:
    - protocol: "TCP"
      port: 3000
      targetPort: 8000
  type: LoadBalancer
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: fastapi-test-app
spec:
  selector:
    matchLabels:
      app: fastapi-test-app
  replicas: 5
  template:
    metadata:
      labels:
        app: fastapi-test-app
    spec:
      containers:
        - name: fastapi-test-app
          image: fastapi-test-app:new
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 8000

The file has two sections:

Service: Acts as the load balancer. A load balancer distributes incoming requests across the available instances of the application to make the best use of the available resources. Here the Service listens on port 3000 and forwards traffic to port 8000 on the containers.
Deployment: This is the application that you want to deploy to Kubernetes. The user sends a request to the load balancer in the Service, and the load balancer forwards the request to one of the replicas defined in the deployment.yaml file. Here, you are using five replicas for scalability, so there will be five instances of the application running at a time.

When you have several replicas, you gain redundancy: if one instance fails, the others will continue running.
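The effect of spreading requests over replicas, and of losing one replica, can be simulated in a few lines. This is a simplified round-robin sketch. The way a Kubernetes Service actually picks a pod depends on the cluster's configuration, and the pod names and function name here are hypothetical.

```python
from itertools import cycle


def distribute(requests, replicas):
    """Assign each request to a replica in round-robin order."""
    assignment = {replica: [] for replica in replicas}
    for request, replica in zip(requests, cycle(replicas)):
        assignment[replica].append(request)
    return assignment


# Five replicas, matching the replicas value in deployment.yaml
replicas = [f"pod-{number}" for number in range(1, 6)]
requests = list(range(10))

assignment = distribute(requests, replicas)
print({replica: len(handled) for replica, handled in assignment.items()})

# Simulate one replica failing: the remaining four still serve every request
healthy = [replica for replica in replicas if replica != "pod-3"]
after_failure = distribute(requests, healthy)
print({replica: len(handled) for replica, handled in after_failure.items()})
```

With five replicas each one handles two of the ten requests. After one failure, the other four absorb the load and no request is dropped.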

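The deployment.yaml file is connected to the Docker image created earlier: the image field of the Deployment specifies the name of that image. If you are using Minikube, the image must also be available inside the Minikube cluster, for example by loading it with minikube image load.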
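The Service and the Deployment only work together if their labels and ports line up. The sketch below checks this with plain dictionaries that mirror the relevant values from deployment.yaml. The dictionary layout and the function check_manifest are assumptions for illustration, not part of any Kubernetes tooling.

```python
# Values copied from the Service and Deployment sections of deployment.yaml
service = {
    "selector": {"app": "fastapi-test-app"},
    "port": 3000,
    "target_port": 8000,
}
deployment = {
    "match_labels": {"app": "fastapi-test-app"},
    "template_labels": {"app": "fastapi-test-app"},
    "container_port": 8000,
    "replicas": 5,
}


def check_manifest(service, deployment):
    """Return a list of problems; an empty list means the pieces line up."""
    problems = []
    pod_labels = deployment["template_labels"]
    # The Service finds pods through its selector
    for key, value in service["selector"].items():
        if pod_labels.get(key) != value:
            problems.append(f"Service selector {key}={value} does not match the pod labels")
    # The Deployment manages pods through matchLabels
    for key, value in deployment["match_labels"].items():
        if pod_labels.get(key) != value:
            problems.append(f"matchLabels {key}={value} does not match the pod labels")
    # Traffic is forwarded to targetPort, which the container must listen on
    if service["target_port"] != deployment["container_port"]:
        problems.append("Service targetPort does not match containerPort")
    return problems


problems = check_manifest(service, deployment)
print(problems if problems else "Service and Deployment are consistent")
```

A mismatch in any of these values is a common reason for a Service that deploys without errors but never reaches the application.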

Deployment of our application to Kubernetes cluster

You have dockerized our FastAPI application. You will now deploy it to a Kubernetes engine.

Run the following command:



kubectl apply -f deployment.yaml

This command will deploy your service and application instances created above to the Kubernetes engine. After running this command, the fastapi-test-service and the fastapi-test-app are created.

Deployment dashboard

Minikube and Kubernetes provide a dashboard that is used to visualize the deployment. To see the deployed container in the dashboard, you use the following command:



minikube dashboard

Your dashboard will be as shown:

Accessing your application

You access your application using the following command:



minikube service fastapi-test-service

You have now deployed your containerised FastAPI application to the Kubernetes cluster.

Conclusion

In this tutorial, you have learned how to create a machine learning model. You have followed all the steps from data pre-processing to training and building your model. You have also learned about FastAPI, which is an efficient framework for building web APIs. FastAPI has helped you serve our machine learning model as an API.

You then containerized our FastAPI application using Docker. Finally, you deployed the application to the Kubernetes cluster. Using these steps, you should be able to comfortably build a FastAPI application and deploy it to a Kubernetes cluster.

If you like this tutorial, let's connect on Twitter and LinkedIn. Thanks for Reading and Happy Learning!
