DEV Community

martinbald81

ML Platform Deployment with Helm Charts

It is without a doubt an exciting time to be in the AI and ML space, given the developments in Generative AI. We have also seen great strides in object detection for computer vision from the team at Deci with the release of YOLO-NAS, which drives better accuracy and latency performance.

Excitement around LLMs has sparked conversations about deploying them to production, with a new term, LLMOps, branching out of the MLOps discipline. When deploying AI and ML models to production there are a number of things to consider, such as:

  • The operational infrastructure
  • Model serving
  • Monitoring
  • Validation
  • Observability

If we take operational infrastructure: will there be one model per cluster or multiple models per cluster? Will the inferencing run in streaming or batch mode?

Getting ML projects to production is one challenge. Keeping those deployments running with optimal efficiency and ROI for the business is an additional challenge and overhead for Ops teams. This is where Helm Charts can help you manage your Kubernetes clusters. Helm Charts help you manage and deploy applications on Kubernetes platforms, allowing you to define, install, and upgrade complex applications in a declarative way using templates, variables, and dependencies. Thousands of ready-made charts, with configurations for different deployment scenarios, are available to help with ML model deployment.
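To make the idea of a chart concrete, the sketch below creates a minimal, hypothetical chart skeleton by hand; the chart name, image, and template are purely illustrative and not part of any Wallaroo deployment.

```shell
# Minimal hypothetical chart skeleton (illustrative names only)
mkdir -p mychart/templates

# Chart metadata
cat > mychart/Chart.yaml <<'EOF'
apiVersion: v2
name: mychart
version: 0.1.0
EOF

# Default values that templates reference via .Values
cat > mychart/values.yaml <<'EOF'
replicaCount: 1
image:
  repository: nginx
  tag: "1.25"
EOF

# A template rendered with the release name and values at install time
cat > mychart/templates/deployment.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Release.Name }}-app
spec:
  replicas: {{ .Values.replicaCount }}
  selector:
    matchLabels:
      app: {{ .Release.Name }}
  template:
    metadata:
      labels:
        app: {{ .Release.Name }}
    spec:
      containers:
        - name: app
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
EOF
```

With Helm installed, `helm install demo ./mychart` would render the templates against the values and deploy the result, and `helm upgrade` would apply changes declaratively.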

How It Works

To configure any Kubernetes application, there must be a YAML file that sets everything up (the configuration file).

Helm Charts, Kubernetes YAML Configuration File

The following software must be installed on the system from which the Kubernetes environment is managed, that is, where kubectl will run:

  • kubectl: used for the various Kubernetes commands, such as creating a namespace or listing pods.
  • Helm: can be installed on Windows, macOS, and Linux; see the Helm installation instructions for each platform.
  • Krew: a plugin manager for kubectl, which for our purposes we will use to install the preflight and support-bundle plugins.
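For reference, the commands below sketch one way to install these tools. The Helm script URL is the project's official install script, and the plugin names match the Krew plugins used later; this assumes kubectl and Krew themselves are already set up.

```shell
# Install Helm using the official install script
curl -fsSL https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash

# Install the Krew plugins used for environment checks later in this walkthrough
kubectl krew install preflight
kubectl krew install support-bundle
```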

Helm reads many configuration values from YAML files. These cover everything the deployment needs, such as the image registry it pulls images from, CPU and memory limits, pipeline limits, and the lists of containers and their settings.

Once you have run the install commands above, the first step in the Wallaroo installation process via Helm is to connect to the Kubernetes environment that will host the Wallaroo Enterprise instance and log in to the Wallaroo container registry with the command provided by the Wallaroo support staff. The command takes the following format, replacing $YOURUSERNAME and $YOURPASSWORD with the username and password provided:

Code: helm registry login registry.replicated.com --username $YOURUSERNAME --password $YOURPASSWORD

The next step is preflight verification, which checks that our environment is valid for what we are trying to install: for example, whether there are enough CPUs and whether containerd is installed. The output will show that everything passed inspection, as seen in the screenshot below.

Code: kubectl preflight --interactive=false preflight.yaml

kubectl preflight results output showing pass for all variables
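The preflight.yaml passed to the command above is supplied by Wallaroo support. As a rough sketch of the file format (the troubleshoot.sh Preflight spec that the preflight plugin consumes), a hypothetical CPU check might look like this; the actual checks for a Wallaroo install will differ.

```shell
# Write a hypothetical preflight spec; the real checks come from Wallaroo support
cat > preflight.yaml <<'EOF'
apiVersion: troubleshoot.sh/v1beta2
kind: Preflight
metadata:
  name: example-preflight
spec:
  analyzers:
    - nodeResources:
        checkName: Total CPU capacity
        outcomes:
          - fail:
              when: "sum(cpuCapacity) < 4"
              message: The cluster needs at least 4 CPU cores.
          - pass:
              message: The cluster has enough CPU capacity.
EOF
```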

Now, in the case of Wallaroo we have a browser component exposed to the public, so we need certificates that let clients verify the server's identity and avoid man-in-the-middle attacks. To do this we will create a Kubernetes secret from the certificates and private keys we have already prepared. You can check our DNS documentation for those settings.

Create the Kubernetes secret from the certificates created in the previous step, replacing $TLSCONFIG with the name of the Kubernetes secret, $TLSCERT with the certificate file, and $TLSKEY with the private key file. Store the secret name for the step Configure local values file.

Code: kubectl create secret tls $TLSCONFIG --cert=$TLSCERT --key=$TLSKEY

For example, if $TLSCONFIG is my-tls-secrets, with certificate example.com.crt and key example.com.key, the command becomes:

Code: kubectl create secret tls my-tls-secrets --cert=example.com.crt --key=example.com.key
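If you need a certificate and key pair to experiment with, a self-signed pair for a hypothetical domain can be generated with openssl as below; this is for test environments only, and production certificates should come from your CA or DNS provider.

```shell
# Generate a self-signed certificate/key pair for example.com (testing only)
openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
  -subj "/CN=example.com" \
  -keyout example.com.key \
  -out example.com.crt

# With a cluster available, the secret would then be created as shown above:
# kubectl create secret tls my-tls-secrets --cert=example.com.crt --key=example.com.key
```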

Next we want to configure the local values file. The default Helm install of Wallaroo ships with various default settings; the local values file overrides them based on the organization's needs. The following represents the minimum mandatory values for a Wallaroo installation using certificates and the default LoadBalancer for a cloud Kubernetes cluster. In these examples the configuration is saved as local-values.yaml.

For information on taints and tolerations settings, see the Taints and Tolerations Guide.

Note the following required settings:

  • domainPrefix and domainSuffix: Used to set the DNS settings for the Wallaroo instance. For more information, see the Wallaroo DNS Integration Guide.
  • deploymentStage and custTlsSecretName: These are set for use with the Kubernetes secret created in the previous step. External connections through the Wallaroo SDK require valid certificates.
  • generate_secrets: Secrets for administrative and other users can be generated by the Helm install process, or set manually. When enabled, this setting generates randomized passwords during installation.
  • apilb: Sets the apilb service options including the following:
    • serviceType: LoadBalancer: Uses the default LoadBalancer setting for the Kubernetes cloud service the Wallaroo instance is installed into. Replace with the specific service connection settings as required.
    • external_inference_endpoints_enabled: true: This setting is required for performing external SDK inferences to a Wallaroo instance. For more information, see the Wallaroo Model Endpoints Guide.
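Putting the required settings above together, a minimal local-values.yaml might look like the sketch below. The key names follow the list above, but the exact nesting and the domain values are assumptions; confirm them against the Wallaroo Helm reference for your version.

```shell
# Write a sketch of the minimum local values overrides (values are examples)
cat > local-values.yaml <<'EOF'
domainPrefix: ""
domainSuffix: "wallaroo.example.com"

deploymentStage: prod
custTlsSecretName: my-tls-secrets

generate_secrets: true

apilb:
  serviceType: LoadBalancer
  external_inference_endpoints_enabled: true
EOF
```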

Taints and tolerations configuration settings

In the case of Wallaroo the resource used by the services can be modified. Wallaroo uses different nodes for various services, which can be assigned to a different node pool to contain resources separate from other nodes. The following nodes selectors can be configured:

  • ML Engine node selector
  • ML Engine Load Balance node selector
  • Database Node Selector
  • Grafana node selector
  • Prometheus node selector

For full details you can check out the Wallaroo Helm References Guides.
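As an illustration of how such overrides could be kept in their own file, the snippet below writes a hypothetical node selector values file. The key names (engine, enginelb, database, grafana, prometheus) and the pool labels are assumptions for illustration; check the Wallaroo Helm Reference Guides for the exact keys.

```shell
# Hypothetical node selector overrides in a separate values file
cat > nodeselector-values.yaml <<'EOF'
nodeSelector:
  engine:
    wallaroo.ai/pool: engine-pool
  enginelb:
    wallaroo.ai/pool: engine-pool
  database:
    wallaroo.ai/pool: data-pool
  grafana:
    wallaroo.ai/pool: monitoring-pool
  prometheus:
    wallaroo.ai/pool: monitoring-pool
EOF

# Helm accepts multiple --values flags; later files override earlier ones:
# helm install wallaroo $REGISTRYURL --values local-values.yaml --values nodeselector-values.yaml
```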

The final steps are to install Wallaroo and verify the installation. For the install, our team provides the Helm install command, which pulls from the container registry, assuming the preflight checks shown earlier have passed. The Helm install command is as follows:

Code: helm install $RELEASE $REGISTRYURL --version $VERSION --values $LOCALVALUES.yaml

Where:

  • $RELEASE: The name of the Helm release. By default, wallaroo.
  • $REGISTRYURL: The URL for the Wallaroo container registry service.
  • $VERSION: The version of Wallaroo to install. For this example, 2022.4.0-main-2297.
  • $LOCALVALUES: The name of the .yaml file containing the local values overrides, without the extension. For this example, local-values, giving the file local-values.yaml.

For example, for the release wallaroo the command would be:

Code: helm install wallaroo oci://registry.replicated.com/wallaroo/EE/wallaroo --version 2022.4.0-main-2297 --values local-values.yaml

Once the installation is complete, you can verify it with the helm test $RELEASE command. In our example that is:

Code: helm test wallaroo

A successful installation resembles the output below.

Successful output showing installation verification results

We have seen that Helm Charts are a great way to deploy applications on Kubernetes platforms in a consistent and reliable way. They enable you to automate and simplify complex deployments using templates, variables, dependencies, and hooks, and they allow you to share your charts with others and reuse existing ones from official or community repositories. In the AI and ML deployment space this helps teams not only deploy and manage machine learning clusters reliably and efficiently for production, but also set up pre-production environments that are consistent with production.

To learn more about deploying successful ML projects to production, check out our Free Community Edition and hands-on Tutorials.
