Maxime Guilbert

How to create a Kubernetes Operator?

In the first part of this series about the operator pattern, we saw what it is and in which cases it can be highly helpful, especially for automation.

So today we will see how to create a Kubernetes Operator!


As with every open-source solution, a bunch of tools exist for the job, each with its own specifics. If you want to see the list, check the Kubernetes documentation.

In this series, we will use Operator Framework and KubeBuilder.

Operator Framework

A few words about Operator Framework: we will use the Go SDK, but note that you can also use it with Ansible and Helm.



If you are using Homebrew, you can install the Operator Framework SDK with the following command:

brew install operator-sdk

From GitHub Release

# Define information about your platform
export ARCH=$(case $(uname -m) in x86_64) echo -n amd64 ;; aarch64) echo -n arm64 ;; *) echo -n $(uname -m) ;; esac)
export OS=$(uname | awk '{print tolower($0)}')

# Release to download from: replace [VERSION] with the release tag you want to install
export OPERATOR_SDK_DL_URL=https://github.com/operator-framework/operator-sdk/releases/download/[VERSION]

# Download the binary for your platform
curl -LO ${OPERATOR_SDK_DL_URL}/operator-sdk_${OS}_${ARCH}

# Install the binary
chmod +x operator-sdk_${OS}_${ARCH} && sudo mv operator-sdk_${OS}_${ARCH} /usr/local/bin/operator-sdk

Create our first operator

Initialize the project

The first thing to do is to initialize the project with the following command

operator-sdk init --domain [YOUR DOMAIN] --repo [YOUR CODE REPOSITORY]



It will generate the folder structure of your project.

In it you will find some generic files, a lot of common files (like the Makefile or the Dockerfile) and the beginning of your Golang project with main.go.

Note: By default, your operator is able to watch resources everywhere in the cluster.
So if you want to limit its scope, you can update the definition of the manager to add the Namespace option.
mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{Namespace: "dummy_ns"})

For more information about the scope of an operator, please check the SDK documentation.

Create an API, a Controller and a CRD

In a lot of cases, when we use an operator, we want to create a Custom Resource Definition which will be used as the reference for our tasks.

In this tutorial, we will create a custom resource MyProxy in the gateway group which, for each instance, will deploy an Nginx Deployment.

Command to generate the code

operator-sdk create api --group gateway --version v1alpha1 --kind MyProxy --resource --controller

Once executed, you can see two new folders: api and controllers.


In the api folder, the only file that will interest us is myproxy_types.go. This is the file where we define all the fields that we need in our Spec, and it is also where we define the Status structure!

For our example, we will just define a Name field in the MyProxy Spec.

type MyProxySpec struct {
   // Name is used as the name of the Deployment created for this MyProxy
   Name string `json:"name,omitempty"`
}
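
To make the role of the json tag more concrete, here is a minimal sketch using only the Go standard library. It is not part of the generated project: the encodeSpec helper is a hypothetical name introduced for the illustration. It shows that the Name field is exposed as name in the serialized resource, and that omitempty drops the field when it is empty.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// MyProxySpec mirrors the Spec defined in myproxy_types.go.
type MyProxySpec struct {
	Name string `json:"name,omitempty"`
}

// encodeSpec is a hypothetical helper that returns the JSON form of a spec.
func encodeSpec(spec MyProxySpec) string {
	out, err := json.Marshal(spec)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(out)
}

func main() {
	fmt.Println(encodeSpec(MyProxySpec{Name: "toto"})) // prints {"name":"toto"}
	fmt.Println(encodeSpec(MyProxySpec{}))             // prints {}
}
```

This is why the field is written name (lowercase) in the YAML of a MyProxy instance, even though the Go field is Name.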

Important!! This file is used as the base to build numerous YAML files for your operator. So, after every modification of this file, execute both commands:
make manifests && make generate


In the controllers folder, you will find every controller related to your Custom Resources, like myproxy_controller.go that we generated earlier. This folder is the central place for the operations that your operator can do.

In every controller file, you will find two methods that we must update: Reconcile and SetupWithManager.

// SetupWithManager sets up the controller with the Manager.
func (r *MyProxyReconciler) SetupWithManager(mgr ctrl.Manager) error {
   return ctrl.NewControllerManagedBy(mgr).
      For(&gatewayv1alpha1.MyProxy{}).
      Owns(&appsv1.Deployment{}).
      Complete(r)
}

In this example (which is also our implementation), we can see:

ctrl.NewControllerManagedBy(mgr), which creates a new controller with default options. (This builder is also where you can customize your controller options, like the maximum number of reconciliations you want in parallel.)

For(&gatewayv1alpha1.MyProxy{}) declares that we want a reconciliation to be triggered when an add/update/delete event happens on a specific kind of resource (here MyProxy). The kind passed to For can be any resource you want to reconcile. (Useful if you want to dynamically expose all the Deployments through an Nginx, for example.)

Owns(&appsv1.Deployment{}) is quite similar to For: it also declares that we want a reconciliation to be triggered when an add/update/delete event happens. But it adds a filter, because the reconciliation will only be triggered if the resource concerned by the event is owned by a resource of our operator. (So if you update another Deployment, nothing will happen in your operator.)
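
The difference between For and Owns can be sketched with a simplified model. This is not controller-runtime code: the object type and the requestsFor function are hypothetical names, and the ownership check is reduced to two string fields, whereas a real cluster stores it in the owner references of the resource metadata. The sketch only shows which MyProxy would be reconciled for a given event.

```go
package main

import "fmt"

// object is a simplified, hypothetical stand-in for a Kubernetes object.
type object struct {
	Kind      string
	Name      string
	OwnerKind string // empty when the object has no owner
	OwnerName string
}

// requestsFor returns the names of the MyProxy instances to reconcile
// when an add/update/delete event happens on obj.
func requestsFor(obj object) []string {
	switch {
	case obj.Kind == "MyProxy":
		// For(...): an event on the primary resource reconciles that resource.
		return []string{obj.Name}
	case obj.Kind == "Deployment" && obj.OwnerKind == "MyProxy":
		// Owns(...): an event on an owned Deployment reconciles its owner.
		return []string{obj.OwnerName}
	default:
		// Any other object, including an unowned Deployment, is ignored.
		return nil
	}
}

func main() {
	fmt.Println(requestsFor(object{Kind: "MyProxy", Name: "myproxy-sample"}))
	fmt.Println(requestsFor(object{Kind: "Deployment", Name: "toto", OwnerKind: "MyProxy", OwnerName: "myproxy-sample"}))
	fmt.Println(requestsFor(object{Kind: "Deployment", Name: "other"}))
}
```

In the last call the Deployment has no owner, so no reconciliation is requested, which matches the filter described for Owns.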


The Reconcile method is the heart of your operator: it is executed every time a reconciliation is triggered.

But before diving into the function, there is an important thing to look at. Above the method, you can see some comments starting with //+kubebuilder. These comments define the rights of your operator!

So it's really important to define them correctly! In our case, we need to add some rights to our operator so that it can read, create and update Deployments.

Each comment is defined as follows:

// +kubebuilder:rbac:groups=[group of the resource],resources=[resource names],verbs=[verbs]

Our example uses a single group per comment; for resource names and verbs, you can define multiple values at once by joining them with ;.

// +kubebuilder:rbac:groups=apps,resources=deployments,verbs=get;list;watch;create;update;patch;delete  
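
As an illustration of this format, here is a small sketch that assembles such a comment from its parts. The rbacMarker function is a hypothetical helper written for this article, not something provided by KubeBuilder or the Operator SDK; it only shows how the values are joined with ; inside one marker.

```go
package main

import (
	"fmt"
	"strings"
)

// rbacMarker is a hypothetical helper that builds a kubebuilder RBAC comment
// for one group, joining resource names and verbs with ";".
func rbacMarker(group string, resources, verbs []string) string {
	return fmt.Sprintf("// +kubebuilder:rbac:groups=%s,resources=%s,verbs=%s",
		group, strings.Join(resources, ";"), strings.Join(verbs, ";"))
}

func main() {
	verbs := []string{"get", "list", "watch", "create", "update", "patch", "delete"}
	// Prints the same marker as the one used for Deployments in this article.
	fmt.Println(rbacMarker("apps", []string{"deployments"}, verbs))
}
```

In a real project you write these comments by hand above Reconcile; the generated RBAC files are produced from them by make manifests.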

Now we can dive into the function code. As said earlier, this function is called each time a reconciliation is triggered. As a result, we must be careful about what we are doing here!

For example, if we want to create resources, we must be sure that they do not already exist on the cluster! And if a resource already exists, we must check it and update it if required!

1. Retrieve our custom resource

So the first step is to try to retrieve an instance of our custom resource (here an instance of MyProxy). We need it to read its spec and to be able to update its status.

    // Retrieve the resource
    // (log is a logger obtained from the context; errors is k8s.io/apimachinery/pkg/api/errors)
    myProxy := &gatewayv1alpha1.MyProxy{}
    err := r.Get(ctx, req.NamespacedName, myProxy)

    if err != nil {
        // If we have an error and this error says "not found", we ignore the error
        if errors.IsNotFound(err) {
            log.Info("Resource not found. Error ignored as the resource must have been deleted.")
            return ctrl.Result{}, nil
        }

        // If it's another error, we return it
        log.Error(err, "Error while retrieving MyProxy instance")
        return ctrl.Result{}, err
    }
2. Retrieve resources managed by the operator

Now that we have our "parent" resource, we are going to retrieve our "child" resources. In our case it's a Deployment: its name is defined by the Name field of MyProxy and it must be in the namespace test_ns.

found := &appsv1.Deployment{}  
err = r.Get(ctx, types.NamespacedName{Name: myProxy.Spec.Name, Namespace: "test_ns"}, found)
3. Check if the resource exists

The next step consists of checking what we got at the previous step. If the variable err is a "not found" error, we know that the resource doesn't exist, so we can create it! If it contains another error, we return it.

Our implementation will look like this:

if err != nil && errors.IsNotFound(err) {
   // Define a new deployment
   dep := r.deploymentForExample(myProxy)
   log.Info("Creating a new Deployment", "Deployment.Namespace", dep.Namespace, "Deployment.Name", dep.Name)
   err = r.Create(ctx, dep)
   if err != nil {
      log.Error(err, "Failed to create new Deployment", "Deployment.Namespace", dep.Namespace, "Deployment.Name", dep.Name)
      return ctrl.Result{}, err
   }
   // Deployment created successfully - return and requeue
   return ctrl.Result{Requeue: true}, nil
} else if err != nil {
   log.Error(err, "Failed to get Deployment")
   return ctrl.Result{}, err
}

Here is a really simple example of deploymentForExample

func (r *MyProxyReconciler) deploymentForExample(myproxy *gatewayv1alpha1.MyProxy) *appsv1.Deployment {
   dep := &appsv1.Deployment{}

   dep.Namespace = "test_ns"
   dep.Name = myproxy.Spec.Name

   var replicas int32 = 2

   labels := map[string]string{
      "test_label": myproxy.Spec.Name,
   }

   dep.Spec = appsv1.DeploymentSpec{
      Replicas: &replicas,
      // A Deployment requires a selector that matches the pod template labels
      // (metav1 is k8s.io/apimachinery/pkg/apis/meta/v1)
      Selector: &metav1.LabelSelector{MatchLabels: labels},
      Template: corev1.PodTemplateSpec{
         Spec: corev1.PodSpec{
            Containers: []corev1.Container{
               {
                  Name:  "nginx",
                  Image: "nginx",
               },
            },
         },
      },
   }
   dep.Labels = labels
   dep.Spec.Template.Labels = labels

   return dep
}
4. Update the resource

If we don't get an error while trying to retrieve the resource, it means that the resource exists and that we got it correctly. So we can check its parameters and update it if some values have changed.

var size int32 = 2
if *found.Spec.Replicas != size {
   found.Spec.Replicas = &size
   err = r.Update(ctx, found)
   if err != nil {
      log.Error(err, "Failed to update Deployment", "Deployment.Namespace", found.Namespace, "Deployment.Name", found.Name)
      return ctrl.Result{}, err
   }
   // Spec updated - return and requeue
   return ctrl.Result{Requeue: true}, nil
}

In our example, we check that the number of replicas is still equal to 2. If it's not the case, we try to update the resource, and handle the error if we get one.
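
The four steps above boil down to one rule: a reconciliation must be safe to run again and again. Here is a minimal sketch of that rule with an in-memory map playing the role of the cluster. The deployment and cluster types and the reconcile function are hypothetical stand-ins, not the controller-runtime API; only the create-if-missing and update-if-changed logic is the same as in our controller.

```go
package main

import "fmt"

// deployment is a simplified, hypothetical stand-in for appsv1.Deployment.
type deployment struct {
	Name     string
	Replicas int32
}

// cluster is an in-memory stand-in for the API server, keyed by name.
type cluster map[string]*deployment

// desiredReplicas is the number of replicas we always want, as in the article.
const desiredReplicas int32 = 2

// reconcile brings the deployment called name to the desired state and
// reports what it did: "created", "updated" or "unchanged".
func reconcile(c cluster, name string) string {
	found, ok := c[name]
	if !ok {
		// Steps 2 and 3: the child resource does not exist, so we create it.
		c[name] = &deployment{Name: name, Replicas: desiredReplicas}
		return "created"
	}
	if found.Replicas != desiredReplicas {
		// Step 4: the resource exists but has drifted, so we update it.
		found.Replicas = desiredReplicas
		return "updated"
	}
	return "unchanged"
}

func main() {
	c := cluster{}
	fmt.Println(reconcile(c, "toto")) // prints created
	fmt.Println(reconcile(c, "toto")) // prints unchanged
	c["toto"].Replicas = 5            // simulate a manual change
	fmt.Println(reconcile(c, "toto")) // prints updated
	fmt.Println(reconcile(c, "toto")) // prints unchanged
}
```

Whatever the starting state is, calling reconcile twice in a row always ends with "unchanged", which is the property we want from the real Reconcile method.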

Update generated files

Once we have finished updating our controller, it's really important to execute the following command:

make manifests

We saw earlier that some comments define the RBAC rights of our controller. So we need to execute this command to (at least) generate the RBAC definition files.

Operator build

Now that our operator is ready to be used, we can build it before deploying it.

Before the build

By default, the built image is named controller:latest and is pushed under the default image tag base. As you can imagine, this can cause some issues.

So, if you want to update this information, you must:

  • update the variables IMG and IMAGE_TAG_BASE in the Makefile
  • update the image name in config/manager/manager.yaml


To execute the build, use this command:

make docker-build

And this one if you want to push the image to a remote Docker registry:

make docker-push


To deploy your operator, you must execute two commands:

First, deploy all your Custom Resource Definitions on your cluster:

make install

Then deploy your operator:

make deploy


When everything above is done, you can try to deploy an instance of MyProxy, and you should see an Nginx Deployment appear!

Example of a MyProxy instance definition

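# Replace [YOUR DOMAIN] with the domain given to operator-sdk init
apiVersion: gateway.[YOUR DOMAIN]/v1alpha1
kind: MyProxy
metadata:
  labels:
    app.kubernetes.io/name: myproxy
    app.kubernetes.io/instance: myproxy-sample
    app.kubernetes.io/part-of: tmpoperator
    app.kubernetes.io/managed-by: kustomize
    app.kubernetes.io/created-by: tmpoperator
  name: myproxy-sample
spec:
  name: toto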

This part was quite long, but it was necessary to show you how to create an operator and what we can do with it.

In the next part of this series, we will see advanced configurations and features to help your operator be more efficient!

I hope it will help you. If you have any questions (there are no dumb questions) or if some points are not clear to you, don't hesitate to ask in the comments or to contact me directly on LinkedIn.
