Jan Schulte for Outshift By Cisco

Install and Maintain Apache Kafka® with a GitOps Approach

Previously, we showed you how you can install and manage Apache Kafka® on Kubernetes with Koperator. While a one-time installation is great, you can unlock more value by automating this process to replicate Kafka deployments as needed for different environments. In this blog post, I will walk you through automating Apache Kafka deployments with ArgoCD.


Before you get started, you'll need a few tools installed; we'll cover each one (eksctl, kubectl, and the ArgoCD CLI) as we go.

Provision a new cluster

Next, we’ll use eksctl to provision new clusters. If you don't have eksctl on your machine yet, install it by following these instructions.

eksctl accepts either command-line arguments or a configuration file. Today, we'll use the following parameters in this YAML configuration to set up our cluster:

apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: koperator-demo
  region: us-east-2
  version: "1.24"

nodeGroups:
  - name: ng-1
    instanceType: m5.xlarge
    desiredCapacity: 2
    volumeSize: 80
    privateNetworking: true

With the configuration in place, let's create a new cluster:

eksctl create cluster -f workstation.yaml

This step might take a while since it needs to provision a lot of resources.

Once the cluster is ready, you can continue installing ArgoCD. Argo CD is an open-source GitOps continuous delivery tool. It monitors your cluster and your declaratively defined infrastructure stored in a Git repository and resolves differences between the two.

Install ArgoCD

Run the following commands:

kubectl create namespace argocd
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml

This will install ArgoCD on your new cluster. Note that ArgoCD can deploy applications to the same cluster it runs on or to remote clusters. To keep the scope narrow, we will deploy within the same cluster.

Next you’ll install the ArgoCD CLI:

brew install argocd 

(If you're using a different operating system, please visit this link for install instructions.)

By default, the Argo CD API server is not exposed with an external IP. One way to expose it is to patch the service to type LoadBalancer:

kubectl patch svc argocd-server -n argocd -p '{"spec": {"type": "LoadBalancer"}}'

Next, retrieve the initial admin password:

argocd admin initial-password -n argocd

We won't expose ArgoCD directly, but rather leverage port forwarding to access the API server and the web interface:

kubectl port-forward svc/argocd-server -n argocd 8080:443

Next, you'll log in with username admin and the password retrieved in the previous step:

argocd login localhost:8080

After you’re logged in, you’ll also need to configure a deployment target. In this case, we'll use the same cluster ArgoCD runs on:

argocd cluster add <your clustername>

Now you’re good to go!

Continuous deployment and GitOps

In my previous blog post, I showed you how to set up a new Kafka cluster on Kubernetes in minutes. While that process gets you up and running quickly, the real value comes with an automated infrastructure setup. For that, we want to leverage a GitOps-based approach to deploy Apache ZooKeeper™ and Kafka, using ArgoCD.

We will use that manual setup as a starting point and turn its instructions into Helm Charts, which we will deploy with ArgoCD.

Helm Charts

A Helm Chart allows you to make application installations on Kubernetes reproducible. Usually, to install an application on Kubernetes, you'd run a series of kubectl apply commands to create various resources, such as deployments, pods, and services. A Helm Chart bundles these resources and allows us to parameterize them. A Helm Chart can be installed several times within a Kubernetes cluster under different names, as so-called releases. We'll use a single repository that holds two Helm Charts: one for ZooKeeper and one for Koperator/Kafka.
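Based on the chart paths used later in the argocd commands, the repository layout might look roughly like this (a sketch; the individual file names are assumptions):

```text
.
├── zookeeper/              # ZooKeeper chart
│   ├── Chart.yaml
│   ├── values.yaml
│   └── templates/
│       └── zookeeper-cluster.yml
└── koperator/              # Koperator/Kafka chart
    ├── Chart.yaml
    ├── values.yaml
    ├── crds/               # Koperator CRDs, installed by Helm automatically
    └── templates/
        ├── simplekafkacluster.yml
        └── kafka-topics.yml
```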


To install and run ZooKeeper, we'll use the zookeeper-operator Helm Chart. This operator does the heavy lifting.

The ZooKeeper Helm Chart lists it as a dependency in Chart.yaml:

apiVersion: v2
name: zookeeper-setup
description: Installs the ZooKeeper application
version: 0.1.0
appVersion: "1.16.0"
dependencies:
  - name: zookeeper-operator
    repository: <chart-repository-url>
    version: "0.2.15"

Next, let's take a look at templates/zookeeper-cluster.yml:

apiVersion: zookeeper.pravega.io/v1beta1
kind: ZookeeperCluster
metadata:
  name: {{ .Values.zookeeper.name }}
  namespace: {{ .Values.zookeeper.namespace }}
spec:
  replicas: {{ .Values.zookeeper.replicas }}
  persistence:
    reclaimPolicy: {{ .Values.zookeeper.reclaimPolicy }}

When applied, this YAML creates a new ZooKeeper cluster. The template file does not contain specific values, only template variables. Since Helm allows us to install a chart more than once, this YAML is parameterized so the user can override values such as the cluster name or namespace.
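For illustration, with the chart's default values (name zookeeper, namespace zookeeper, one replica), Helm would render the template into a concrete manifest roughly like this — a sketch, assuming the ZookeeperCluster API group from the Pravega ZooKeeper operator:

```yaml
apiVersion: zookeeper.pravega.io/v1beta1
kind: ZookeeperCluster
metadata:
  name: zookeeper          # from .Values.zookeeper.name
  namespace: zookeeper     # from .Values.zookeeper.namespace
spec:
  replicas: 1              # from .Values.zookeeper.replicas
  persistence:
    reclaimPolicy: Delete  # from .Values.zookeeper.reclaimPolicy
```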

Helm allows us to provide values in different ways. For now, we stick with a few defaults defined in values.yaml:

# Default values for koperator-setup.
# This is a YAML-formatted file.
# Declare variables to be passed into your templates.

zookeeper:
  replicas: 1
  name: zookeeper
  namespace: zookeeper
  reclaimPolicy: Delete

As you can see here, all variables have sensible default values, allowing you to get up and running without edits. It's possible to override certain values on the command line when executing Helm.
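For example, to run three ZooKeeper nodes you could merge an override file over the chart's defaults, or override a single value inline — a sketch, where the chart directory and file name are assumptions:

```yaml
# my-values.yaml — overrides merged over the chart's defaults
zookeeper:
  replicas: 3

# Install with:
#   helm install zookeeper ./zookeeper -f my-values.yaml
# or override a single value inline:
#   helm install zookeeper ./zookeeper --set zookeeper.replicas=3
```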

With this Helm chart in place, let's move on to configuring ArgoCD. ArgoCD can deploy from various sources, such as Helm repositories or Git repositories. In this case, we're going with the latter. While we're creating everything on the command line for conciseness, you can also perform the same steps in the web interface.

Let's apply what we have:

argocd app create zookeeper --repo <your-git-repo-url> --path zookeeper --dest-namespace zookeeper --dest-server https://kubernetes.default.svc --sync-option ServerSideApply=true --sync-option CreateNamespace=true
argocd app sync zookeeper

First, we create a new application and provide some configuration options, such as the git repository, namespace and server.

Next, we sync the application so it starts a deployment run. While the first step completes quickly, the second might take a few minutes to complete successfully. If you want to confirm the successful rollout of all resources, visit the dashboard.

Now, whenever we make any changes to our ZooKeeper Helm chart, we sync the app and ArgoCD takes care of rolling out the changes automatically.
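Since everything else is declarative, the ArgoCD application itself can also be described as a manifest instead of created via the CLI — a sketch, with the repository URL left as a placeholder:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: zookeeper
  namespace: argocd
spec:
  project: default
  source:
    repoURL: <your-git-repo-url>   # placeholder: your chart repository
    path: zookeeper
    targetRevision: HEAD
  destination:
    server: https://kubernetes.default.svc
    namespace: zookeeper
  syncPolicy:
    automated: {}                  # sync automatically on every Git change
    syncOptions:
      - ServerSideApply=true
      - CreateNamespace=true
```

With spec.syncPolicy.automated set, ArgoCD syncs on every Git change without a manual argocd app sync.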

Koperator and Kafka

Next on our list is Koperator and Kafka. We'll use Koperator to spin up and manage Kafka instances.

This Helm Chart, similar to ZooKeeper, makes use of several Custom Resource Definitions (CRDs). These CRDs will allow us later to:

  • Define and configure new Kafka clusters
  • Define Kafka topics

For Koperator itself, the project's README suggests installing the CRDs in a separate step. Therefore, we add all CRDs from the Koperator repository to this chart, in a separate crds directory, and note them in Chart.yaml:

apiVersion: v2
name: koperator
description: A Helm chart to install Koperator
type: application
version: 0.1.0
appVersion: "1.16.0"
dependencies:
  - name: kafka-operator
    repository: <chart-repository-url>
    version: "0.24.1"
# CRDs are defined here, in the chart's crds directory:
#   - kafkatopics.crds.yml
#   - cruisecontrol.crds.yml
#   - kafkauser.crds.yml
#   - kafkacluster.crds.yml

We also declare a dependency on the Koperator Helm Chart.

When the Helm chart gets installed, Helm will automatically install these CRDs for us.

To get Kafka up and running, the central piece is the simplekafkacluster.yml file in the templates directory. This file also comes from the Koperator README; the only change is that some of the values are replaced with template variables so we can parameterize it.
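An abbreviated sketch of what that parameterization might look like — field names follow the KafkaCluster custom resource from Koperator, and the ZooKeeper address assumes the cluster created earlier in this post:

```yaml
apiVersion: kafka.banzaicloud.io/v1beta1
kind: KafkaCluster
metadata:
  name: {{ .Values.kafka.name }}
spec:
  # Point Kafka at the ZooKeeper cluster deployed earlier
  zkAddresses:
    - "zookeeper-client.zookeeper:2181"
  readOnlyConfig: |
    auto.create.topics.enable={{ .Values.kafka.autoCreateTopics }}
```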

Additionally, this chart also contains a template to create one or more Kafka topics:

{{- range .Values.topics }}
apiVersion: kafka.banzaicloud.io/v1alpha1
kind: KafkaTopic
metadata:
  name: {{ .name }}
spec:
  clusterRef:
    name: {{ $.Values.kafka.name }}
  name: {{ .name }}
  partitions: {{ .partitions }}
  replicationFactor: {{ .replicationFactor }}
---
{{- end }}

values.yml has everything we need to start creating Kafka clusters:

# Install the Kafka cluster
kafka:
  name: kafka
  autoCreateTopics: false

# Set up one or more topics
topics:
  - name: my-topic
    partitions: 1
    replicationFactor: 1

With this in place, we are all set to create a new Kafka cluster and the first topic:

argocd app create koperator --repo <your-git-repo-url> --path koperator --dest-namespace kafka --dest-server https://kubernetes.default.svc --sync-option ServerSideApply=true --sync-option Validate=false --sync-option CreateNamespace=true
argocd app sync koperator

What's next?

With this workflow in place, you can declaratively deploy and manage your Kafka cluster from now on. With a few modifications to the Helm charts, you can create as many Kafka clusters as you need.
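For instance, a second environment could reuse the same chart with different values — a sketch using ArgoCD's Helm parameter overrides, where the names and repository URL are placeholders:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: koperator-staging
  namespace: argocd
spec:
  project: default
  source:
    repoURL: <your-git-repo-url>
    path: koperator
    helm:
      parameters:              # override chart values per environment
        - name: kafka.name
          value: kafka-staging
  destination:
    server: https://kubernetes.default.svc
    namespace: kafka-staging
```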

Check out the Helm charts used for these examples here and make sure to also star the koperator repository!
