
Jan Schulte for Outshift By Cisco


How to Deploy Apache Kafka® on Kubernetes If You're in a Time Crunch

There are many reasons to run Apache Kafka on premises, within your own Kubernetes cluster, such as handling sensitive data.

However, regardless of how compelling the reason, there's often a limiting factor—time.

Have you ever thought about running Kafka on premises, within your own Kubernetes cluster, but you're in a time crunch?

In this article, we'll talk about installing and managing Kafka with Koperator—an open-source solution that enables you to set up an on-premises message broker in record time—even if you're not an expert.

What’s so hard about deploying Kafka on Kubernetes?

When you deploy a stateless application to Kubernetes, best practices specify that it:

  • Should not rely on hard-disk access
  • Works well behind a load balancer
  • Can scale up and down easily

The more you follow these guidelines, the easier it is to work with Kubernetes. Things start to change, however, once you need to deploy a stateful application or service (like a database or message broker). The latter types of applications have different requirements and often don't fit neatly within the constraints mentioned above. Kafka, for instance, relies heavily on disk access. Also, producers and consumers need to connect to specific brokers. These and other constraints will require you to put in extra effort to build a reliable Kubernetes deployment.
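To make the contrast concrete, here is a minimal sketch of what a stateful workload demands from Kubernetes: stable per-pod identity and a dedicated persistent disk per replica. This is a generic illustration (names, image, and sizes are made up), not one of Koperator's actual manifests:

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: broker
spec:
  serviceName: broker-headless   # stable per-pod DNS, so clients can reach a *specific* broker
  replicas: 3
  selector:
    matchLabels:
      app: broker
  template:
    metadata:
      labels:
        app: broker
    spec:
      containers:
        - name: broker
          image: example/broker:latest   # illustrative image
          volumeMounts:
            - name: data
              mountPath: /var/lib/broker
  volumeClaimTemplates:                  # each pod gets its own persistent volume
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi
```

None of this machinery is needed for a stateless deployment behind a load balancer, which is why stateful services take noticeably more care.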

A Kafka development setup might look straightforward: there are ready-to-use Docker images and even plug-and-play docker-compose.yml files. For a production deployment, however, the to-do list gets longer. Check out this blog post to get a feeling for the scope. Suffice it to say that if you're in a time crunch, sorting through the intricate details of a Kafka production deployment is an exercise in futility. You need a production deployment now. And here's the solution.

A reproducible Kafka deployment on Kubernetes

Instead of getting into the weeds of questions like "should I use a StatefulSet or a regular deployment," you can leverage Koperator to make all the decisions for you.

Koperator is an open-source Kafka operator that enables you to set up a production-ready Kafka cluster within minutes, leveraging standard tooling you are (likely) already using. It abstracts away many of the decisions around a Kafka deployment, which saves you a lot of time. You get to run Kafka on your premises, according to your rules and regulations, but with the advantages of a managed service.

Try it out

For the following instructions, we are using Kind—a tool for running local Kubernetes clusters using Docker container nodes. If you haven't installed Kind yet, please follow these instructions.

kind create cluster
kubectl cluster-info --context kind-kind

This will start up a new Kind cluster and set the current context to Kind.

Koperator requires 6 vCPUs and 8GB RAM. If you're using Kind, please make sure to allocate enough resources to your local Docker daemon; otherwise, containers will fail to start.
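On Linux, Kind nodes can use all of the host's resources; on macOS and Windows, raise the CPU and memory limits in Docker Desktop's settings. If you'd also like to spread the brokers across several nodes, you can pass Kind a cluster config (the node layout here is illustrative):

```yaml
# kind-config.yaml -- use with: kind create cluster --config kind-config.yaml
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
  - role: control-plane
  - role: worker
  - role: worker
  - role: worker
```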

Please also make sure to install Helm if you haven't already.

Install Apache Zookeeper™

As a first step, we install Zookeeper using Pravega's Zookeeper Operator:

helm install zookeeper-operator --repo https://charts.pravega.io zookeeper-operator --namespace=zookeeper --create-namespace

Next, we create a Zookeeper cluster using the operator's custom resources:

kubectl create -f - <<EOF
apiVersion: zookeeper.pravega.io/v1beta1
kind: ZookeeperCluster
metadata:
  name: zookeeper
  namespace: zookeeper
spec:
  replicas: 1
  persistence:
    reclaimPolicy: Delete
EOF

Instead of going through a lengthy deployment and service configuration, we get the service up and running with a few lines of configuration.

Before moving on, let's verify Zookeeper is up and running:

kubectl get pods -n zookeeper

The above command should output something like:

NAME                                  READY   STATUS    RESTARTS   AGE
zookeeper-0                           1/1     Running   0          27m
zookeeper-operator-54444dbd9d-2tccj   1/1     Running   0          28m

Install Koperator

Now on to Koperator. We will install it in two steps. First, we'll install the Koperator CustomResourceDefinition resources. We perform this step separately, to allow you to uninstall and reinstall Koperator without deleting your already installed custom resources.

# Substitute the kafka-operator CRD manifest URL for the Koperator release you're installing
kubectl create --validate=false -f <koperator-crd-manifest-url>

Next, install Koperator into the Kafka namespace:

helm install kafka-operator --repo https://kubernetes-charts.banzaicloud.com kafka-operator --namespace=kafka --create-namespace

Create the Kafka cluster using the KafkaCluster custom resource. The quick start uses a minimal custom resource:

# Substitute the URL of the minimal KafkaCluster sample from the Koperator repository
kubectl create -n kafka -f <kafkacluster-sample-url>

Verify that the Kafka cluster has been created:

kubectl get pods -n kafka

NAME                                      READY   STATUS    RESTARTS   AGE
kafka-0-nvx8c                             1/1     Running   0          16m
kafka-1-swps9                             1/1     Running   0          15m
kafka-2-lppzr                             1/1     Running   0          15m
kafka-cruisecontrol-fb659b84b-7cwpn       1/1     Running   0          15m
kafka-operator-operator-8bb75c7fb-7w4lh   2/2     Running   0          17m

Test Kafka cluster

To test the Kafka cluster, let's create a topic and send some messages.

If you have used Kafka before, you might recall the necessary steps to create a topic. Kafka ships with a bunch of utility command-line tools to help with administrative tasks.

While we could use that workflow, it'd require a few steps:

  • Decide which pod we want to connect to
  • Open a shell to the Kafka broker pod
  • Find the command-line tools
  • Run the tool

If you've thought about automating topic creation (e.g., as part of your CI/CD workflow), codifying these steps is possible but cumbersome. Instead, let's use kubectl and a few lines of YAML configuration:

kubectl create -n kafka -f - <<EOF
apiVersion: kafka.banzaicloud.io/v1alpha1
kind: KafkaTopic
metadata:
  name: my-topic
spec:
  clusterRef:
    name: kafka
  name: my-topic
  partitions: 1
  replicationFactor: 1
  config:
    "retention.ms": "604800000"
    "cleanup.policy": "delete"
EOF

This snippet creates a topic called my-topic with one partition and a replication factor of 1.
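The retention value above (604800000, in milliseconds) is easy to sanity-check—it works out to exactly one week:

```shell
# convert 604800000 ms to days: ms -> seconds -> minutes -> hours -> days
echo $(( 604800000 / 1000 / 60 / 60 / 24 ))
# prints: 7
```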

With the topic in place, let's start a producer and consumer to test it. Run the following command to start a simple producer:

# kubectl run needs an --image; use any image that bundles the Kafka CLI tools
kubectl -n kafka run kafka-producer -it --image=<kafka-image> --rm=true --restart=Never -- /opt/kafka/bin/kafka-console-producer.sh --bootstrap-server kafka-headless:29092 --topic my-topic

To receive messages, run the following command:

kubectl -n kafka run kafka-consumer -it --image=<kafka-image> --rm=true --restart=Never -- /opt/kafka/bin/kafka-console-consumer.sh --bootstrap-server kafka-headless:29092 --topic my-topic --from-beginning

So there you have it!

Running Apache Kafka on Kubernetes is possible—even if you're short on time. What's more, it does not require you to become an expert first. Instead, you become an expert while running Kafka in production using Koperator. Koperator abstracts away some aspects of the Kafka deployment and also provides you with a convenient user interface through kubectl. Check out the GitHub repository and try it for yourself!
