This post is part of a series on running Kafka on Kubernetes on Azure. You can find links to other posts in the series here. All code is available in my GitHub.
In this part, we'll install Kafka to our Kubernetes cluster using Strimzi and try it out.
As discussed in part 1 of this series, running Kafka on Kubernetes widens the choice of deployment platform. Managed Kubernetes is provided by almost every cloud provider, whereas managed Kafka is rarer.
Strimzi makes it quite simple to start up your Kafka cluster. Note that Strimzi is not the only option for running Kafka in Kubernetes; another prominent alternative is Confluent for Kubernetes.
I'll install Strimzi on the Azure Kubernetes Service (AKS) cluster that we set up in the previous part of this series, following Strimzi's quickstart. Select the "Minikube" option; the same steps should work against AKS.
I'll first create a new namespace for our Kafka resources:
kubectl create namespace kafka
With version 0.32.0, Strimzi has introduced a one-line command for installation:
kubectl create -f 'https://strimzi.io/install/latest?namespace=kafka' -n kafka
The command deploys the Strimzi Cluster Operator, which will run and administer the Kafka resources, along with the Kubernetes service account and RBAC roles and bindings it needs to function. In addition, the process installs several Custom Resource Definitions (CRDs). These enable declarative deployment of the Kafka resources supported by Strimzi.
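Before moving on, it can be worth confirming that the operator actually came up. The following is a minimal sketch, assuming the Deployment is named strimzi-cluster-operator (the name used in the Strimzi install files) and that it runs in the kafka namespace created above. The function name wait_for_operator is only illustrative.

```shell
# Wait until the Strimzi Cluster Operator Deployment reports a successful rollout.
# Assumption: the Deployment is called strimzi-cluster-operator, as in the
# Strimzi install files. $1 is the namespace (defaults to kafka).
wait_for_operator() {
  namespace="${1:-kafka}"
  kubectl rollout status deployment/strimzi-cluster-operator \
    -n "$namespace" --timeout=300s
}

# Usage:
#   wait_for_operator kafka
```

If the rollout does not finish within the timeout, kubectl exits with a non-zero status, which makes the function usable in scripts.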
For convenience, I'll set kafka as the default namespace to avoid having to write it out each time:
kubectl config set-context --current --namespace=kafka
You can also check out the CRDs that were installed:
kubectl get crd
The console shows several resources with the word kafka in them. If interested, you can get further details with kubectl describe. For example:
kubectl describe crd kafkas.kafka.strimzi.io
Now that is a long one! Lucky that you don't need to implement all that yourself. 😄
Creating a Kafka cluster
With the CRDs created, I can deploy a Kafka cluster using a single resource definition in a YAML file. I'll start with the sample YAML provided by Strimzi in their quickstart, linked previously. All scripts used in this post are also available in the series' GitHub repository.
The YAML is as follows:
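For reference, a single-node definition in the shape of Strimzi's single-node example can be written out as in the sketch below. The field values (Kafka version, volume sizes, replication settings) are assumptions based on that example for Strimzi 0.32 and are not necessarily identical to the file used in this post.

```shell
# Write a single-node Kafka definition to kafka-cluster.yaml.
# Assumption: values follow the shape of Strimzi's single-node example;
# adjust the version and volume sizes to your own setup.
cat > kafka-cluster.yaml <<'EOF'
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: my-cluster
spec:
  kafka:
    version: 3.3.1
    replicas: 1
    listeners:
      - name: plain
        port: 9092
        type: internal
        tls: false
      - name: tls
        port: 9093
        type: internal
        tls: true
    config:
      offsets.topic.replication.factor: 1
      transaction.state.log.replication.factor: 1
      transaction.state.log.min.isr: 1
      default.replication.factor: 1
      min.insync.replicas: 1
    storage:
      type: jbod
      volumes:
        - id: 0
          type: persistent-claim
          size: 100Gi
          deleteClaim: false
  zookeeper:
    replicas: 1
    storage:
      type: persistent-claim
      size: 100Gi
      deleteClaim: false
  entityOperator:
    topicOperator: {}
    userOperator: {}
EOF
```

The metadata.name value is what ends up as the prefix of the Services created later, and spec.kafka.listeners is the list that any additional listener gets appended to.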
There are a lot of possible configurations when setting up the cluster. I'll not go into those in this post; that's a possible topic for the future. 🙂
I'll create the cluster with
kubectl apply -f kafka-cluster.yaml
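The cluster takes a little while to start. As a small sketch, kubectl wait can block until the Kafka resource reports the Ready condition. The cluster name my-cluster and the kafka namespace are assumptions taken from the sample setup, and the function name wait_for_kafka is only illustrative.

```shell
# Block until the Kafka custom resource reports condition Ready,
# or fail after five minutes.
# $1: cluster name (defaults to my-cluster), $2: namespace (defaults to kafka).
wait_for_kafka() {
  cluster="${1:-my-cluster}"
  namespace="${2:-kafka}"
  kubectl wait "kafka/$cluster" --for=condition=Ready \
    --timeout=300s -n "$namespace"
}

# Usage:
#   wait_for_kafka my-cluster kafka
```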
You can have a look at the Kubernetes Services that this created:
kubectl get service
There are many services prefixed with the name of your cluster; if you used the sample YAML, the prefix is my-cluster. These services include:
ZooKeeper: ZooKeeper is Apache's general-purpose coordination service for distributed systems, used by Kafka and several other projects like Hadoop. Note that you shouldn't need to interact with it directly; it just works in the background. Also, Kafka is moving away from ZooKeeper with its built-in KRaft mode, and Strimzi is working on removing this dependency to simplify the setup even further.
Kafka Brokers: As discussed in part 1, brokers are the actual worker servers containing all the topics and messages in Kafka. You can think of them as comparable to "nodes" in most other distributed services. We only have one broker in our setup, but we could scale up our cluster by simply increasing the value in spec.kafka.replicas in the YAML.
Bootstrap: Strimzi simplifies connecting to Kafka by providing a bootstrap service. A client only needs the address of this service; from there, the Kafka protocol lets the client discover the brokers and connect to the one hosting your target topic.
Now, you don't see an External IP on any of these services, and you'll need one for connecting to Kafka from outside Kubernetes. For this, you need to add an external listener. This is luckily easy to do - I'll add the following entry to the listeners list under spec.kafka in the cluster YAML:
- name: external
  port: 9094
  type: loadbalancer
  tls: false
If you now apply the YAML and list your services, you'll see a service with an external IP for your broker and an external bootstrap service. The external bootstrap works the same as the internal bootstrap already added - you can use this as the single entry point for clients.
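To see only the Services that belong to the cluster, together with any external IP they have been assigned, a label selector and custom columns can help. This is a sketch that assumes Strimzi's strimzi.io/cluster label is set on the Services and that the cluster is named my-cluster; the function name is illustrative.

```shell
# List the Services of one Strimzi-managed cluster with their type and
# external IP. Services without a load balancer show <none> in the last column.
# Assumption: Services carry the label strimzi.io/cluster=<cluster name>.
list_cluster_services() {
  cluster="${1:-my-cluster}"
  namespace="${2:-kafka}"
  kubectl get service -n "$namespace" -l "strimzi.io/cluster=$cluster" \
    -o 'custom-columns=NAME:.metadata.name,TYPE:.spec.type,EXTERNAL-IP:.status.loadBalancer.ingress[0].ip'
}

# Usage:
#   list_cluster_services my-cluster kafka
```

It can take a minute or two before Azure provisions the load balancers, so the external IP column may be empty at first.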
I now created an external listener of type "LoadBalancer", but there are other types. You can find more information in this series of posts by Strimzi.
WARNING: You'll also see that tls is set to false. This means that communication to Kafka is not encrypted, so it's highly insecure. I'll return to this topic in the next part of this series, where I configure security for the Kafka cluster.
For now, let's continue with these settings; however, don't send anything sensitive to your Kafka. After testing, you can remove the external listener or stop your AKS cluster to limit exposure.
The Strimzi quickstart includes instructions for testing the Kafka cluster from inside Kubernetes. In this post, you can also try it out from outside Kubernetes through the external listener. You can do this with the same Docker image used in the quickstart, but with local Docker instead of Kubernetes - so you'll need Docker running.
First, get the external IP of your external bootstrap service, for example, my-cluster-kafka-external-bootstrap. With this in hand, start the console producer:
docker run -it --rm --name kafka-producer quay.io/strimzi/kafka:0.32.0-kafka-3.3.1 bin/kafka-console-producer.sh --bootstrap-server EXTERNAL_BOOTSTRAP_IP:9094 --topic my-topic
-it connects an interactive terminal to the running container, and --rm automatically removes the container when you exit the console.
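Rather than copying the IP by hand, the EXTERNAL_BOOTSTRAP_IP placeholder can be filled from kubectl with a JSONPath query. A minimal sketch, assuming the Service exposes an IP address (as opposed to a hostname) and lives in the kafka namespace; the function name external_ip is illustrative.

```shell
# Print the external IP assigned to a LoadBalancer Service.
# Prints nothing if the load balancer has not been provisioned yet.
# $1: Service name, $2: namespace (defaults to kafka).
external_ip() {
  service="$1"
  namespace="${2:-kafka}"
  kubectl get service "$service" -n "$namespace" \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
}

# Usage:
#   EXTERNAL_BOOTSTRAP_IP="$(external_ip my-cluster-kafka-external-bootstrap)"
#   echo "$EXTERNAL_BOOTSTRAP_IP:9094"
```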
In another terminal, start the consumer:
docker run -it --rm --name kafka-consumer quay.io/strimzi/kafka:0.32.0-kafka-3.3.1 bin/kafka-console-consumer.sh --bootstrap-server EXTERNAL_BOOTSTRAP_IP:9094 --topic my-topic --from-beginning
Write a message in the producer console, and you should see it appear in the consumer console. If so, you now have a functioning Kafka cluster in AKS! 🎉
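The same round trip can also be run without interactive consoles, which is handy for a quick scripted check. The sketch below reuses the image and topic from the commands above; the function name smoke_test, the message text, and the 30-second timeout are assumptions made for illustration.

```shell
# Send one message and read one message back without an interactive console.
# $1: bootstrap address, for example EXTERNAL_BOOTSTRAP_IP:9094
# $2: topic name (defaults to my-topic)
IMAGE="quay.io/strimzi/kafka:0.32.0-kafka-3.3.1"

smoke_test() {
  bootstrap="$1"
  topic="${2:-my-topic}"
  # -i keeps stdin open so the piped message reaches the console producer.
  echo "hello from outside AKS" | docker run -i --rm "$IMAGE" \
    bin/kafka-console-producer.sh \
    --bootstrap-server "$bootstrap" --topic "$topic"
  # Read one message from the start of the topic, or give up after
  # 30 seconds without data.
  docker run --rm "$IMAGE" \
    bin/kafka-console-consumer.sh \
    --bootstrap-server "$bootstrap" --topic "$topic" \
    --from-beginning --max-messages 1 --timeout-ms 30000
}

# Usage:
#   smoke_test "EXTERNAL_BOOTSTRAP_IP:9094" my-topic
```

Because the consumer reads from the beginning of the topic, the message it prints is the oldest one in the topic, which is not necessarily the one just sent if the topic already contains data.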
That's it for this post. Next time we'll look into setting up security for Kafka - see you then!