If you plan to use Kafka in your application, here's everything you
need to know to start with a local developer setup.
Apache Kafka is a powerful tool for stream processing and decoupling
service-to-service communication. Once up and running, leveraging Kafka
in your architecture can significantly improve overall application
performance and reliability.
If you've thought about using Kafka but need help figuring out where to
start, this guide is for you.
Here's everything you'll learn in this guide:
- Kafka Terminology
- How to get started with Kafka on your local developer machine
- How to send test messages
- Useful command-line tools
Here's what you'll need to follow along with this tutorial:
- Docker Compose
Kafka introduces a few new concepts and keywords. We'll cover the
essential concepts here, so you know enough to get started.
By default, Kafka runs in a cluster of several so-called brokers
(each broker is a Kafka instance running on a dedicated machine).
Relying on a cluster instead of a single instance has several
advantages:
- Increased level of fault tolerance: If a broker goes offline, producers and consumers can communicate with a different machine. Brokers replicate data among instances to prevent data loss.
- Performance: With more than one available broker, not all consumers read from the same instance. Therefore, a single broker is less likely to become a bottleneck.
To coordinate brokers, Kafka relies on a tool called Zookeeper. Among
other duties, Zookeeper manages the replication of topics and consumer
When a consumer reads data from a partition, it keeps track of the
latest event it has read with an offset value. This integer value indicates
the last read position. The consumer syncs the offset to Kafka or
Zookeeper so that, if the consumer crashes, it can quickly recover from
its last known read position.
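To make the offset idea a bit more concrete, here is a minimal in-memory sketch in Python. It does not talk to Kafka at all: a plain list stands in for a partition, and the OffsetStore and Consumer classes are hypothetical helpers invented for this illustration. The sketch also assumes that the committed value is the position of the next event to read.

```python
class OffsetStore:
    """Stands in for wherever committed offsets are kept (hypothetical helper)."""

    def __init__(self):
        self._committed = {}

    def commit(self, consumer_id, next_offset):
        self._committed[consumer_id] = next_offset

    def fetch(self, consumer_id):
        # A consumer without a committed offset starts at the beginning.
        return self._committed.get(consumer_id, 0)


class Consumer:
    """Toy consumer that reads a list-backed partition one event at a time."""

    def __init__(self, consumer_id, partition, store):
        self.consumer_id = consumer_id
        self.partition = partition  # a plain list models the partition's log
        self.store = store
        self.position = store.fetch(consumer_id)  # resume from the last commit

    def poll(self):
        """Return the next event, or None if the consumer is caught up."""
        if self.position >= len(self.partition):
            return None
        event = self.partition[self.position]
        self.position += 1
        # Assumption: we commit the position of the next event to read.
        self.store.commit(self.consumer_id, self.position)
        return event


partition = ["event-0", "event-1", "event-2"]
store = OffsetStore()

first = Consumer("demo", partition, store)
seen_before_crash = [first.poll(), first.poll()]

# Simulate a crash: the consumer object is gone, the committed offset is not.
del first
restarted = Consumer("demo", partition, store)
seen_after_restart = restarted.poll()

print(seen_before_crash)   # ['event-0', 'event-1']
print(seen_after_restart)  # event-2
```

The restarted consumer picks up at the third event because the offset survived outside the consumer itself. A real client decides when to commit, but the recovery principle is the same.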
We don't have to worry too much about Zookeeper for our local
development environment, but it has to be there for Kafka to work.
If you'd like to learn more about Kafka's core concepts, check out this
Apache Kafka is written in Java. The advantage: It runs wherever Java
runs. The downside: You need to have a Java Runtime installed and
configured. Therefore, we'll leverage a Docker Compose setup instead
of downloading and configuring Kafka directly on your system. Your
advantage: You're up and running in no time without having to install
and configure a Java Runtime. In a new directory, create a new file named
docker-compose.yml and paste in the following content:
version: '3'
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:7.3.0
    container_name: zookeeper
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
  broker:
    image: confluentinc/cp-kafka:7.3.0
    container_name: broker
    ports:
      - "9092:9092"
    depends_on:
      - zookeeper
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: 'zookeeper:2181'
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_INTERNAL:PLAINTEXT
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://localhost:9092,PLAINTEXT_INTERNAL://broker:29092
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1
      KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1
This YAML snippet configures a single Kafka broker with a Zookeeper
instance.
Once you have saved the
docker-compose.yml, run it with:
$ docker-compose up
We'll leave this in the foreground to monitor the log for potential
errors.
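If you'd rather check from a script whether the broker's port is open, a small Python sketch like the following might help. It only tests that something accepts TCP connections on localhost:9092 (the port mapped in the docker-compose.yml), so it does not prove that Kafka has finished starting. The function name broker_is_reachable is made up for this example.

```python
import socket


def broker_is_reachable(host="localhost", port=9092, timeout=1.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


if __name__ == "__main__":
    status = "reachable" if broker_is_reachable() else "not reachable (yet)"
    print(f"Port 9092 on localhost is {status}")
```

A successful check is a quick sanity signal, not a readiness guarantee. The container log remains the better place to look when something goes wrong.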
So, with Kafka up and running, how do we continue?
The next step is a housekeeping item. Before producing and consuming
events, we need to create a topic.
A topic is a logical space that contains events of the same kind or for
a specific use case, such as:
Kafka provides a set of command-line utilities to create a new topic.
We'll focus on the essentials for now, so we can start producing
events:
$ docker exec broker \
    kafka-topics --bootstrap-server broker:9092 \
    --create \
    --topic quickstart
We directly run the necessary script inside the broker's Docker
container. When creating a topic, we can configure and customize several
different options. For the sake of simplicity, we create a topic with
the default settings.
With the topic created, let's publish our first message. Run the
following command:
$ docker exec --interactive --tty broker \
    kafka-console-producer --bootstrap-server broker:9092 \
    --topic quickstart
Once the command prompt has finished loading, let's type in some text. Each
line starts a new message:
Hello, World!
Test Message
The producer publishes the messages to the topic once you hit return.
With the producer in place, let's consume our messages. In a new tab,
run the following command:
$ docker exec --interactive --tty broker \
    kafka-console-consumer --bootstrap-server broker:9092 \
    --topic quickstart \
    --from-beginning
Note the --from-beginning parameter. It instructs the consumer to
read from the beginning of the topic. Kafka does not delete a message
once it has been read: it keeps messages in a topic until the configured
retention period expires (seven days by default). When a consumer has
read a message, it saves its latest offset (current clients commit it to
Kafka itself; older clients stored it in Zookeeper). If this particular
consumer dies, it can retrieve the latest offset and recover from where
it left off reading from the topic.
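The effect of --from-beginning can be sketched with a tiny in-memory model in Python. Nothing here connects to Kafka: TopicLog, start_offset, and read_from are hypothetical names for this illustration, and the sketch assumes that a new consumer without the flag starts at the end of the log and only sees messages that arrive afterwards.

```python
class TopicLog:
    """Append-only list standing in for a single-partition topic."""

    def __init__(self):
        self.messages = []

    def append(self, message):
        self.messages.append(message)

    def end_offset(self):
        # Offset that the next appended message will receive.
        return len(self.messages)


def start_offset(topic, from_beginning):
    """Pick the first offset a brand-new consumer reads from."""
    return 0 if from_beginning else topic.end_offset()


def read_from(topic, offset):
    """Return every message at or after the given offset."""
    return topic.messages[offset:]


topic = TopicLog()
topic.append("Hello, World!")
topic.append("Test Message")

# Both consumers attach after the first two messages were produced.
replay_offset = start_offset(topic, from_beginning=True)
latest_offset = start_offset(topic, from_beginning=False)

topic.append("Sent later")

print(read_from(topic, replay_offset))  # ['Hello, World!', 'Test Message', 'Sent later']
print(read_from(topic, latest_offset))  # ['Sent later']
```

The replaying consumer sees the full history because the messages are still in the log, while the other one only sees what arrives after it attached.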
With the consumer running, we now see the following output:
Hello, World!
Test Message
This approach is helpful to get started and get a first feeling for
Kafka and its mechanisms. In your application, however, you'll use the
Kafka API to produce and consume messages.
Before we come to an end here, let's explore one additional helpful
tool: kcat (formerly known as kafkacat).
While the command-line tools in the container are helpful, they require
the JVM to be present. In scenarios where no Java Runtime is installed,
or it wouldn't be feasible to install it, kcat comes in handy. It
connects to a Kafka cluster or broker and allows you to produce and
consume messages.
kcat in producer mode:
$ echo "TEST" | kcat -P -b localhost:9092 -t quickstart
kcat publishes whatever input we pipe into it. For
demonstration purposes, we use a simple echo command.
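The same piping idea could be driven from a script. The following Python sketch builds the kcat producer command shown above and feeds it one message per line via standard input. The helper names kcat_producer_command and publish_lines are invented for this example, and publish_lines assumes kcat is installed and on the PATH.

```python
import subprocess


def kcat_producer_command(topic, broker="localhost:9092"):
    """Build the argument list for kcat in producer mode (-P)."""
    return ["kcat", "-P", "-b", broker, "-t", topic]


def publish_lines(lines, topic, broker="localhost:9092"):
    """Pipe one message per line into kcat; requires kcat on the PATH."""
    payload = "".join(f"{line}\n" for line in lines)
    subprocess.run(
        kcat_producer_command(topic, broker),
        input=payload,
        text=True,
        check=True,  # raise if kcat exits with an error
    )


if __name__ == "__main__":
    # Only prints the command; call publish_lines() to actually send.
    print(" ".join(kcat_producer_command("quickstart")))
```

With the broker from this guide running, publish_lines(["TEST"], "quickstart") should have the same effect as the echo pipeline above.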
kcat in consumer mode:
$ kcat -C -b localhost:9092 -t quickstart
kcat comes in handy whenever you need to test your application's
producer or consumer side.
Even if Kafka requires some upfront learning, with tools like Docker and
kcat at our disposal, spinning up a dev instance of a broker becomes
much more manageable. Now that you can run Kafka locally, it's time to
start preparing your application code to publish and subscribe to
topics.
If you're thinking about deploying your (first) Kafka-based application
to production, check out Calisti. Calisti enables
you to run Kafka on a Kubernetes cluster without extensive and manual
Also, check out this blog post to learn how to use Kafka client libraries in your code.