Apache Kafka is a widely adopted event streaming platform. Its use cases around stream processing and event-driven programming have grown in popularity and influence over the last few years. If you're brand-new to Kafka and plan to use it in an upcoming project, this guide is for you.
We will cover the essentials you need to know, tools and technologies around Kafka, and how to run it in production.
Today's applications face very different requirements than those of a decade ago. Think about the ridesharing app you're using: on a map, you'd like to see frequent location updates of your driver or your friend. Or you might need to capture user events from mobile and web clients. While the processing might happen a bit later, you must first ingest the sheer amount of data. Whatever the specific use case you're trying to solve, Kafka often comes up as a core building block.
Apache Kafka comes with its own set of terminology and concepts. Before you start using it, it's crucial to understand these terms. Below is a list of things you need to know to get started.
In Kafka terminology, producers send data to Kafka. On the receiving end, consumers read data from a specific topic. At first sight, a Kafka consumer resembles a consumer in a message-queue context. More than one consumer can read from a given topic. When it's crucial that a message is processed by only one of them, however, you can add the consumers to a so-called consumer group. Within this group, each message is delivered to only one consumer.
Learn more about consumers and consumer groups
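To make the consumer-group behavior concrete, here is a minimal in-memory sketch in Python. It does not talk to a real broker. It assumes a topic with four partitions (partitions are introduced further down in this guide) and a simplified round-robin assignment. The consumer names (`billing-1`, `billing-2`, `audit-1`) and both helper functions are hypothetical. Real Kafka clients coordinate the assignment through the cluster, so treat this only as an illustration of the idea that each partition, and therefore each message, has exactly one owner per group.

```python
from collections import defaultdict


def assign_partitions(partitions, consumers):
    """Spread partitions round-robin over the consumers of one group."""
    assignment = defaultdict(list)
    for index, partition in enumerate(partitions):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return dict(assignment)


def deliver(messages, assignment):
    """Return, per consumer, the payloads from the partitions it owns."""
    owner_of = {p: c for c, parts in assignment.items() for p in parts}
    received = defaultdict(list)
    for partition, payload in messages:
        received[owner_of[partition]].append(payload)
    return dict(received)


# Assumed example data: (partition, payload) pairs on a four-partition topic.
partitions = [0, 1, 2, 3]
messages = [(0, "a"), (1, "b"), (2, "c"), (3, "d"), (0, "e")]

# One group with two members: the messages are split between them.
billing = assign_partitions(partitions, ["billing-1", "billing-2"])
billing_received = deliver(messages, billing)

# A second, independent group with one member still sees every message.
audit = assign_partitions(partitions, ["audit-1"])
audit_received = deliver(messages, audit)

print(billing_received)
print(audit_received)
```

Within the two-member group no payload appears twice, while the separate single-member group receives all five payloads. That is the difference between adding a consumer to an existing group and starting a new group.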
A topic is a named logical channel between producers and consumers. A Kafka cluster can hold many different topics. For instance, one topic might be user_events, containing client event data. Another topic might store payment-processed information. Topics are the primary way to organize data into different logical groups. Learn about topics in depth here.
Another aspect of writing code that works with Kafka is partitioning. A topic is split into partitions, and partitions are Kafka's mechanism for processing data in parallel. Read more here to learn about the partition mechanism.
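As a rough illustration of how a keyed message can end up in a partition, the sketch below hashes the message key and takes the result modulo the partition count. The function name `choose_partition`, the keys, and the partition count of 6 are assumptions made for this example. Kafka's Java client uses a murmur2 hash for keyed messages; `zlib.crc32` is swapped in here only because it ships with Python and is deterministic across runs, so the partition numbers will not match what a real client computes.

```python
import zlib


def choose_partition(key: str, num_partitions: int) -> int:
    """Map a message key to a partition index in [0, num_partitions)."""
    # crc32 is a stand-in hash: stable across runs, unlike Python's hash().
    return zlib.crc32(key.encode("utf-8")) % num_partitions


# Hypothetical events keyed by rider ID.
events = [
    ("rider-17", "location"),
    ("rider-42", "location"),
    ("rider-17", "payment"),
]

for key, event in events:
    print(key, event, "-> partition", choose_partition(key, 6))
```

Because the mapping depends only on the key, both `rider-17` events land in the same partition, which is what keeps events for one key in order relative to each other.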
Are you ready to build your first Kafka-based application? Check out these blog posts to help you get started:
Every application's success depends on a well-working local development environment. Independent of any programming language, this blog post walks you through the few steps it takes to get Kafka up and running on your machine.
Once you have Kafka up and running, let's build your first application. In this blog post, you'll learn the fundamentals needed for building both a producer and a consumer.
Ready for the next step? Then let's talk about building a producer that knows how to partition data. Click here to start building.
In a growing application landscape with many services, something you'd like to avoid at all costs is a scenario where a malformed message causes a cascade of errors across consumers. If you enforce a uniform message format across all producers and consumers early on, that's one less set of problems to worry about. Schema Registry, which runs alongside Kafka, helps you enforce message formats consistently.
In this blog post, you'll learn how to set up Schema Registry and how to incorporate related concepts into your producers and consumers.
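The idea of rejecting malformed messages before they reach any consumer can be sketched without a registry at all. The example below is a plain-Python stand-in, not the Schema Registry API: the schema `USER_EVENT_SCHEMA`, its field names, and the `validate` and `serialize` helpers are all hypothetical. A real setup would typically store Avro, JSON Schema, or Protobuf schemas in the registry and let the client's serializer perform the check.

```python
import json

# Hypothetical schema for a user_events topic: field name -> expected type.
USER_EVENT_SCHEMA = {"user_id": str, "event_type": str, "timestamp_ms": int}


def validate(record: dict, schema: dict) -> list:
    """Return a list of problems; an empty list means the record conforms."""
    problems = []
    for field, expected in schema.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected):
            problems.append(f"wrong type for {field}: expected {expected.__name__}")
    return problems


def serialize(record: dict, schema: dict) -> bytes:
    """Refuse to produce bytes for a record that does not match the schema."""
    problems = validate(record, schema)
    if problems:
        raise ValueError("; ".join(problems))
    return json.dumps(record, sort_keys=True).encode("utf-8")


good = {"user_id": "u-1", "event_type": "click", "timestamp_ms": 1700000000000}
bad = {"user_id": 7, "event_type": "click"}

print(serialize(good, USER_EVENT_SCHEMA))
print(validate(bad, USER_EVENT_SCHEMA))
```

Running the check on the producer side means a bad record fails where it is created, instead of failing later in every consumer that reads it.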
Running Apache Kafka in production can look different based on your requirements. Unlike a development cluster, a production-ready Kafka deployment must meet fault tolerance, permissions, and backup requirements. Besides managed hosting, here are some resources to consider.
Sometimes, you must run Kafka behind your firewall on your own infrastructure. In such cases, check out this guide to installing Kafka on Kubernetes. While it doesn't cover every possible aspect, it gets you started quickly.
Compared to a manual installation, this guide takes a slightly different approach. Instead of writing every YAML configuration by hand, the method shown here allows you to hand off some of the responsibilities and decisions to an operator. Click here to learn how you can set up your Kafka cluster with Koperator.
Many Kafka deployments are part of a microservice architecture. Besides installing and managing Kafka, it's also crucial for you to ensure all services are healthy and compliant. That's where Calisti comes in.
Calisti combines Kafka management with observability for your microservice architecture. Sign up here for a free account.