
Did Kafka Just Get Easier?

Credit: Mayank Ahuja [in/curiouslearner/]

'Apache Kafka Without ZooKeeper - Using KRaft' (Give it a read.) 👇

๐’๐จ๐ฆ๐ž ๐๐š๐œ๐ค๐ ๐ซ๐จ๐ฎ๐ง๐ -

◾ Apache Kafka is a distributed streaming platform designed for high-throughput, fault-tolerant data pipelines.
◾ It enables real-time data processing, event-driven architectures, and reliable messaging.
◾ Kafka's architecture originally relied on ZooKeeper, an external coordination service.

📌 What was ZooKeeper's role?

◾ Cluster Metadata Management ✔

  • Stored information about brokers, topics, partitions and their configurations.
  • Maintained cluster membership and facilitated broker discovery.

◾ Controller Functionality ✔

  • Elected a leader broker (Controller) responsible for managing cluster operations (e.g., partition reassignment, leader election).
  • Relied heavily on ZooKeeper for metadata updates and coordination.
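
For context, in ZooKeeper mode every broker had to be pointed at the external ensemble. A minimal legacy broker configuration looked roughly like this (the broker ID and hostnames are illustrative, not from the original post):

```properties
# Legacy ZooKeeper-mode broker settings (illustrative values).
broker.id=1
listeners=PLAINTEXT://broker1:9092
# The external ZooKeeper ensemble this broker depends on:
zookeeper.connect=zk1:2181,zk2:2181,zk3:2181
```

That `zookeeper.connect` line is exactly the external dependency the next section is about.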

📌 Let's also talk about some challenges.

◾ External Dependency

  • Required separate deployment and management of ZooKeeper.
  • Increased operational complexity and potential points of failure.

◾ Scalability Limitations

  • ZooKeeper could become a bottleneck for large-scale clusters due to metadata management overhead.

◾ Operational Overhead

  • Maintaining a ZooKeeper ensemble added administrative burdens.

So finally,

Apache Kafka Raft (KRaft) is a consensus protocol, introduced in KIP-500, that eliminates Kafka's reliance on ZooKeeper.

(KIP-500 = Kafka Improvement Proposal 500)

📌 How does it work? (Kafka with KRaft)

◾ With KRaft, Kafka manages its own metadata through a 'metadata quorum' of brokers.

◾ These brokers use the Raft consensus protocol to keep metadata consistent and available, removing the need for ZooKeeper.

◾ Cluster metadata is stored in a dedicated, internal Kafka topic called '__cluster_metadata'.

◾ This topic is replicated across the metadata quorum, so metadata changes stay durable and available even if some brokers fail.

◾ The Kafka Controller, responsible for cluster management tasks such as partition reassignment and leader election, is elected as a leader among the metadata quorum brokers.

◾ Only the leader Controller can modify the metadata. This serializes metadata changes and prevents conflicts.

◾ Whenever metadata changes, the leader Controller appends the changes to the internal '__cluster_metadata' topic.

◾ The other brokers in the quorum follow the leader's decisions and replicate the metadata changes.

◾ If the current leader fails, a new leader is elected automatically.
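
As a concrete sketch, a minimal KRaft setup in 'combined' mode (one node acting as both broker and controller; the node ID, ports, and file paths below are illustrative) looks roughly like:

```properties
# server.properties for a KRaft node in combined broker+controller mode.
process.roles=broker,controller
node.id=1
# The metadata quorum voters, as id@host:port for each controller:
controller.quorum.voters=1@localhost:9093
listeners=PLAINTEXT://:9092,CONTROLLER://:9093
controller.listener.names=CONTROLLER
```

Unlike ZooKeeper mode, a KRaft node's storage must be formatted with a cluster ID before first start:

```shell
# Generate a cluster ID, format the log directories, then start the node.
KAFKA_CLUSTER_ID="$(bin/kafka-storage.sh random-uuid)"
bin/kafka-storage.sh format -t "$KAFKA_CLUSTER_ID" -c config/kraft/server.properties
bin/kafka-server-start.sh config/kraft/server.properties
```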

๐–๐ก๐ข๐œ๐ก ๐ฆ๐ž๐š๐ง๐ฌ,

✔ Simplified architecture.
✔ Improved scalability.
✔ Reduced operational overhead.
✔ Enhanced stability and performance.
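
The quorum behavior described above (a single elected leader serializes metadata writes, and a change counts as committed once a majority of the quorum has replicated it) can be sketched in a toy model. This is an illustration of the idea only, not Apache Kafka's implementation; all class and method names here are hypothetical:

```python
# Toy sketch of a majority-quorum replicated metadata log.
# Illustrative only: real Raft also tracks terms, votes, and log matching.
from dataclasses import dataclass, field

@dataclass
class QuorumNode:
    node_id: int
    log: list = field(default_factory=list)  # this node's copy of the metadata log
    alive: bool = True

class MetadataQuorum:
    def __init__(self, node_ids):
        self.nodes = {i: QuorumNode(i) for i in node_ids}
        self.leader_id = None

    def majority(self):
        return len(self.nodes) // 2 + 1

    def elect_leader(self):
        # Simplified election: pick the live node with the most complete log.
        live = [n for n in self.nodes.values() if n.alive]
        if len(live) < self.majority():
            raise RuntimeError("no quorum: cannot elect a leader")
        self.leader_id = max(live, key=lambda n: len(n.log)).node_id
        return self.leader_id

    def append_metadata(self, record):
        # Only the leader may write; followers replicate its decision.
        if not self.nodes[self.leader_id].alive:
            raise RuntimeError("leader failed; run elect_leader() first")
        acks = 0
        for node in self.nodes.values():
            if node.alive:
                node.log.append(record)  # replicate to each live node
                acks += 1
        if acks < self.majority():
            raise RuntimeError("record not committed: no majority acknowledged")
        return record

quorum = MetadataQuorum([1, 2, 3])
quorum.elect_leader()
quorum.append_metadata({"topic": "orders", "partitions": 6})

# Leader failure: a new leader is elected from the surviving majority,
# and writes continue as long as a majority of nodes is alive.
quorum.nodes[quorum.leader_id].alive = False
quorum.elect_leader()
quorum.append_metadata({"topic": "orders", "partitions": 12})
```

The point of the sketch is the invariant: writes go through one leader, and a record is durable only once a majority holds it, so any future majority (and hence any future leader) is guaranteed to see it.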

📌 ZooKeeper mode was deprecated in Kafka 3.5, and users are encouraged to migrate to KRaft.

📌 ZooKeeper support is slated for complete removal in Kafka 4.0.

โญ Follow

Top comments (0)