Yes, you read that correctly. Currently, Kafka uses ZooKeeper to store metadata about partitions and brokers, and to elect a broker to act as the Kafka controller. The controller pushes state change notifications to the other brokers in the cluster, but brokers can miss some of these changes even after several retries, which can leave them in a divergent state. The state in ZooKeeper also often doesn’t match the state held in memory in the controller, so the metadata in ZooKeeper and in the Kafka controller may be out of sync.
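To make the divergence problem concrete, here is a minimal sketch of push-based propagation. It is a toy model, not Kafka code: `Broker`, `PushController`, and the `unreachable` parameter are hypothetical names invented for illustration, and the retry loop only imitates the idea of a notification that never arrives.

```python
# Toy model only: these classes are hypothetical and are not Kafka code.

class Broker:
    def __init__(self, name):
        self.name = name
        self.partition_leaders = {}  # partition -> leader id (in-memory cache)

    def apply(self, partition, leader):
        self.partition_leaders[partition] = leader


class PushController:
    """Holds the authoritative state and pushes each change to every broker."""

    def __init__(self, brokers, max_retries=3):
        self.brokers = brokers
        self.max_retries = max_retries
        self.partition_leaders = {}

    def change_leader(self, partition, leader, unreachable=()):
        self.partition_leaders[partition] = leader
        for broker in self.brokers:
            for _attempt in range(self.max_retries):
                if broker.name in unreachable:
                    continue  # delivery failed, try again
                broker.apply(partition, leader)
                break
            # If every attempt failed, the change is lost for this broker.


b1, b2 = Broker("b1"), Broker("b2")
controller = PushController([b1, b2])
controller.change_leader("orders-0", leader=1)
controller.change_leader("orders-0", leader=2, unreachable={"b2"})

print(b1.partition_leaders)  # {'orders-0': 2}
print(b2.partition_leaders)  # {'orders-0': 1}  (stale: b2 missed the change)
```

In this sketch nothing tells `b2` that it missed an update, so it keeps serving the old leader until something else corrects it.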
To overcome these challenges, Kafka is set to remove its dependency on ZooKeeper. With KIP-500, Kafka’s metadata will be stored in Kafka itself rather than in an external system, ZooKeeper. In effect, this is “Kafka on Kafka”: Kafka no longer needs its keeper. The change will also simplify the deployment and configuration of Kafka and improve scalability. Metadata scalability is a key part of scaling Kafka in the future, and a single Kafka cluster is expected to eventually support a million partitions or more.
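KIP-500 describes the metadata as an ordered log of change events that brokers fetch and replay, with each broker tracking how far it has read. The following is a minimal sketch of that idea under simplifying assumptions, not Kafka's implementation: `MetadataLog`, `LogBroker`, and `catch_up` are hypothetical names, and the log is a plain in-memory list.

```python
# Toy model only: hypothetical names, not Kafka's actual implementation.

class MetadataLog:
    """Append-only, ordered list of metadata change records."""

    def __init__(self):
        self.records = []

    def append(self, partition, leader):
        self.records.append((partition, leader))

    def read_from(self, offset):
        return self.records[offset:]


class LogBroker:
    def __init__(self):
        self.next_offset = 0          # position of the next record to read
        self.partition_leaders = {}

    def catch_up(self, log):
        # Replay every record not yet seen, in order.
        for partition, leader in log.read_from(self.next_offset):
            self.partition_leaders[partition] = leader
            self.next_offset += 1


log = MetadataLog()
always_on, was_offline = LogBroker(), LogBroker()

log.append("orders-0", 1)
always_on.catch_up(log)
was_offline.catch_up(log)

log.append("orders-0", 2)   # was_offline is down and misses these two
log.append("orders-1", 1)
always_on.catch_up(log)

was_offline.catch_up(log)   # on return it replays from offset 1
print(was_offline.partition_leaders == always_on.partition_leaders)  # True
print(was_offline.next_offset)  # 3
```

Because a broker pulls from its own offset instead of waiting for pushed notifications, a broker that was unreachable for a while can simply resume reading where it left off and converge on the same state.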
To learn more, see the links below:
Confluent link: https://www.confluent.io/blog/removing-zookeeper-dependency-in-kafka/
Apache Wiki: https://cwiki.apache.org/confluence/display/KAFKA/KIP-500%3A+Replace+ZooKeeper+with+a+Self-Managed+Metadata+Quorum