
Explain Apache Kafka like I'm five

Ali Abbas ・1 min read

I've been trying to understand Apache Kafka lately, but I can't quite grasp what it actually is. Can anybody explain it like I'm five?

Discussion


It's a message broker/"database" (what an overloaded word).
It works by having a "chronologically" ordered queue/stream of entries (e.g. events).
A key difference from MQTT and other MQs is that entries are not "consumed" by a subscriber, and a "topic" is not limited to holding only the last message. Instead, the entire stream can be read from the beginning or from some intermediate ID. E.g., someone came and read messages #15-#25; someone else can come later and still read messages #10-#30.
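To make the "reading doesn't consume" point concrete, here is a toy in-memory model. It is only a sketch: `ToyLog` and its methods are made-up names for illustration, not Kafka's client API.

```python
class ToyLog:
    """In-memory stand-in for one ordered stream (hypothetical, not Kafka's API)."""

    def __init__(self):
        self._entries = []

    def append(self, message):
        """Store a message and return its ID (its position in the log)."""
        self._entries.append(message)
        return len(self._entries) - 1

    def read(self, start, end):
        """Return messages with IDs start..end inclusive; nothing is removed."""
        return self._entries[start:end + 1]


log = ToyLog()
for i in range(40):
    log.append(f"event-{i}")

# Two independent readers; the first read does not affect the second.
first_reader = log.read(15, 25)
second_reader = log.read(10, 30)
```

The second reader still sees messages #15-#25 even though someone already read them, because each reader only chooses where to start and the log itself stays unchanged.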
The messages can be cleaned up by expiration or a more complex compaction strategy, so they don't take up infinite space.
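Roughly, the two cleanup styles could look like this. It's a simplified sketch on plain lists: the `(timestamp, key, value)` layout and the function names `expire` and `compact` are assumptions for illustration, and compaction is shown in its usual "keep the latest entry per key" form.

```python
def expire(entries, now, max_age):
    """Expiration: drop every entry older than max_age."""
    return [e for e in entries if now - e[0] <= max_age]


def compact(entries):
    """Compaction: keep only the newest entry for each key, in log order."""
    latest_index = {}
    for index, (_, key, _) in enumerate(entries):
        latest_index[key] = index  # later entries overwrite earlier ones
    keep = set(latest_index.values())
    return [e for i, e in enumerate(entries) if i in keep]


# Each entry is (timestamp, key, value).
entries = [
    (1, "user-1", "a"),
    (2, "user-2", "b"),
    (3, "user-1", "c"),
    (9, "user-3", "d"),
]

recent_only = expire(entries, now=10, max_age=7)
latest_per_key = compact(entries)
```

With expiration, old entries disappear no matter what they contain. With compaction, the older `user-1` entry goes away because a newer one for the same key exists.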
Kafka can even be used as "durable storage"; for example, the New York Times reportedly stores every article it has ever published in a Kafka stream.
For this, it supports replication between multiple nodes, with "writes" appearing as done for the client only after a certain number of replicas receive them.
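The "done only after enough replicas have it" idea can be sketched like this. It's a toy model: `Replica`, `replicated_write` and `required_acks` are hypothetical names, and real Kafka handles this inside the brokers rather than in client code.

```python
class Replica:
    """Toy node that stores messages only while it is online (hypothetical)."""

    def __init__(self, online=True):
        self.online = online
        self.entries = []

    def store(self, message):
        """Try to store the message; return True as an acknowledgement."""
        if not self.online:
            return False
        self.entries.append(message)
        return True


def replicated_write(replicas, message, required_acks):
    """Report success to the client only if enough replicas acknowledged."""
    acks = sum(1 for replica in replicas if replica.store(message))
    return acks >= required_acks


replicas = [Replica(), Replica(), Replica(online=False)]

ok_with_two = replicated_write(replicas, "m1", required_acks=2)
ok_with_three = replicated_write(replicas, "m2", required_acks=3)
```

With one node down, a write that needs two acknowledgements goes through, while one that needs all three is reported as failed. Note that in this toy the failed message still sits on the two online replicas; the client just never gets told it succeeded.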
For high throughput, the streams can also be sharded (Kafka calls the shards partitions), essentially making multiple streams which are each ordered individually, with no ordering guaranteed across them.
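Here is a small sketch of what sharding by key does to ordering. It assumes messages carry a key and that the shard is picked by hashing it; `partition_for` and the CRC32 hash are stand-ins chosen for illustration, not necessarily what Kafka uses internally.

```python
import zlib


def partition_for(key, num_partitions):
    """Pick a shard from the key; CRC32 is only a stand-in hash."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


NUM_PARTITIONS = 3
partitions = [[] for _ in range(NUM_PARTITIONS)]

# (key, sequence number) pairs in the order they were produced.
events = [
    ("alice", 1), ("bob", 1), ("alice", 2),
    ("carol", 1), ("bob", 2), ("alice", 3),
]

for key, seq in events:
    partitions[partition_for(key, NUM_PARTITIONS)].append((key, seq))

# All of one key's events land in the same shard, still in order.
alice_shard = partitions[partition_for("alice", NUM_PARTITIONS)]
alice_seqs = [seq for key, seq in alice_shard if key == "alice"]
```

Everything for `alice` stays in order inside her shard, but nothing tells you whether `bob`'s second event came before or after `carol`'s first if they ended up in different shards.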