Getting the best of both worlds
Having been involved in several large-scale Kafka projects for different clients across a broad range of industries, I have heard my fair share of questions on Apache Kafka — ranging from the fundamental to the esoteric. One question that never seems to go out of fashion is: How can you maintain strict order, yet still process records in parallel?
And it's a fair question. Strict order assumes linearizability, the very notion of which seems to contradict the objectives of parallelism.
We will start by exploring the notion of order.
As expected of an event-streaming platform, Kafka preserves the order of published records, provided those records occupy the same partition. To understand what this means in practice, one needs to explore the architecture of Kafka topics and their underlying sharding mechanism: partitions.
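To make the per-partition ordering guarantee concrete, here is a minimal Python sketch that models a topic as a set of append-only lists. It is a toy model and not Kafka client code: the names `NUM_PARTITIONS`, `partition_for` and `publish` are hypothetical, and CRC32 is used purely as a stand-in hash (Kafka's default partitioner hashes keyed records with murmur2).

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 3  # hypothetical partition count for the toy topic


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a record key to a partition index.

    CRC32 is a stand-in here; the point is only that the mapping is
    deterministic, so equal keys always land in the same partition.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def publish(records):
    """Append each (key, value) record to the partition chosen by its key."""
    partitions = defaultdict(list)
    for key, value in records:
        partitions[partition_for(key)].append((key, value))
    return partitions


records = [
    ("alice", "created"),
    ("bob", "created"),
    ("alice", "updated"),
    ("bob", "deleted"),
    ("alice", "deleted"),
]

topic = publish(records)
for index in sorted(topic):
    print(index, topic[index])
```

Within any one list, records appear in the order they were published, so all of the `alice` records keep their relative order. Nothing in the model relates the position of a record in one list to the position of a record in another, which mirrors the absence of an ordering guarantee across partitions.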