Read on to find out about the best bits of this year's London Kafka Summit, back as an in-person event.
Kafka Summit 2022 London: a recap
The event that everyone in the Apache Kafka® community was waiting for was back! Kafka Summit 2022 was a blast, with over 1500 people joining at the O2 in east London for a couple of days of workshops, talks and networking. Aiven was there with a booth and a couple of talks, one by our Senior Software Engineer Olena Babenko and one by myself, Francesco Tisiot, Developer Advocate.
The Keynote: Apache Kafka® and the modern data flow
The keynote by Jay Kreps covered how Apache Kafka fits the modern data flow, touching on interesting points about it being streaming (who wants to go back to batch?), decentralised, declarative (hi Apache Flink® SQL!), developer oriented, and governed and observable. It's nice to see how different companies are evolving around the same concepts in different ways, and we at Aiven are no exception.
We've always been big fans of streaming: the days of nightly batch jobs are gone, and nowadays, with Apache Kafka, Kafka Connect and its growing set of connectors, and MirrorMaker 2, developers have everything they need to move their data in streaming mode.
When talking about Decentralization, Jay correctly noted that the times of a single, unique data pipeline within a company are gone: data flows across departments and takes different shapes. It's important to give developers the freedom to choose the best data platform for their needs and to enable fast, reliable connections between services. Need a relational database? Pick PostgreSQL® or MySQL. A time series database? There is M3. Search engines? Why not try OpenSearch®! How to connect services? Pick between linear data pipelines with Kafka Connect or add transformation capabilities with Apache Flink®.
The Declarative concept was spot on in my opinion: developers should aim to write code that describes what they want to achieve, not how to achieve it. SQL is the perfect language for this, enabling a wide range of developers to describe the shape of the output data without worrying about what happens behind the scenes. We are enabling the SQL interface in Aiven for Apache Flink® to allow data practitioners to define streaming pipelines in the language they love.
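To give a taste of what "declarative" means here, a minimal Flink SQL sketch (topic, field names and broker address are all hypothetical) might define a Kafka-backed table and then state the desired output, leaving the how to the engine:

```sql
-- Hypothetical source table backed by a Kafka topic
CREATE TABLE orders (
    order_id   STRING,
    amount     DOUBLE,
    order_time TIMESTAMP(3),
    WATERMARK FOR order_time AS order_time - INTERVAL '5' SECOND
) WITH (
    'connector' = 'kafka',
    'topic' = 'orders',
    'properties.bootstrap.servers' = 'broker:9092',
    'format' = 'json',
    'scan.startup.mode' = 'earliest-offset'
);

-- Declare WHAT we want: total sales per 1-minute window.
-- Flink works out HOW to compute it continuously over the stream.
SELECT
    TUMBLE_START(order_time, INTERVAL '1' MINUTE) AS window_start,
    SUM(amount) AS total_sales
FROM orders
GROUP BY TUMBLE(order_time, INTERVAL '1' MINUTE);
```

Nothing in the query says anything about threads, state stores or checkpoints: that's exactly the point.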
The fourth point Jay mentioned is that tools should be Developer oriented and therefore meet developers where they are. This is basically a slogan for Open Source Software, which developers can embrace without licensing limits and integrate with their favourite tools. What's more, building managed services and pre-packaged integrations means developers spend less time on plumbing, freeing them to focus on building.
The last concept covered during the keynote was about data assets being Governed & observable: in this era of ever-growing data, it's easy to lose control over your assets. Developers should therefore have tools to keep an eye on their data landscape; this is where prebuilt integrations to internal and external services and an accessible console can make the difference. We're cooking up more on this front, so stay tuned!
The inner beauty of tech conferences lies in the talks, and Kafka Summit is no different! An amazing variety of topics, ranging from how Apache Kafka is used in space to the details of how to contribute to Apache Kafka, gave attendees plenty of options to dig deeper into the Kafka world.
We attended several sessions and particularly enjoyed:
- Keep Your Cache Always Fresh with Debezium! by Gunnar Morling, covering how to use Kafka and the Debezium connector to keep a cache fresh, with a detailed example based on Infinispan. It's incredible how much information Gunnar manages to pack into 45 minutes; definitely a session to watch, rewatch and rewatch again to grasp all the bits!
- Bringing Kafka Without Zookeeper Into Production by Colin McCabe, covering ZooKeeper-less Kafka, a topic we'll see more and more of in the future.
- Practical Pipelines: A Houseplant Soil Alerting System with ksqlDB by Danica Fine, showing a real streaming application built to save plants' lives. From the plumbing of IoT devices and a Raspberry Pi, through alerting threshold definitions and streaming calculations, Danica showed a complete end-to-end streaming data pipeline.
Aiven did its part in the show too, with Olena Babenko sharing several tips on how to optimise Apache Flink applications over Kafka. Olena covered important topics such as when to choose Flink vs Kafka connectors, how to manage imbalanced topics, and how to tell whether Flink is behaving well.
Yours truly also gave a talk aimed at understanding the limits of the JDBC connector and how the Debezium connector can save the day, sharing the real story of a friend called Mario.
Our booth was very busy during the event, with lots of people interested in what Aiven is about and our story with Apache Kafka. Also, it looks like our socks and stickers rock, crabby FTW!
The real magic happened towards the end of each session. Questions, interactions, one-to-ones, networking: all the things we missed from in-person conferences were back, enabling people to interact with each other and feel the sense of community that online events can hardly deliver. I had the pleasure of meeting IRL people I've been interacting with for years on social media, and of making connections with interesting new humans.
Meeting people with similar interests (from Kafka to anything else) can create strong and lasting bridges!
Viktor Gamov 🦍: "Setting the fashion trends in #kafkasummit since 2017. At @aiven_io booth" (25 Apr 2022)
Future of Kafka Summit
What does the future hold? As Franz Kafka originally wrote, "things sometimes change shape". This also applies to Kafka Summit: the next conference will be called "Current" and will be open to a broader set of technologies in the data streaming world! The Call for Papers is already open, and we are really looking forward to it!