This post was originally published on my blog.
In one of my previous posts, I discussed the evolution of application architecture from monolithic to service-oriented to microservices and, eventually, to event-driven. I concluded that post with a brief introduction to event-driven architecture (EDA) and promised to dive deeper into the topic in a future post. Well, here we are. In this post, I would like to talk about a type of architecture that you have undoubtedly heard about numerous times in the last few years. Despite having existed for a long time, it has suddenly become extremely popular and a near-necessity for companies pursuing digital transformation.
However, EDA is nothing new. Anyone who has worked at a financial services company (e.g. an investment bank) knows that applications have been using EDA for many, many years. In fact, if you work at a financial services company, you probably think of EDA as the latest buzzword, just like 'big data' a few years ago. But if you were to talk to someone working in a different industry, such as pharmaceuticals or retail, you would get a very different reaction.
The truth is that most of the world is just realizing the importance of event-driven architecture and the need to be real-time.
The importance of being real-time
I really like Delta Airlines. It has a good loyalty program, comfortable seats, good coverage (domestic and international) and a fantastic app.
Any time there is an important event, I am notified in real time. Flight is delayed? Notified! Boarding pass is ready? Notified! Baggage checked in? Notified! My all-time favorite is the notification telling me which carousel my luggage will arrive at!
This is the power and beauty of being real-time. Legacy applications are batch-based whereas modern applications are real-time.
With the rise of smartphones, consumers expect every application to be real-time. These applications must have all the necessary information in real time and must be able to push events to consumers instead of making consumers ask for updates. For example, only a few years ago, banks would send you monthly credit card statements. If there was a fraudulent transaction, you would have a terrible time trying to convince the bank that the $300 bar tab from 25 days ago was not yours. But nowadays, almost every bank provides you with a real-time credit card statement.
From the bank's perspective, being real-time allows them to fight fraud more effectively. They can now analyze their data in real time and identify fraudulent activity much sooner than before.
Being event-driven allows corporations to serve their customers better and to be better prepared for issues. In today's world, there are very few consumer-oriented corporations that haven't already transitioned to event-driven architecture, or aren't at least thinking about it.
What are events?
Before we dive further into details, let’s clarify what an event really is.
An event is, simply, an occurrence or change in the state of a system.
This could be an order placed by a customer on an e-commerce site or a change in the CPU utilization of a system. Events can be generated by internal or external sources such as a mouse, a keyboard, a thermostat, or a sensor on your refrigerator.
It’s important to note the difference between events and event notifications. An event notification is simply a message about the occurrence of an event.
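To make the distinction concrete, here is a small Python sketch (the class and field names are purely illustrative): the event records the state change itself, while the notification is just a message about it.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class OrderPlacedEvent:
    """The event itself: a record of a change in the state of the system."""
    order_id: str
    symbol: str
    quantity: int
    occurred_at: datetime

@dataclass
class EventNotification:
    """A notification *about* the event: a lightweight message telling
    interested parties that the event occurred."""
    event_type: str
    summary: str

event = OrderPlacedEvent("ord-123", "AAPL", 100, datetime.now(timezone.utc))
notification = EventNotification(
    event_type="OrderPlaced",
    summary=f"Order {event.order_id} placed for {event.quantity} x {event.symbol}",
)
print(notification.summary)  # -> Order ord-123 placed for 100 x AAPL
```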
What is event-driven architecture?
Now that we know what an event is, how can we define event-driven architecture? EDA is a type of architecture where events are at the center of the system design.
Such a system is critically focused on:
- communicating,
- capturing,
- processing, and
- persisting/replaying events
In an EDA, you have producers and consumers where producers publish events and consumers are responsible for capturing and processing events.
An application or a service can be both a producer and a consumer.
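As a minimal sketch of this producer/consumer relationship, here is a toy in-memory broker in Python (all names here are invented for illustration; a real event broker adds persistence, ordering guarantees, and network transport):

```python
from collections import defaultdict

class InMemoryBroker:
    """Toy event broker: routes published events to topic subscribers."""
    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, event):
        # The producer does not know (or care) who consumes the event.
        for handler in self._subscribers[topic]:
            handler(event)

broker = InMemoryBroker()
received = []

# Two independent consumers subscribe to the same topic.
broker.subscribe("orders", lambda e: received.append(("billing", e)))
broker.subscribe("orders", lambda e: received.append(("shipping", e)))

# One producer publishes once; every subscriber receives the event.
broker.publish("orders", {"order_id": 1})
print(received)
# -> [('billing', {'order_id': 1}), ('shipping', {'order_id': 1})]
```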
Benefits of event-driven architecture
There are numerous advantages of using EDA.
Loosely-coupled
EDA allows your applications to be loosely coupled, since your producers have the luxury of publishing events without worrying about how they will be consumed and processed. This means your publishers can be written in Java while your consumers are written in Python and C++.
Scaling
Event-driven architecture allows your system to be easily scalable. When you don’t have services directly communicating with each other, you don’t have to worry about writing and managing multiple APIs. This reduces unnecessary complexity and allows your system to scale.
Asynchronous messaging
A critical component of EDA is asynchronous messaging provided by event brokers. Event brokers provide several features, such as advanced topic routing, persistence, guaranteed ordering, zero message loss and replay, that are at the core of what EDA enables you to achieve.
For example, imagine we have a trade order management system (OMS) where, whenever an order is placed, you have to notify the client, report the transaction to the middle office, inform compliance and, finally, notify the P&L team.
In a non-EDA system, you might use synchronous RESTful APIs to communicate between these different services. They are simple and easy to use but, at the end of the day, they make synchronous calls, which means that as your order service makes all of those API calls, it has to wait for each request to finish before it can proceed. This is unacceptable because you can't have your order service depend on other services such as the P&L service. Imagine not being able to send new orders because your P&L service was down!
However, in an asynchronous world, life is much simpler. As soon as an order is placed, an order event is published by your order service to the appropriate topic (e.g. NA/US/clientId/portfolio/order). Any downstream service interested in order events subscribes to the order topic and consumes related events without blocking the original order service. This allows your order service to continue sending new orders without having to worry about downstream services.
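To make the non-blocking behavior concrete, here is a minimal Python sketch (a sketch only: a simple in-process queue and thread stand in for a real event broker, and the slow downstream consumer plays the role of a hypothetical P&L service):

```python
import queue
import threading
import time

events = queue.Queue()   # stands in for the broker's order topic
processed = []

def slow_downstream_consumer():
    """e.g. a P&L service that takes a while to handle each event."""
    while True:
        event = events.get()
        if event is None:        # shutdown signal
            break
        time.sleep(0.05)         # simulate slow processing
        processed.append(event)

consumer = threading.Thread(target=slow_downstream_consumer)
consumer.start()

# The order service publishes and moves on immediately;
# it never waits for the downstream service to finish.
start = time.monotonic()
for order_id in range(5):
    events.put({"order_id": order_id})
publish_time = time.monotonic() - start

events.put(None)
consumer.join()

print(f"published 5 events in {publish_time:.4f}s; "
      f"consumer processed {len(processed)}")
```

Publishing all five events takes a fraction of a millisecond even though the consumer needs 50 ms per event; the order service's throughput is decoupled from its slowest subscriber.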
Smart routing
Event brokers, such as Solace's PubSub+, allow you to send and route asynchronous events using rich hierarchical topics. You can consume events using the exact topic or wildcards, which makes your services extremely flexible. For example, a service can subscribe to all the orders in the US by subscribing to */US/>, or to all the events (not just orders) corresponding to a specific portfolio by subscribing to */*/*/portfolio/>.
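As a rough illustration of how such wildcard subscriptions behave, here is a small Python sketch of a topic matcher. It assumes Solace-style semantics, where * matches exactly one topic level and > matches one or more trailing levels; in practice the broker implements this matching for you.

```python
def matches(subscription: str, topic: str) -> bool:
    """Check whether a hierarchical topic matches a wildcard subscription.
    '*' matches exactly one level; '>' matches one or more trailing levels."""
    sub = subscription.strip("/").split("/")
    top = topic.strip("/").split("/")
    for i, level in enumerate(sub):
        if level == ">":
            return len(top) > i          # at least one level left to match
        if i >= len(top):
            return False                 # topic is shorter than subscription
        if level != "*" and level != top[i]:
            return False                 # literal level mismatch
    return len(sub) == len(top)          # no trailing '>': lengths must agree

order_topic = "NA/US/clientId/portfolio/order"

print(matches("*/US/>", order_topic))              # True: all US events
print(matches("*/*/*/portfolio/>", order_topic))   # True: all portfolio events
print(matches("*/US/>", "EU/UK/clientId/pnl"))     # False: wrong region
```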
Persistence and replay
Additionally, event brokers allow events to be persisted via queues, which means that if your downstream services are offline, the events will not be lost forever. When the services come back online, they will be able to process the queued events.
Modern event brokers, such as Solace's PubSub+, are also capable of replay, so if you have a brand new analytics service interested in old events, it can simply replay them.
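A toy sketch of the idea in Python (the PersistentTopic class is made up for illustration; a real broker persists the log durably and tracks per-consumer positions):

```python
class PersistentTopic:
    """Toy sketch: events are appended to a log, so a consumer that
    comes online late (or is brand new) can replay from any offset."""
    def __init__(self):
        self._log = []

    def publish(self, event):
        self._log.append(event)

    def replay(self, from_offset=0):
        # A new analytics service can start at offset 0 and see history.
        return self._log[from_offset:]

topic = PersistentTopic()
for i in range(3):
    topic.publish({"order_id": i})   # published while consumers are offline

# A consumer that connects later still sees every event...
print(topic.replay())
# -> [{'order_id': 0}, {'order_id': 1}, {'order_id': 2}]

# ...and an existing consumer can resume from where it left off.
print(topic.replay(from_offset=2))
# -> [{'order_id': 2}]
```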
Some challenges with event-driven architecture
There are always some pros and cons to each type of architecture. While EDA can sound tempting, one should carefully evaluate whether their application requires it. You don’t want to over-engineer your small application which only manages a handful of events.
You should also make sure to choose the most appropriate event broker, with the features that are important to your use case. If replay is important to you, then don't pick an event broker that does not support it!
When designing event-driven architecture, it is crucial to spend a good amount of time thinking about topics and queues. You should use rich descriptive hierarchical topics that will be useful down the road.
Finally, events can easily get out of control. You might start with 100 events but soon, as your application grows, they might turn into 10,000 and then, quickly, 1 million events. How do you map the flow of events in your system? Which events are published by which publishers, and which events are consumed by which consumers? And what's the schema for each event? If you are familiar with this problem and are looking for a solution, I recommend taking a look at Solace's PubSub+ Event Portal, which is a single place to design, create, discover, share, secure and manage all the events within your system.
That’s it for this post. I hope this post provided you with a better understanding of event-driven architecture and why you should think about implementing it in your system!
Top comments (4)
Great write-up. We also recently published an article on how to bridge Backend and Data Engineering teams using Event Driven Architecture - packagemain.tech/p/bridging-backen...
Unfortunately, I have to disagree with your previous article on several levels. Back when we were building monolithic applications, deployments were easy. Everything was built, tested and deployed in one go. Errors made the solution fail to compile or fail to pass tests. Anyone could run the solution on their PC in debug mode without additional effort and complex tools (running Linux servers in Docker containers in a WSL Hyper-V container on Windows 10). Boy, I wish half of our deployment problems were not caused by one of our developers lacking the domain knowledge to know that his changes break one of the tens of other services the app is composed of. And nearly 90% unit test coverage and PR code reviews are not helping at all. Microservice architecture requires, in addition to unit testing, thorough integration testing, which usually means a secondary identical environment.
Debugging a monolithic application is a breeze compared to a microservice-based solution. Usually, running the solution in an IDE of your choice in debug mode is enough. Over time, IDEs and their debugging capabilities have evolved into tools that do the job perfectly. Compare this to debugging multiple microservices: stepping in and out of domains, trying to figure out whether the problem is related to out-of-order delivery or required consistency (instead of eventual consistency), or trying to make sense of DTOs that change form in each microservice. IMHO, it's insane.
Complexity has increased dramatically with the step towards microservices. That monolithic application that once ran on an expensive, powerful server would run perfectly fine on today's commodity hardware. It's not that microservice architecture makes things less demanding. It's the opposite. You're now probably running Docker within a VM in a shared hosting datacenter. This adds two additional, unnecessary layers with unpredictable latencies to your solution. Your services all of a sudden perform worse than usual because some other customer of the cloud hosting company started heavy invoice PDF generation and your VM got live-migrated to another cluster host. Good luck debugging that!
Back when you had your own physically rented server and there was a problem, you could fire up loads of tools and pinpoint exactly where the problem was. Nowadays, you mostly deal with cloud provider issues and unnecessary CPU stealing by other customers on the platform. Forget about decent SQL server performance on a local SSD RAID array; deal with NAS storage with unpredictable and heavily varying latencies. You need more database power? Scale horizontally by deploying sharding or data replication, increasing latencies even more!
Once, you had loads of memory for your main database server, which performed well because you designed your infrastructure for it. Now you probably have a database server running in each container, either as a data store, KV store, response cache or message queue store, each running with minimal memory for page cache, forcing it to query the already slow NAS storage even more. Forget about the 20k IOPS you could have with local SSD storage. Your provider said 2k IOPS should be enough for everybody. Should you need more, you can pay as much as the cost of one 1 TB enterprise SSD each month to have double the IOPS! But don't worry, you can always scale horizontally; your company will pay for it! What was once a wealth of resources for future growth in a physical server is now considered the minimum to start with a simple microservice architecture.
To implement asynchronous behavior, your blog suggests Solace's PubSub+. Is there any comparison with other alternatives?
It's tough to compare different solutions without considering the specific problems/requirements you are trying to solve. Solace's PubSub+ is a great solution for enterprises since it comes with all the bells and whistles you would need, and in many cases require due to regulatory requirements, such as high availability, disaster recovery, proper authentication and authorization, etc.
Additionally, there are two types of brokers: smart brokers and dumb brokers. With smart brokers, most of the logic sits in the broker, whereas with dumb brokers you have to implement most of the logic on the publisher and consumer side, which means more work and maintenance for developers and middleware teams. Solace's PubSub+ falls into the smart broker category.
Here is some documentation that compares Solace with Kafka:
solace.com/kafka/
Hope that helps!