Data orchestration is becoming one of those buzzwords salespeople like to throw around in pitch meetings.
It is important to understand that data orchestration is not a single "thing": it describes a whole process of handling data within a system.
I have identified four steps every data orchestration platform should support:
Data has to get into the system one way or another. We all know how it goes: part of the data we need sits in an SQL database somewhere, more comes in via MQTT, and the rest is only available in .txt logs. Data ingestion can be a mess.
The first thing a Data Orchestration platform should be able to do is integrate with all these different sources so it is able to combine the data we need.
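As a minimal sketch of what that could look like: each source gets a small adapter that yields records in a common shape, and the platform merges them into one stream. The sources below are in-memory stand-ins I made up for illustration; a real setup would plug in an SQL driver, an MQTT client and a log-file tailer instead.

```python
def from_sql():
    # Stand-in for rows fetched from an SQL database.
    yield {"source": "sql", "device": "pump-1", "temp": 71.2}

def from_mqtt():
    # Stand-in for messages arriving on an MQTT topic.
    yield {"source": "mqtt", "device": "pump-2", "temp": 68.4}

def from_logs():
    # Stand-in for lines parsed out of .txt log files.
    yield {"source": "logs", "device": "pump-3", "temp": 74.9}

def ingest(*sources):
    """Merge every configured source into one stream of records."""
    for source in sources:
        yield from source()

records = list(ingest(from_sql, from_mqtt, from_logs))
```

The point of the adapter pattern is that adding a fourth source doesn't touch the rest of the pipeline: you write one more generator and pass it to `ingest`.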
Data is rarely in the right format when it enters a system. Therefore we need a way for data to be combined, unpacked or decoded. This step can also entail aggregation, if we don't want to keep all the raw data but only a higher-level abstraction of it.
Now that the data is ready, it is time to make some decisions based on what we ingested. For this, the platform needs some way to create rules and automated workflows. The cool platforms even have the ability to utilize AI or ML models and let those call the shots.
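The rules-and-workflows idea can be sketched as a tiny rule engine: each rule pairs a condition with a named decision, and evaluating a record returns the decisions it triggers. The thresholds and decision names below are hypothetical; in a real platform an ML model could take the place of the hand-written conditions.

```python
# Each rule is (condition, decision name). Thresholds are made up.
rules = [
    (lambda r: r["temp"] > 80, "raise_alarm"),
    (lambda r: r["temp"] < 10, "turn_on_heater"),
]

def decide(record):
    """Return the names of all decisions triggered by one record."""
    return [decision for condition, decision in rules if condition(record)]

decisions = decide({"device": "pump-1", "temp": 85.0})
```

Because rules are plain data here, new logic can be added at runtime without redeploying the pipeline, which is exactly the kind of flexibility these platforms sell.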
Lastly, it is time to take action! Based on the data and the earlier decisions, the platform should be able to act both inside and outside its own ecosystem. This could range from raising alarms to switching a physical device on or off, or sending quarterly figures to a specific list of employees.
The best data orchestration platforms are flexible in each of these steps, which makes them generally applicable solutions. They let the developer integrate the system easily with other platforms and provide a way to quickly define new logic. But that is a topic for a whole other article.
Want an example of a data orchestration platform that does these things right? Check out Waylay IO.