System integrations happen more often than not, for reasons ranging from choosing the best system for the purpose, to company growth, to simply having a legacy system to take care of.
I usually start by understanding what data is missing and what reasons make us believe that we need such data.
This helps me to understand the criticality of this data, and many times to re-think if we need that data as part of MVP (Minimum Viable Product) for the next iteration.
Then if the data remains important, I start to think about:
- Volumes
- When Data is Needed
- Tools Assessment
- Which Team(s) Are Involved and Responsible
How much data is there? This helps me understand whether the target system can store the whole data set. If it can't, it is better to consider alternatives (such as data virtualization) and avoid data replication techniques.
How often does the data change? This makes me think about the mode of the integration: if changes are very frequent, a realtime call might be better; at the other end of the spectrum, a daily or weekly batch should handle it.
How many requests will there be? This helps me understand whether the source system can handle the expected volume of requests. In legacy systems that is often not the case, and alternatives like caching are required.
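As a sketch of that caching idea, a small time-to-live cache can shield a legacy source from repeated identical requests. The `fetch_customer` function and the 60-second staleness tolerance are assumptions for illustration:

```python
import time

def ttl_cache(ttl_seconds):
    """Cache results for ttl_seconds to protect a slow or fragile source system."""
    def decorator(fn):
        cache = {}  # args -> (expires_at, value)
        def wrapper(*args):
            now = time.monotonic()
            hit = cache.get(args)
            if hit and hit[0] > now:
                return hit[1]          # still fresh: skip the call to the source
            value = fn(*args)          # miss or expired: hit the source once
            cache[args] = (now + ttl_seconds, value)
            return value
        return wrapper
    return decorator

@ttl_cache(ttl_seconds=60)  # hypothetical tolerance: 60s of staleness is acceptable
def fetch_customer(customer_id):
    # Placeholder for the expensive call to the legacy system.
    return {"id": customer_id, "fetched_at": time.monotonic()}
```

For a single-process setup, Python's built-in `functools.lru_cache` is a similar (size-bound rather than time-bound) alternative; distributed setups typically reach for an external cache instead.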
What is the point in the process that this data is required?
This helps me figure out the lead time between the data's creation or update and its usage as input.
That matters for weighing how important freshness is, because we can often use slightly outdated data without disruption.
It also helps me understand whether the data is a critical input for the process or merely useful, and therefore whether the process can degrade gracefully without it.
Trust me, the source system will become unavailable at some point (downtime, latency spikes, and so on). So it is important to plan for that, through capacity planning and/or techniques such as caching.
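One way to sketch graceful degradation is to serve the last successfully fetched value when the live call fails. The `get_with_fallback` helper and its in-memory store are assumptions for illustration, not a prescribed design:

```python
_last_known = {}  # key -> last value fetched successfully from the source

def get_with_fallback(key, fetch, default=None):
    """Try the live source; on failure, degrade to the last known value, then a default."""
    try:
        value = fetch(key)
    except Exception:
        # Source is down or slow: fall back instead of failing the whole process.
        return _last_known.get(key, default)
    _last_known[key] = value
    return value
```

Whether this is acceptable depends on the earlier criticality question: a "useful" input can tolerate a stale fallback, while a critical input may need capacity planning on the source side instead.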
In this first part, I’ve covered volumes and when the data is needed. In the next part, I will cover the assessment of tools and teams.