Integration is one of the hot topics for a business in a microservices world. In order to scale, we've decomposed monoliths into smaller, discrete applications that manage their own data. We're creating event streams, and the data we're producing isn't in a single, central database anymore - it's in many different systems, in many different forms. We're also mixing our custom software with SaaS and other applications that give us capabilities without requiring us to build, own and operate all the software we need.
We therefore need technology that allows us to perform integration, giving us the capability to ingest and move data between boundaries, flowing data from one system to another, changing shape and enriching it. We need to turn disparate and distributed data into something we can understand, bringing information back into Data Lakes where we can analyse and produce joined up reports that can give us a single, coherent view over multiple systems.
To do this, there are some very common patterns and steps you'll need to perform: get data from a source system, transform it into a new shape, split it into chunks, branch conditionally depending on state, then generate a payload and send it to a target system.
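Before reaching for any specific tooling, it's worth seeing how small these steps really are. Below is a minimal sketch of that fetch-transform-chunk-dispatch shape in plain Python - the record fields and batch size are illustrative only, not part of the real requirements.

```python
import json
from typing import Iterator

def transform(record: dict) -> dict:
    """Reshape a source record into the target system's format (illustrative fields)."""
    return {"id": record["customerId"], "name": record["name"].strip().title()}

def chunk(records: list, size: int) -> Iterator[list]:
    """Split a large result set into smaller batches for fan-out."""
    for i in range(0, len(records), size):
        yield records[i:i + size]

def run(source_records: list, batch_size: int = 2) -> list:
    payloads = []
    for batch in chunk([transform(r) for r in source_records], batch_size):
        # Conditional step: only dispatch non-empty batches to the target system.
        if batch:
            payloads.append(json.dumps(batch))
    return payloads

payloads = run([
    {"customerId": 1, "name": " alice "},
    {"customerId": 2, "name": "bob"},
    {"customerId": 3, "name": "carol"},
])
```

Each of those functions maps onto a workflow action or control construct once we move into Logic Apps - the value of the platform is in the orchestration, retries and connectors around them, not the steps themselves.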
Logic Apps Standard is part of the Azure Integration Services suite and allows us to perform complex integrations in Azure. A logic application is triggered by an event, and will then perform a sequence of actions through to completion - this sequence is our workflow.
As the team building the workflow, it's up to you to chain together the actions that perform the work needed - you can build a sequence of decisions and other branches that perform conditional work based on various criteria. You can then orchestrate workflows, having multiple workflows calling each other - a useful approach that allows you to decompose a large business sequence into smaller, easier to understand chunks.
Logic Apps Standard introduces some important changes, and the runtime is now built as an extension on top of Azure Functions. This really opens up the hosting possibilities - previously, logic apps were ARM definitions and Azure-only deployments. In Standard, you now have a VS Code extension for designing the flows, can run and test them locally, and can package and deploy them as you would any other piece of software you create. If you containerise, you can host and run wherever you like.
We're going to look at what might be a typical real world use-case for a business.
We want to integrate with a third-party API and perform some data retrieval - we need to import a minimum of 100,000 rows of data. This is a once-daily batch operation.
We're going to read API data, transform from XML to JSON, persist some state and raise service bus events.
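The XML-to-JSON step is easy to reason about in isolation. Here's a sketch using only the Python standard library - the element names ("order", "id", "total") are stand-ins, not the real third-party schema.

```python
import json
import xml.etree.ElementTree as ET

def xml_to_json(xml_payload: str) -> str:
    """Flatten a simple XML document into a JSON string (illustrative schema)."""
    root = ET.fromstring(xml_payload)
    orders = [
        {"id": o.findtext("id"), "total": float(o.findtext("total"))}
        for o in root.findall("order")
    ]
    return json.dumps({"orders": orders})

result = xml_to_json(
    "<orders><order><id>42</id><total>9.99</total></order></orders>"
)
```

In the actual workflow this shape change is handled declaratively, but having a mental model of the transformation makes the workflow definition much easier to review.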
The third party requires that requests come from a static IP address which they can allow-list.
All PaaS resources must be VNET restricted. We're going to interact with Storage, Service Bus, Key Vault and Cosmos.
We need to do this in approximately 20 minutes, so will set ourselves an NFR of 100 requests per second throughput. In order to fan out, we need to consider splitting the data into chunks and performing updates and notifications in smaller batches than the payload we initially receive.
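The arithmetic behind that NFR is worth making explicit: 100,000 rows across a 20-minute window works out at roughly 83 rows per second, so 100 rps gives us some headroom. The batch size below is an assumed tuning knob, not a stated requirement.

```python
import math

# Figures from the requirements.
TOTAL_ROWS = 100_000
WINDOW_SECONDS = 20 * 60                      # once-daily 20 minute batch window

required_rps = TOTAL_ROWS / WINDOW_SECONDS    # ~83.3 rows/sec, so 100 rps has headroom

# Assumed chunk size for fan-out; smaller batches mean more parallel workers.
BATCH_SIZE = 500
batches = math.ceil(TOTAL_ROWS / BATCH_SIZE)  # number of batches to process
```

Numbers like these also feed directly into later choices - queue partitioning, workflow concurrency settings and the size of the App Service plan.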
Let's look at what these requirements might translate to as a Physical Infrastructure diagram.
To produce a static IP, we need to provision a Public IP address and a NAT Gateway. We then need to route all outbound traffic from our logic application through the VNET that is associated with the NAT gateway. A good article that goes into more detail on this is 'note to self - azure functions with a static outbound ip' - it's the same process for Logic Apps.
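In Terraform terms, that static outbound IP plumbing is a handful of resources. This is a sketch assuming an existing resource group and workflow subnet - all names here are illustrative.

```terraform
resource "azurerm_public_ip" "outbound" {
  name                = "pip-integration-outbound"
  resource_group_name = azurerm_resource_group.main.name
  location            = azurerm_resource_group.main.location
  allocation_method   = "Static"
  sku                 = "Standard"
}

resource "azurerm_nat_gateway" "main" {
  name                = "nat-integration"
  resource_group_name = azurerm_resource_group.main.name
  location            = azurerm_resource_group.main.location
  sku_name            = "Standard"
}

resource "azurerm_nat_gateway_public_ip_association" "main" {
  nat_gateway_id       = azurerm_nat_gateway.main.id
  public_ip_address_id = azurerm_public_ip.outbound.id
}

# Outbound traffic from the workflow subnet now egresses via the static IP.
resource "azurerm_subnet_nat_gateway_association" "workflow" {
  subnet_id      = azurerm_subnet.workflow.id
  nat_gateway_id = azurerm_nat_gateway.main.id
}
```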
Since resources are VNET restricted, any support access to the PaaS components needs to be considered - in this example we solve the problem with Azure Bastion and an Azure VM in the same VNET. Engineers would need to connect to the VM via Bastion, before accessing whatever resources they require. Within the VNET, we'll therefore create a number of subnets, for the logic workflow and for management operations. Finally, we can take advantage of Service Endpoints for each of the PaaS resources, which fits nicely for this serverless and 100% Azure implementation.
The requirement for VNET configuration forces us onto the premium versions of certain SKUs - the App Service plan that runs the logic app and Service Bus are prime examples, where VNET support is only available on the premium offering.
Logic Apps Standard does allow for a low-code approach, but we're going to put some discipline behind this and build a full CI/CD Azure DevOps pipeline. This allows us to produce a build artifact from a CI job that contains everything we need to provision, deploy and test the application, then use that artifact in a multi-stage pipeline.
To do so, we'll need to generate a zip file with all the components needed to deploy to Azure.
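Packaging is mundane but worth automating early. The sketch below builds a zip from a project folder - it assumes the typical Logic Apps Standard layout of a host.json at the root and one folder per workflow containing its workflow.json, so the file names are illustrative.

```python
import tempfile
import zipfile
from pathlib import Path

def package(project_dir: Path, artifact: Path) -> list:
    """Zip every file under project_dir, preserving relative paths."""
    names = []
    with zipfile.ZipFile(artifact, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(project_dir.rglob("*")):
            if path.is_file():
                arcname = str(path.relative_to(project_dir))
                zf.write(path, arcname)
                names.append(arcname)
    return names

# Example: lay out a minimal project and package it.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "logicapp"
    (root / "import-orders").mkdir(parents=True)
    (root / "host.json").write_text("{}")
    (root / "import-orders" / "workflow.json").write_text("{}")
    packaged = package(root, Path(tmp) / "deploy.zip")
```

In the real pipeline this would be a build step producing the artifact that later stages zip-deploy into the provisioned app.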
This gives us a fully automated means to provision the physical infrastructure, deploy the workflows and all required dependencies, and test that the workflows are in a good state and doing what we want them to.
One of the advantages of Logic Apps Standard is that you can build your own integration connectors and use them within your logic flows. You can produce a NuGet package that contains the connector code and reference it from your applications.
Logic Apps have 'hosted connectors' that you can make use of, which run out-of-process from your logic application. Whilst these are good for a number of use cases, they do come with limitations, such as the number of requests allowed per minute or maximum payload sizes. If you have a high-throughput application and you're running it on your own infrastructure, it makes sense to use the 'Built In' connector approach, which allows you to run completely on your own App Service plan without any imposed limits (other than those of the underlying infrastructure).
One of the requirements for the demo application for this article was to persist some state in Cosmos DB, and at the time of writing there was no built-in connector for Cosmos DB. We therefore wrote our own connector, which we've made available as ASOS open source. As well as being a useful connector, it should demonstrate how to build and test any custom connector you could want to design.
We need to build the infrastructure where we'll deploy and run the logic workflows - if you're working in Azure then you'll have a number of choices for representing physical infrastructure as source code. ARM templates, Azure Bicep, Ansible and Terraform are all options; for this example we're going to use Terraform.
One issue you may encounter with Terraform is that it uses the Azure SDK for Go - it often takes some time for a new Azure API to appear in the Go SDK and then be implemented as a Terraform resource (indeed, day-0 support is one of the strengths of Bicep). This means that certain bleeding-edge features might not be available, and at the time of writing there is no Terraform resource for Logic Apps Standard.
To solve this problem, we'll wrap an ARM template that creates the logic app in a Terraform module, allowing us to provision the application in a state ready to be deployed to. Mixing ARM with Terraform is an acceptable approach until a native module is made available. I find the AzureRM provider for Terraform pretty easy to understand and have made a few contributions when a feature hasn't been immediately available to me - it's open source, dive in and build!
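The wrapping itself is a single resource in the AzureRM provider. This is a sketch only - the template file and parameter names below are illustrative, and the real parameters depend on what your ARM template declares.

```terraform
resource "azurerm_resource_group_template_deployment" "logic_app" {
  name                = "logic-app-standard"
  resource_group_name = azurerm_resource_group.main.name
  deployment_mode     = "Incremental"

  # The ARM template that creates the Logic App Standard site.
  template_content = file("${path.module}/logicapp.json")

  parameters_content = jsonencode({
    logicAppName   = { value = "logic-integration" }
    appServicePlan = { value = azurerm_service_plan.main.id }
  })
}
```

Keeping the ARM template inside a Terraform module means the rest of the codebase only ever sees Terraform inputs and outputs, so swapping in a native resource later is a contained change.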
The entire infrastructure provisioning process should be handled by Terraform code - we'll provision the infrastructure, then zip-deploy our application into what we've created. This should be a modular, easily repeatable process that we can reuse for any other integrations we care to build.
An important consideration before we begin - how are we going to test the workflows? One of the advantages of Logic Apps is that they promote a declarative approach that doesn't require writing code; however, that means we're not going to be using unit testing to help design the application. We need to think about what our test boundaries are within the overall logic app. Let's look at some more traditional test boundaries in a Model -> View -> Controller application.
Our requirements for the application are to retrieve data from a third party, transform data, persist into Cosmos and transmit events to Service Bus. We could treat the entire integration as a black box and just test at the boundaries - trigger the processing to start and assert that messages appear on Service Bus (an end-to-end test). While these have value, they slow down your feedback loop, make assertions more difficult, and are more prone to errors. With data storage and Service Bus messaging in the mix, you'll need to consider how to isolate data so that concurrent test executions don't interfere with each other.
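One simple isolation pattern is to tag every message a test run produces with a unique correlation id, then only assert on messages carrying that id. A sketch of the idea - the "testRunId" field name is an assumption for illustration, not part of the demo application's contract.

```python
import uuid

# Each end-to-end test run gets its own id, so concurrent runs can
# share the same Service Bus topic without tripping over each other.
run_id = str(uuid.uuid4())

def tag(message: dict) -> dict:
    """Stamp an outgoing test message with this run's correlation id."""
    return {**message, "testRunId": run_id}

def belongs_to_this_run(message: dict) -> bool:
    """Filter received messages down to those produced by this run."""
    return message.get("testRunId") == run_id

received = [tag({"orderId": 1}), {"orderId": 2, "testRunId": "some-other-run"}]
mine = [m for m in received if belongs_to_this_run(m)]
```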
Ideally, since the workflows are run as an extension on top of Azure functions, I'd create an instance of the process during the build pipeline, mock out all the dependencies and run the whole thing in memory, black box testing in the build pipeline before a deployment takes place. I want to test my application business logic at build time, not necessarily interactions with the underlying transports.
Unfortunately, this isn't possible at the time of writing - improved test support is something Microsoft is addressing as the offering matures. Until then, we should consider:
- What parts of the workflows can be unit tested? If I'm performing transformations using Liquid templates, can I unit test just the transformations?
- If my overall logic application is composed of multiple workflows, can I test individual workflows in isolation without requiring the whole end to end piece? Does that offer value?
- Can I test in the build pipeline, or should we deploy then test? Should we do both?!
In fact, testing the workflows is worth an article by itself - see Part 3, 'Testing Logic Application Workflows'.
That's it for this article, which is just an introduction to how we might go about designing the solution and some of the moving parts we need to consider that will allow us to build, deploy and test the applications.
We've turned business requirements into infrastructure and made some technology choices that allow us to build and deploy, and we're ready to start the implementation - which we'll look at next. Check back for the follow-up articles, which will cover these topics:
- Part 2 - Build Pipelines and Provisioning
- Part 3 - Testing Logic Application Workflows
- Part 4 - Designing for scale out and throughput
- Part 5 - Operating and Observability
Finally, we'll produce the demo application and make that open source, so you can deploy to your own Azure subscriptions.