Durable Functions is a great serverless choice for stateful applications. If we want to develop our own system, we can also learn from it how to use queues and tables to build a loosely coupled stateful system.
Durable Functions depends on the Durable Task Framework, whose source code is available on GitHub:
Durable Functions Extension
Durable Task
I use Azure Storage as the storage provider in this article.
Use Queues to chain functions
Durable Functions uses normal Azure Functions triggers to start a client function, which in turn triggers orchestrator functions and/or entity functions. This is a great design because we can use any supported trigger to integrate with Durable Functions.
Then we can call the IDurableOrchestrationClient.StartNewAsync method to invoke an orchestrator function. The method doesn't call the function directly; instead, it adds a queue item to the control queue.
See TaskHubQueue.cs: AddMessageAsync, which calls storageQueue.AddMessageAsync to add the queue item.
The same mechanism is used to trigger activity functions, but through another queue called the work item queue.
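To make the chaining idea concrete, here is a minimal in-memory sketch in Python. It is not the Durable Task implementation: the two `queue.Queue` objects, the message shapes, and the function names (`start_new`, `run_orchestrator_once`, `run_activity_once`) are all assumptions made for illustration. The only point is that the client and the orchestrator never call the next function directly; they only enqueue a message.

```python
import json
import queue

# Hypothetical in-memory stand-ins for the control queue and the work item queue.
control_queue = queue.Queue()
work_item_queue = queue.Queue()


def start_new(orchestrator_name, input_data):
    """Client side: enqueue a start message instead of calling the orchestrator."""
    message = {"type": "ExecutionStarted", "name": orchestrator_name, "input": input_data}
    control_queue.put(json.dumps(message))


def run_orchestrator_once():
    """Orchestrator side: read one control message and schedule an activity."""
    message = json.loads(control_queue.get_nowait())
    task = {"type": "TaskScheduled", "name": "SayHello", "input": message["input"]}
    work_item_queue.put(json.dumps(task))


def run_activity_once():
    """Activity side: read one work item and do the actual work."""
    task = json.loads(work_item_queue.get_nowait())
    return f"Hello {task['input']}!"


start_new("HelloOrchestrator", "Tokyo")
run_orchestrator_once()
result = run_activity_once()
print(result)
```

Because each step only knows about a queue, any of the three parts could run in a different process or on a different machine, which is what makes the design loosely coupled.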
Use dispatcher to monitor queue
Durable Functions starts TaskHubWorker at startup, which starts OrchestrationService. This service has a mechanism to monitor the queues.
DurableTaskExtension.cs: StartTaskHubWorkerIfNotStartedAsync calls TaskHubWorker.cs: StartAsync, which eventually calls GetMessagesAsync to get queue items from the queue.
See ControlQueue.cs: GetMessagesAsync for the detailed code.
Durable Task also has separate classes to do the same for activity functions and entity functions.
Cost and performance
As the queue-polling section describes, polling a queue incurs costs, so you don't want to poll often when there is no item in the queue. However, you need to poll often enough to meet your performance requirements. Durable Functions uses a back-off algorithm to balance cost and performance by using the UpdateDelay method.
See BackoffPollingHelper.cs: UpdateDelay for the detailed algorithm.
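The general idea of back-off polling can be sketched as follows. This is a plain exponential back-off written for illustration, not a port of BackoffPollingHelper.cs: the class name `BackoffPoller`, the doubling rule, and the default limits of 1 and 30 seconds are all assumptions.

```python
class BackoffPoller:
    """Illustrative exponential back-off; the real algorithm may differ."""

    def __init__(self, min_delay=1.0, max_delay=30.0):
        self.min_delay = min_delay
        self.max_delay = max_delay
        self.empty_polls = 0  # consecutive polls that returned no message

    def update_delay(self, got_message):
        """Return how many seconds to wait before the next poll."""
        if got_message:
            # The queue is busy: reset and poll again immediately.
            self.empty_polls = 0
            return 0.0
        # The queue is idle: double the wait each time, up to max_delay.
        self.empty_polls += 1
        delay = self.min_delay * (2 ** (self.empty_polls - 1))
        return min(delay, self.max_delay)


poller = BackoffPoller()
delays = [poller.update_delay(False) for _ in range(6)]
print(delays)
```

With these assumed defaults, an idle queue is polled less and less often until the wait reaches the cap, and a single received message brings the poller back to immediate polling.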
Use table to store history
Durable Functions uses Azure Storage tables to track execution history, which contains not only when each function was scheduled and executed, but also its input and output data.
AzureTableTrackingStore.cs has the related methods to insert and update data in storage.
For example, AzureStorageOrchestrationService.cs: CreateTaskOrchestrationAsync first gets a queue item from the control queue, invokes the orchestrator function, and then logs the history to the table by using trackingStore.
Durable Functions uses the History table and the Instances table to persist the execution history and the latest instance information.
Keeping the latest information separate from the history is useful when you want to query the current orchestration status, but you need to keep both tables in sync.
Read the DurableTask.AzureStorage README to fully understand the details.
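The split between an append-only history and a "latest status" record can be sketched like this. The two in-memory structures, the `record_event` helper, and the column names are hypothetical and do not reflect the real table schema; they only show why both writes have to happen together.

```python
from collections import defaultdict
from datetime import datetime, timezone

# Hypothetical in-memory stand-ins for the two tables.
history_table = defaultdict(list)  # instance_id -> ordered list of events (append-only)
instances_table = {}               # instance_id -> latest status only

# Assumed mapping from history events to the status shown for the instance.
STATUS_BY_EVENT = {"ExecutionStarted": "Running", "ExecutionCompleted": "Completed"}


def record_event(instance_id, event_type, payload=None):
    """Append to the history and keep the instance row in sync with it."""
    now = datetime.now(timezone.utc).isoformat()
    history_table[instance_id].append(
        {"type": event_type, "payload": payload, "timestamp": now}
    )
    status = STATUS_BY_EVENT.get(event_type)
    if status is not None:
        row = instances_table.setdefault(instance_id, {})
        row["status"] = status
        row["last_updated"] = now
        if event_type == "ExecutionCompleted":
            row["output"] = payload


record_event("abc", "ExecutionStarted", "Tokyo")
record_event("abc", "TaskScheduled", "Tokyo")
record_event("abc", "TaskCompleted", "Hello Tokyo!")
record_event("abc", "ExecutionCompleted", "Hello Tokyo!")
print(instances_table["abc"]["status"], len(history_table["abc"]))
```

A status query only needs to read one small row from `instances_table`, while the full sequence of events stays available in `history_table`.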
Orchestration performance
There are several things you need to know about how orchestrator functions affect performance.
Reading history table
Durable Functions uses an orchestrator function as a workflow, which usually calls multiple activity functions to do the actual work. Because Azure Functions is a stateless service, an orchestrator function always restarts from the first line of the method whenever it is invoked.
The orchestrator function reads its history to determine which activities have already executed and what their output values were, so that it can continue from the previous step.
This means that the orchestrator function needs to load its entire history every time, which may hurt overall performance as the history grows.
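The replay behavior described above can be sketched in a few lines of Python. This is a simplified model, not how the framework is implemented: the `ReplayContext` class, the `ActivityPending` exception used to stop execution, and the use of a plain list of results as "history" are assumptions for illustration.

```python
class ActivityPending(Exception):
    """Raised to stop the orchestrator until the scheduled activity finishes."""


class ReplayContext:
    def __init__(self, history):
        self.completed = list(history)  # outputs of finished activities, in order
        self.position = 0               # how many results have been replayed so far
        self.scheduled = []             # activities newly scheduled in this run

    def call_activity(self, name, arg):
        if self.position < len(self.completed):
            # Already executed in an earlier run: return the recorded output.
            result = self.completed[self.position]
            self.position += 1
            return result
        # Not executed yet: schedule it and stop this run.
        self.scheduled.append((name, arg))
        raise ActivityPending(name)


def orchestrator(ctx):
    # Always starts from the first line, no matter how far the workflow got.
    first = ctx.call_activity("SayHello", "Tokyo")
    second = ctx.call_activity("SayHello", "Seattle")
    return [first, second]


def run(history):
    ctx = ReplayContext(history)
    try:
        return "Completed", orchestrator(ctx)
    except ActivityPending:
        return "Pending", ctx.scheduled


print(run([]))
print(run(["Hello Tokyo!"]))
print(run(["Hello Tokyo!", "Hello Seattle!"]))
```

Each call to `run` re-executes the orchestrator from the top, and every run has to walk through all recorded results before it reaches new work, which is why a long history makes each invocation more expensive.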
Queue design
Durable Functions explicitly separates the queues for orchestrations and activities because these functions handle different types of workload. An orchestrator function manages the workflow and should be lightweight, whereas an activity function runs the actual task, which could be I/O-intensive and/or CPU-intensive code.
Durable Functions uses multiple control queues (for orchestrator functions) and one work item queue. The number of control queues is defined as the partition count in configuration. See the diagram below, which shows three partitions as an example.
To fully utilize this architecture, you should keep each orchestration workflow as simple as possible and use sub-orchestrations when you need to run a large workflow, which is the same concept as parent-child workflows.
Durable Functions also uses task hubs as logical containers for functions and storage resources, but I won't explain them in this article.
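One way to picture the control queue partitions is a stable hash of the instance ID that selects one of the queues. The sketch below is an assumption-heavy illustration: the `control_queue_for` helper is hypothetical, `zlib.crc32` is only a stand-in for whatever hash the framework really uses, and the queue name format and the `myhub` task hub name are made up for the example.

```python
import zlib


def control_queue_for(instance_id, task_hub="myhub", partition_count=3):
    """Map an orchestration instance to one control queue (illustrative only)."""
    # A stable hash keeps every message for one instance on the same queue.
    index = zlib.crc32(instance_id.encode("utf-8")) % partition_count
    return f"{task_hub}-control-{index:02d}"


for instance_id in ["order-1", "order-2", "order-3", "order-4"]:
    print(instance_id, "->", control_queue_for(instance_id))
```

Since all messages for one instance land on the same queue, a single worker can own that instance's workflow state, while different instances spread across the partitions.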
Did I cover everything?
Of course not. Durable Functions and Durable Task have many more features to support enterprise-scale applications and systems. I have only explained some of the concepts, but I believe this still provides some value. Visit each GitHub repository or read the official docs for more information.
Summary
What can we learn from the Durable Functions architecture?
The Durable Functions architecture is a great example of a cloud-native application. Such an application usually has multiple small components that are loosely coupled. We often use Service Bus or Storage queues to connect them, and Cosmos DB, Storage tables, or SQL Server to persist data. If you need to design a complex system, I believe the Durable Functions architecture may give you some hints.