Pulumi in Action: Beyond Terraform to Build HA Web Apps on AKS

**Disclaimer:** This article is a bit lengthy and covers multiple topics as part of my journey exploring IaC tools like Pulumi. It walks through the whys, whats, and hows of IaC and Pulumi, and then shows how to provision the infrastructure and deploy a highly available web app across multiple AKS regions using Pulumi!

In 2018, I wrote my first C# program that generated an ARM template for an Azure Application Gateway as part of a larger system we were migrating. Everyone was satisfied with ARM alone at first, but the simplicity quickly turned into complexity as we scaled to 6 environments and over 100 endpoints in my Application Gateway alone, juggling countless complex JSON files and striving for consistency with every change.

After a few attempts at building C# apps that generate ARM templates, I realized I needed to properly evaluate both Terraform and Pulumi. I knew Terraform but not Pulumi! At the time it was pretty new, and from what I read online it was like Terraform but in whatever programming language you wish. Yet with so many different ways of automating your infrastructure, and putting any deliverables aside, why did we end up with these tools in the first place? Please allow me to take a step back and walk you through a quick recap of Infrastructure as Code first!

What is Infrastructure as Code (IaC)?

IaC is a key practice in DevOps that involves managing and provisioning computing infrastructure through machine-readable definition files.

In IaC, your infrastructure is defined using code, typically in a high-level language or a domain-specific language. This code describes the desired state of your infrastructure, including servers, networks, storage devices, and other resources.

Every cloud provider has its own scripting-like language for provisioning resources.

For Azure, Bicep is the natural evolution of ARM, provided and maintained by Microsoft, and you can use it to provision Azure resources. A simple Bicep script to provision an Application Gateway with many endpoints would look like the snippet below, which resolves many of my challenges with ARM alone. Isn't that sufficient? That leads to the next big question as well!

resource applicationGateway 'Microsoft.Network/applicationGateways@2020-06-01' = {
  name: 'myAppGateway'
  location: resourceGroup().location
  properties: {
    backendAddressPools: [
      {
        name: 'backendPool'
        properties: {
          backendAddresses: [
            {
              fqdn: 'www.example1.com'
            }
            {
              fqdn: 'www.example2.com'
            }
            // ... add more endpoints ...
          ]
        }
      }
    ]
    // ... other necessary configurations ...
  }
}

We Are Not Multi-Cloud by Design: Could We Still Need It?

What if this project is Azure-based now and will stay that way? Why would we need a multi-cloud tool when we will never need AWS or GCP on this project, and when tracking the state of deployments is not always needed?

Yet these tools do not only provide multi-cloud capabilities; they also provide a way to automate your infrastructure. State management, which keeps the history of each deployment so you can roll back changes when needed, is a great feature. While you can achieve something similar with Bicep through Git history, that approach makes you more dependent on your DevOps pipeline.

In addition, in many scenarios these infrastructure resources integrate with third-party tools in a dynamic fashion. Imagine, for example, that you are provisioning a web app on Azure, and after each deployment you need to take the hostname generated by the App Service and register it in Cloudflare. The Pulumi TypeScript snippet below does just that easily; Bicep can't compete with that.

import * as pulumi from "@pulumi/pulumi";
import * as azure from "@pulumi/azure";
import * as cloudflare from "@pulumi/cloudflare";

// Create an Azure App Service
const appService = new azure.appservice.AppService("myAppService", {
    // ... configuration ...
});

// Use the output of the Azure App Service to create a DNS record in Cloudflare
const dnsRecord = new cloudflare.Record("myDnsRecord", {
    name: "example",
    zoneId: cloudflareZoneId,
    type: "CNAME",
    value: appService.defaultSiteHostname,
});

So if you are happy with the idea of IaC tools so far, great news. Now comes the next question: which tool?!

Terraform vs Pulumi

Terraform presents itself as a big, mature ecosystem for everything multi-cloud. Pulumi comes with the power of real programming languages and the Pulumi Automation API, and Pulumi's Azure Native provider is generated from the Azure ARM API, so the day a new feature ships you find it supported right away, while Terraform providers can need some time from the community to catch up. Overall, though, there are evaluation questions you will need to answer; below are some of them, but certainly not all!

Evaluation questions

  • Do you have a team that manages the pipelines?
  • Who manages the infrastructure?
  • Who owns the keys/approvals to the cloud accounts?
  • How many developers do you have?
  • Do you need multi-cloud support?
  • How steep a learning curve can you afford?
  • Do you have third-party resources that can be provisioned through scripts, and do you want to provision those workloads based on values/parameters generated when your cloud infrastructure is created?
  • Do you need a very stable and large community for the IaC tool of choice?
  • Do you have any regulations that enforce unified standards/languages in IaC tools?

And the list goes on to pricing, DevOps team preferences, support, and so on. Let's assume you have fought all these battles and Pulumi was the choice. Now what?

Now, Let's start with Pulumi!

Pulumi extends its platform to support various resources and implementation styles. You can rely on its Kubernetes providers to manage workloads inside AKS (think Helm charts), integrate it into a larger DevOps pipeline on almost any CI/CD platform, use its policy-as-code framework for compliance and policies, and use its developer portals for templates. Pulumi also has a gallery of resources you can use to get started easily.
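
Since the CI/CD integration point comes up again later, here is a minimal sketch of the Automation API mentioned above, assuming the Pulumi.Automation NuGet package; the stack name and project path are placeholders. It drives the equivalent of pulumi up from plain C#, for example from a custom deployment service:

using System;
using System.Threading.Tasks;
using Pulumi.Automation;

class AutomationExample
{
    static async Task Main()
    {
        // Point the Automation API at an existing Pulumi project on disk and a stack name.
        var args = new LocalProgramArgs("dev", "/path/to/pulumi/project");
        var stack = await LocalWorkspace.CreateOrSelectStackAsync(args);

        // Run the equivalent of "pulumi up" programmatically and report how many resources changed.
        var result = await stack.UpAsync();
        Console.WriteLine($"Resources changed: {result.Summary.ResourceChanges?.Count}");
    }
}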

How it Works & Key Concepts

Pulumi depends on the interaction between your programming language and its CLI/engine, delivered in a SaaS fashion: you can sign up for free for personal usage, and there are other options (a self-hosted engine) for more restricted environments.

(Image: how the Pulumi program, CLI, and state backend interact)

  • Pulumi Program: Your infrastructure is defined in a Pulumi program, which is a project containing one or more files written in your chosen language that specify the resources you want to deploy.
  • Pulumi CLI: The Command Line Interface (CLI) is used to deploy and manage your infrastructure. It interacts with the Pulumi service to execute your Pulumi programs.
  • Pulumi Service/State Management: Pulumi maintains the state of your infrastructure, tracking resource allocations and ensuring that your cloud environment matches the defined state in your code. This state can be stored in the Pulumi Service (SaaS), or self-hosted options like an S3 bucket, Azure Blob Storage, or Google Cloud Storage.
  • Providers: Pulumi interacts with cloud services through providers. Each provider is a plugin that encapsulates the API interactions with a cloud platform (e.g., AWS, Azure, Google Cloud, Kubernetes) and exposes resources as code.
  • Resources: In Pulumi, a resource is a component of your infrastructure, such as a compute instance, storage bucket, or database. Resources are defined in your Pulumi program and managed through providers.
  • Stacks: A stack is an isolated, independently configurable instance of a Pulumi program. You can use stacks to manage different environments (development, staging, production) or different geographic regions with the same codebase.
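
To make the stack concept concrete, here is a minimal sketch (the configuration keys are illustrative and mirror the demo later on): the same C# program reads per-stack values from the active stack's Pulumi.<stack>.yaml file, so a dev stack and a production stack can share code and differ only in configuration.

// Values come from the active stack's config file (e.g. Pulumi.dev.yaml),
// typically written with "pulumi config set clusterCount 2".
var config = new Pulumi.Config();

// Fall back to sensible defaults when the stack doesn't define a value.
var clusterCount = config.GetInt32("clusterCount") ?? 2;
var location = config.Get("clusterLocation1") ?? "EastUS";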

Demo Time!

Now, let's imagine a hypothetical scenario where we need to achieve the below:

Create two AKS clusters in two different regions (East US, West Europe) using managed identity.

Deploy a simple API to both clusters using helm or a simple deployment script.

Provision Azure Front Door to load balance the global DNS-based traffic and HTTP traffic between two Azure regions with identical AKS services deployed.

The architecture should look like the reference image below.

(Image: reference architecture, with Azure Front Door load balancing across two regional AKS clusters)

Prerequisites

  • From any terminal, log in to your Azure subscription and set the active subscription (e.g. az login, then az account set --subscription <subscription-id>).
  • Ensure the Pulumi CLI is installed and working (see Pulumi's installation docs).
  • Create an empty directory and initialize the project with the pulumi new command. It asks you for an initial stack (think of a stack as an environment, though it is really a way to organize your code across different states), and once you specify the language and cloud provider it does the basic scaffolding for you:
  • pulumi new azure-csharp
  • The scaffolding creates an initial C# project (the language of choice here, but you can use Python, TypeScript, or another supported language). The MyStack.cs class contains the entry point for creating the resources (see the sketch below), and Pulumi.dev.yaml holds the configuration values for the dev environment.
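
For reference, here is a minimal sketch of what the generated entry point looked like with this MyStack-style template; the exact scaffolding can differ between Pulumi versions:

// Program.cs - hands control to the Pulumi deployment engine and points it at MyStack.
using System.Threading.Tasks;
using Pulumi;

class Program
{
    static Task<int> Main() => Deployment.RunAsync<MyStack>();
}

// MyStack.cs - resources are declared in the constructor.
class MyStack : Stack
{
    public MyStack()
    {
        // Resource declarations (resource group, AKS clusters, ...) go here.
    }
}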

(Image: the project files generated by pulumi new azure-csharp)

Inside MyStack.cs we start provisioning the AKS clusters using C#. There are two main providers for this; we are using the AzureNative.ContainerService library. Below is a snippet that creates two managed clusters in two different regions. Please note the code is not ideal, as it was written for demo purposes, but the key points are:

  1. We used C# to provision the infrastructure needed.
  2. We can ensure the creation of two identical clusters using the same code here.
  3. We can create multiple environments easily and reference the values needed using the stack files.
var resourceGroup = new ResourceGroup("aksResourceGroup");
var pulumiConfig = new Pulumi.Config();

// Create N managed AKS clusters (2 by default), alternating between the two configured regions.
for (int clusterCount = 0; clusterCount < (pulumiConfig.GetInt32("clusterCount") ?? 2); clusterCount++)
{
    var cluster = new ManagedCluster((pulumiConfig.Get("clustername") ?? "AksCluster") + clusterCount,
        new ManagedClusterArgs
        {
            ResourceGroupName = resourceGroup.Name,
            AgentPoolProfiles = new ManagedClusterAgentPoolProfileArgs
            {
                Count = pulumiConfig.GetInt32("nodecount") ?? 3,
                Mode = "System",
                Name = "agentpool",
                OsDiskSizeGB = 30,
                OsType = pulumiConfig.Get("ostype") ?? "Linux",
                VmSize = pulumiConfig.Get("vmsize") ?? "Standard_DS2_v2"
            },
            DnsPrefix = (pulumiConfig.Get("clustername") ?? "AksCluster") + clusterCount,
            Identity = new ManagedClusterIdentityArgs
            {
                Type = ResourceIdentityType.SystemAssigned,
            },
            Location = clusterCount == 0
                ? (pulumiConfig.Get("clusterLocation1") ?? "EastUS")
                : (pulumiConfig.Get("clusterLocation2") ?? "WestEurope")
        });

    // (the kubeconfig retrieval and app deployment in the later snippets also live inside this loop)

If everything went okay up to this point, all you need to do is run

pulumi up

The code will compile and show you the resources that will be created on that stack, as in the image below. Note that I had already run the code once and had named the clusters wrongly the first time; once I fixed the names, Pulumi noticed the change, and because an AKS cluster's name cannot be changed after creation, it simply deletes the old AKS resources and recreates them with the new names.

(Image: pulumi up preview showing the renamed clusters being replaced)
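
On a related note, and purely as a hedged sketch (the demo does not do this), Pulumi's resource options can mark a resource as protected, so an accidental rename or destroy fails the update instead of deleting the cluster:

// A protected resource makes "pulumi up" / "pulumi destroy" fail instead of deleting it,
// until the protection flag is removed.
var protectedCluster = new ManagedCluster("aksClusterProtected", new ManagedClusterArgs
{
    // ... same arguments as in the snippet above ...
}, new CustomResourceOptions { Protect = true });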

So far so good: two managed AKS clusters that I can provision in the morning and destroy at night to save my Azure credit. Not bad as a start!

One key benefit of using C# for this task is being able to deploy my app from the same solution (or even the same project; it depends on what you are deploying and how, which raises a lot of questions of its own before you can decide on the best structure).

Now, I need a deployable Helm chart/service that shows a static page with a simple message like "I am on a regional USA journey" on the first cluster, and another static page indicating the other region, "Hello from Europe" or something similar. I found a reference for such a deployable package on GitHub. If we treat that as the app we deploy into AKS, we need the following steps to integrate it with our Pulumi code:

  • Get a reference of KubeConfig that would be generated for each cluster after the provisioning. (you can alternatively reference it from a YAML file on disk too)
  • Use the Pulumi Kubernetes provider to set the kube config for the new provider object.
  • Use Pulumi to deploy the service into each cluster and expose it to the outside world via a load balancer (obtaining a public IP). Ideally this would be done through an Ingress, but to keep the POC simple I chose to expose the service via the AKS load balancer.

Because the Azure Native provider changes from version to version, I couldn't get this working on the first try using online snippets. But since the world now has ChatGPT, I used it to adjust my part of the code until I reached the snippet below, which retrieves the kubeconfig after the cluster is created:

// Fetch the cluster's user credentials once the resource group and cluster names are known,
// then decode the base64-encoded kubeconfig into plain text.
var kubeconfig = Output.Tuple(resourceGroup.Name, cluster.Name).Apply(names =>
    ListManagedClusterUserCredentials.Invoke(new ListManagedClusterUserCredentialsInvokeArgs
    {
        ResourceGroupName = names.Item1,
        ResourceName = names.Item2,
    })).Apply(creds =>
    {
        var encodedKubeconfig = creds.Kubeconfigs[0].Value;
        var decodedKubeconfig = Encoding.UTF8.GetString(Convert.FromBase64String(encodedKubeconfig));
        return decodedKubeconfig;
    });

Next, I need to assign that kubeconfig to a Kubernetes provider object and create a simple deployment of the service from a YAML file. You could use the same pattern to deploy a Helm chart containing a full WordPress install, for example (see the sketch after the snippet below).

var k8sProvider = new Pulumi.Kubernetes.Provider("k8sprovider" + clusterCount.ToString(),
    new Pulumi.Kubernetes.ProviderArgs
    {
        KubeConfig = kubeconfig
    });

// Apply the Kubernetes YAML deployment to the cluster behind this provider.
var appDeployment = new Pulumi.Kubernetes.Yaml.ConfigFile("aksAppDemoDeployment" + clusterCount.ToString(),
    new Pulumi.Kubernetes.Yaml.ConfigFileArgs
    {
        File = (clusterCount == 0) ? pulumiConfig.Get("testAppEastUs") : pulumiConfig.Get("testAppWestEurope"),
    }, new ComponentResourceOptions { Provider = k8sProvider });

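
As mentioned above, the same provider object can also install a Helm chart instead of a raw YAML file. Here is a minimal sketch (not part of the demo repo; the Bitnami WordPress chart is only an illustration):

// Install a Helm chart (here: Bitnami WordPress) into the cluster targeted by k8sProvider.
var wordpress = new Pulumi.Kubernetes.Helm.V3.Release("wordpress" + clusterCount,
    new Pulumi.Kubernetes.Helm.V3.ReleaseArgs
    {
        Chart = "wordpress",
        Namespace = "default",
        RepositoryOpts = new Pulumi.Kubernetes.Types.Inputs.Helm.V3.RepositoryOptsArgs
        {
            Repo = "https://charts.bitnami.com/bitnami",
        },
    },
    new CustomResourceOptions { Provider = k8sProvider });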

Next, let's test that. Again, all you need to do is run pulumi up, and you should see a preview of the changes that will be applied to the stack, similar to the image below. Pulumi has detected that we are using the Kubernetes config files to create a deployment and a service on each of the two AKS clusters.

(Image: pulumi up preview showing the Kubernetes deployment and service for each cluster)

Afterwards, we should see the service deployed into both clusters, as in the images below, each instance running in a separate cluster and stating where it lives, just for demo purposes.

(Image: the demo page served from the East US cluster)

(Image: the demo page served from the West Europe cluster)

We can test the service by browsing to it over the internet via the created public IP (which we will need later when we create the backend pool for our Azure Front Door), to ensure both instances are deployed. You should see results like the below:
(Image: the demo responses returned from both clusters' public IPs)

The YAML used here deploys a static page with the region wording "West US" (in our demo we actually used West Europe, but you get the idea).
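
In this demo the services' load-balancer IPs are simply copied into stack config (the Front Door code below reads them with pulumiConfig.Require(...)). As a hedged alternative sketch, the IP could also be read back from the deployed Service; the service name "aks-demo-service" is hypothetical and must match whatever your YAML defines:

// Look up the Service created by the YAML ConfigFile and extract its external LoadBalancer IP.
var demoService = appDeployment.GetResource<Pulumi.Kubernetes.Core.V1.Service>("aks-demo-service");
var externalIp = demoService.Apply(svc =>
    svc.Status.Apply(status => status.LoadBalancer.Ingress[0].Ip));
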
Afterwards, we want to provision the Azure Front Door...

Import Code from Other IaC Tools

As a next step, I tried building the Azure Front Door using the classic Azure provider library in Pulumi and ran into some issues. I also tried their Slack channel to ask a few "why is this pointer not pointing" kinds of questions and to test the strength of the community, and I noticed the replies took some precious time. So instead, as a workaround, I thought it would be useful to show how to reverse-engineer a situation like this: I went to the Azure portal, created the Azure Front Door service manually, grabbed the ARM template, and used the online conversion tool, because Pulumi lets you convert Terraform HCL, Kubernetes YAML, and Azure ARM templates into Pulumi programs. Check out Pulumi's conversion tools for that.

After that, I had the code that had been causing me trouble, but noticed it now relies on the Azure Native provider, where you can also use the latest preview features, which highlights the difference between Pulumi and other IaC tools. A single user in the Slack channel also mentioned that I would need to depend on the Azure Native CDN library to achieve what I was trying to do, and I noticed Pulumi has a new game-changer tool to help!

Pulumi AI Generator

In the preview features there is a tool integrated with OpenAI GPT models (3.5, 4, and 3.5-turbo are supported) that takes your infrastructure requirements as text and converts them into one of the supported languages. It is still not quite there, but it did help a bit in answering some of my "how do I do this" questions while creating the Azure Front Door!

(Image: the Pulumi AI generator turning a text prompt into Front Door code)

And here we go: I created a Front Door endpoint (Azure auto-creates a subdomain for it, though we could use a custom domain too) and an origin group with two origins that distribute the weight/traffic evenly between the clusters.

 private static void CreateAzureFrontDoor(ResourceGroup resourceGroup, Pulumi.Config pulumiConfig)
 {

     // Create an Azure Front Door profile
     var profile = new Profile("myFrontDoorProfile", new ProfileArgs
     {
         ResourceGroupName = resourceGroup.Name,
         Location = "Global", // Front Door service is always "Global"
         Sku = new Pulumi.AzureNative.Cdn.V20230701Preview.Inputs.SkuArgs
         {
             Name = "Standard_AzureFrontDoor"
         }
     });

     // Create an Origin Group with two origins
     var originGroup = new AFDOriginGroup("myOriginGroup", new AFDOriginGroupArgs
     {
         ResourceGroupName = resourceGroup.Name,
         ProfileName = profile.Name,
         LoadBalancingSettings = new Pulumi.AzureNative.Cdn.V20230701Preview.Inputs.LoadBalancingSettingsParametersArgs
         {
             SampleSize = 4,
             SuccessfulSamplesRequired = 2
         },
         HealthProbeSettings = new HealthProbeParametersArgs
         {
             ProbePath = "/health",
             ProbeProtocol = ProbeProtocol.Https,
             ProbeRequestType = HealthProbeRequestType.GET,
             ProbeIntervalInSeconds = 60
         }
     });

     // Define the origins
     var origin1 = new AFDOrigin("origin1", new Pulumi.AzureNative.Cdn.V20230701Preview.AFDOriginArgs
     {
         ResourceGroupName = resourceGroup.Name,
         ProfileName = profile.Name,
         OriginGroupName = originGroup.Name,
         OriginHostHeader = pulumiConfig.Require("aks1servicelbIPEastUs"),
         HostName = pulumiConfig.Require("aks1servicelbIPEastUs"),
         HttpPort = 80,
         HttpsPort = 443,
         EnabledState = "Enabled",
         Priority = 1,
         Weight = 500,
         EnforceCertificateNameCheck = false

     });

     var origin2 = new AFDOrigin("origin2", new Pulumi.AzureNative.Cdn.V20230701Preview.AFDOriginArgs
     {
         ResourceGroupName = resourceGroup.Name,
         ProfileName = profile.Name,
         OriginGroupName = originGroup.Name,
         OriginHostHeader = pulumiConfig.Require("aks2servicelbIPWestEurope"),
         HostName = pulumiConfig.Require("aks2servicelbIPWestEurope"),
         HttpPort = 80,
         HttpsPort = 443,
         EnabledState = "Enabled",
         Priority = 2,
         Weight = 500,
         EnforceCertificateNameCheck = false
     });

     // Create a Frontend Endpoint
     var frontendEndpoint = new AFDEndpoint("myFrontendEndpoint", new Pulumi.AzureNative.Cdn.V20230701Preview.AFDEndpointArgs
     {
         ResourceGroupName = resourceGroup.Name,
         ProfileName = profile.Name,
         EnabledState = "Enabled",
         Location = pulumiConfig.Get("cluster2location")
         // Specify other properties as required
     });

     // Create a default route for the origin group
     var route = new Route("defaultRoute", new RouteArgs
     {
         ResourceGroupName = resourceGroup.Name,
         ProfileName = profile.Name,
         EndpointName = frontendEndpoint.Name,
         OriginGroup = new Pulumi.AzureNative.Cdn.V20230701Preview.Inputs.ResourceReferenceArgs
         {
             Id = originGroup.Id
         },
         PatternsToMatch = new List<string> { "/*" },
         ForwardingProtocol = ForwardingProtocol.HttpOnly,
         EnabledState = "Enabled",
         LinkToDefaultDomain = "Enabled"
         // Specify other properties as required
     });
 }

If everything went smoothly, you will see that the CLI detected the new resources that need to be created, and also that nothing changed in the resources from the previous runs.

(Image: pulumi up preview showing only the new Front Door resources)

And voila, Azure Front Door is now ready to drive traffic between the two AKS clusters. One last check is to disable the second cluster to verify that high availability is achieved; you can see the result below.

(Image: Front Door still serving traffic after the second cluster is disabled)

Of course, this is not production-ready; for example, we still need to block direct access to the clusters' public IPs. But it's a start: at the end of the day we have fully managed AKS clusters load-balanced evenly across different Azure regions, the exact same app deployed into each cluster from the same source code, and the ability to provision as many environments as we need to keep our workloads consistent.

You can also fork the demo repo and use it as a starting point for more complicated scenarios.

**What's Next?**
More of Pulumi!
Please share your feedback and insights, and feel free to follow me on LinkedIn.
