Christian Lechner

Posted on Apr 2

SAP BTP, Terraform and Open Policy Agent

#sap #btp #terraform #opa

The Challenge

When demoing the Terraform Provider for SAP BTP and talking about the advantages of Infrastructure as Code, one important aspect is that Terraform does not only provision the infrastructure but also covers the ongoing management of the resources, such as updating configurations. While this is technically straightforward (you update the configuration and apply it), there are some challenges when doing this in a CI/CD setup.

Let us assume that we have a perfect setup in place: the configurations are stored in a source code management system, and we leverage a pull-based workflow in the code repository. After a successful review and merge, the new configuration is in the main branch of the repository and gets applied to the target environment. Applying it via a CI/CD pipeline is easy from a technical perspective. However, what if we missed a glitch in the new configuration and, unfortunately, some central resources get deleted and created again? Let's say, for example, an SAP HANA Cloud instance. Oopsie! This would not be received well, I guess.

How can we handle this? Are there any mechanisms to prevent, or at least to some extent safeguard against, this kind of issue without falling back to a manual workflow? There are. One huge advantage of sticking to (de facto) standards like Terraform is that, first, we are probably not the first ones to come up with this question and, second, there is a huge ecosystem around Terraform that might help us with such challenges.
For this specific scenario the solution is the Open Policy Agent. Let us take a closer look at what the solution could look like.

Note: This topic is not specific to SAP BTP but is an open topic for Terraform in general. While this blog post focuses on SAP BTP, you can easily exchange the examples with other Terraform providers.

The Solution

First some words about the Open Policy Agent. The Open Policy Agent (OPA) is a general-purpose policy engine that enables policy enforcement agnostic of the stack you are using. OPA provides a query language called Rego. Citing the official documentation "[...]Rego queries are assertions on data stored in OPA. These queries can be used to define policies that enumerate instances of data that violate the expected state of the system. [...]"

That sounds like something that can help us. Thinking about our Terraform scenario, it would be great if we could evaluate the result of terraform plan for the new configuration and check whether anything unexpected is happening. If limits that we define in the policy are exceeded (like the number of deleted resources of type XYZ), we stop the automatic processing via the CI/CD pipeline and let a human look at the setup.

Luckily, this scenario is also covered by a Terraform tutorial on the OPA website. The tutorial uses AWS as an example, but making the adjustments for SAP BTP is straightforward. In the following sections we will walk through the different components of the solution and see how it works in action.

The Components

You find the complete code for this example in the GitHub repository https://github.com/btp-automation-scenarios/btp-terraform-opa. In the following sections we take a closer look at the code of the different solution components.

The Repository

For this showcase we will use a repository on GitHub to store our code (the Terraform configuration as well as the OPA policy). The repository is laid out as follows:

The infra folder contains our Terraform configuration. The policy folder contains the OPA policy. And the .github folder, more precisely the workflows folder therein, contains the GitHub Actions configuration for the OPA check run.

In addition, we will use GitHub Actions to run the CI/CD pipeline. We will look at the configuration in the corresponding section.

The Terraform configuration

The basis for the setup is a Terraform configuration that you find in the infra folder of the repository. We do a very simple setup defined in the main.tf file:

###
# Setup of names in accordance with the company's naming conventions
###
locals {
  project_subaccount_name   = "${var.org_name} | ${var.project_name}: CF - ${var.stage}"
  project_subaccount_domain = lower(replace("${var.org_name}-${var.project_name}-${var.stage}", " ", "-"))
  project_subaccount_cf_org = replace("${var.org_name}_${lower(var.project_name)}-${lower(var.stage)}", " ", "_")
}

###
# Creation of subaccount
###
resource "btp_subaccount" "project" {
  name      = local.project_subaccount_name
  subdomain = local.project_subaccount_domain
  region    = lower(var.region)
  labels = {
    "stage"      = ["${var.stage}"],
    "costcenter" = ["${var.costcenter}"]
  }
  usage = "NOT_USED_FOR_PRODUCTION"
}

###
# Assignment of entitlements
###
resource "btp_subaccount_entitlement" "entitlements" {
  for_each = {
    for index, entitlement in var.entitlements :
    index => entitlement
  }

  subaccount_id = btp_subaccount.project.id
  service_name  = each.value.name
  plan_name     = each.value.plan
}

One subaccount and a list of entitlements are created. The entitlements are defined in the variables.tf file:

variable "entitlements" {
  type = list(object({
    name   = string
    plan   = string
    amount = number
  }))
  description = "List of entitlements for the subaccount."
  default = [
    {
      name   = "alert-notification"
      plan   = "standard"
      amount = null
    },
    {
      name   = "SAPLaunchpad"
      plan   = "standard"
      amount = null
    },
    {
      name   = "hana-cloud"
      plan   = "hana"
      amount = null
    },
    {
      name   = "hana"
      plan   = "hdi-shared"
      amount = null
    },
    {
      name   = "sapappstudio"
      plan   = "standard-edition"
      amount = null
    }
  ]
}
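
To make the "glitch" scenario from the introduction concrete: the for_each in main.tf keys each entitlement by its list index. The following Python sketch is not part of the repository, and the removal of the SAPLaunchpad entry is a hypothetical change. It mimics how the instance keys shift when an entry is removed from the middle of the list, and compares this with keying by service name, which is an assumed alternative and not what the configuration does.

```python
# Entitlement names taken from the default value in variables.tf.
before = ["alert-notification", "SAPLaunchpad", "hana-cloud", "hana", "sapappstudio"]
# Hypothetical change: someone removes the second entry from the list.
after = [name for name in before if name != "SAPLaunchpad"]


def index_keyed(items):
    """Mimic `for index, entitlement in var.entitlements : index => entitlement`."""
    return {str(i): item for i, item in enumerate(items)}


def name_keyed(items):
    """Assumed alternative for comparison: key each instance by its service name."""
    return {item: item for item in items}


def diff_keys(old, new):
    """Return (changed, removed, added) instance keys between two for_each maps."""
    changed = sorted(k for k in old.keys() & new.keys() if old[k] != new[k])
    removed = sorted(old.keys() - new.keys())
    added = sorted(new.keys() - old.keys())
    return changed, removed, added


print("index keys:", diff_keys(index_keyed(before), index_keyed(after)))
print("name keys: ", diff_keys(name_keyed(before), name_keyed(after)))
```

With index keys, the instances at keys 1 to 3 now point to a different service and key 4 disappears, so a one-line removal touches four entitlements in the plan. Whether the provider plans those changes as updates or as replacements depends on the resource schema, which is not covered here.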

Not a very complex setup, but enough to show the concept. For the sake of this example, we do not even create any resources; we only execute the check against the plan for the initial setup. Let's move on to the heart of the solution, the OPA policy.

Definition of the Policy

We define the policy in the policy folder as terraform.rego. As Rego requires, we define a package and add some basic imports:

package terraform.analysis

import rego.v1

import input as tfplan

The rego.v1 import is a specific opt-in to the Rego v1 syntax, described in the OPA documentation. In addition, we import the Terraform plan (the result of the terraform plan command, passed as input) under the name tfplan for further processing.

Next, we define some parameters for the policy:

# acceptable score for automated authorization
blast_radius := 30

# weights assigned for each operation on each resource-type
weights := {
    "btp_subaccount": {"delete": 100, "create": 10, "modify": 1},
    "btp_subaccount_entitlement": {"delete": 10, "create": 1, "modify": 5},
}

# Consider exactly these resource types in calculations
resource_types := {"btp_subaccount", "btp_subaccount_entitlement"}

We define a blast radius that represents an acceptable score for automated execution of the Terraform configuration. To be able to calculate the score we define weights for each Terraform operation and resource type.
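
To get a feeling for these numbers, the following Python sketch (not part of the repository) computes how many operations of a single kind would still pass automatically. It assumes that the score must stay strictly below the blast radius, which is how the policy rule further down is written.

```python
# Values copied from the policy parameters above.
BLAST_RADIUS = 30
WEIGHTS = {
    "btp_subaccount": {"delete": 100, "create": 10, "modify": 1},
    "btp_subaccount_entitlement": {"delete": 10, "create": 1, "modify": 5},
}


def max_auto_operations(weight, blast_radius=BLAST_RADIUS):
    """Largest n with n * weight < blast_radius (strict comparison)."""
    return (blast_radius - 1) // weight


for resource_type, operations in WEIGHTS.items():
    for operation, weight in operations.items():
        limit = max_auto_operations(weight)
        print(f"{resource_type:28} {operation:7} weight={weight:3} max={limit}")
```

With these weights, a single subaccount deletion already exceeds the threshold, while up to 29 entitlement creations would still be applied without human review.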

Now we must calculate the score for the Terraform plan that we provided as input. For that we define some functions that help us to calculate the number of creations, deletions, and modifications for each resource type:

# list of all resources of a given type
resources[resource_type] := all if {
    some resource_type
    resource_types[resource_type]
    all := [name |
        name := tfplan.resource_changes[_]
        name.type == resource_type
    ]
}

# number of creations of resources of a given type
num_creates[resource_type] := num if {
    some resource_type
    resource_types[resource_type]
    all := resources[resource_type]
    creates := [res | res := all[_]; res.change.actions[_] == "create"]
    num := count(creates)
}

# number of deletions of resources of a given type
num_deletes[resource_type] := num if {
    some resource_type
    resource_types[resource_type]
    all := resources[resource_type]
    deletions := [res | res := all[_]; res.change.actions[_] == "delete"]
    num := count(deletions)
}

# number of modifications to resources of a given type
num_modifies[resource_type] := num if {
    some resource_type
    resource_types[resource_type]
    all := resources[resource_type]
    modifies := [res | res := all[_]; res.change.actions[_] == "update"]
    num := count(modifies)
}

There is quite some Rego magic going on here, which is explained in the official documentation. As you see, we use the tfplan input to extract the resources and their planned actions, and count the create, update, and delete actions per resource type.
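
For readers who are less familiar with Rego, here is a rough Python equivalent of the counting rules. It is only a sketch: the plan below is a hand-written excerpt that contains just the fields the policy reads (resource_changes, type, change.actions), and example_untracked_type is a made-up resource type used to show that types outside resource_types are ignored.

```python
# Hand-written, minimal stand-in for the JSON produced by `terraform show -json`.
plan = {
    "resource_changes": [
        {"type": "btp_subaccount", "change": {"actions": ["create"]}},
        {"type": "btp_subaccount_entitlement", "change": {"actions": ["create"]}},
        # A replacement shows up as a delete plus a create on the same resource.
        {"type": "btp_subaccount_entitlement", "change": {"actions": ["delete", "create"]}},
        # Hypothetical type that the policy does not track.
        {"type": "example_untracked_type", "change": {"actions": ["create"]}},
    ]
}

RESOURCE_TYPES = {"btp_subaccount", "btp_subaccount_entitlement"}


def count_actions(plan, action):
    """Count, per tracked resource type, the resources whose actions include `action`."""
    counts = {resource_type: 0 for resource_type in RESOURCE_TYPES}
    for resource in plan.get("resource_changes", []):
        if resource["type"] in RESOURCE_TYPES and action in resource["change"]["actions"]:
            counts[resource["type"]] += 1
    return counts


print("creates:", count_actions(plan, "create"))
print("deletes:", count_actions(plan, "delete"))
print("updates:", count_actions(plan, "update"))
```

Note that a replaced resource is counted both as a deletion and as a creation, so it contributes both weights to the score.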

Finally, we define the policy itself:

# Authorization holds if the score for the plan is acceptable
default autoexec := false

autoexec if {
    score < blast_radius
}

# Compute the score for a Terraform plan as the weighted sum of deletions, creations, modifications
score := s if {
    all := [x |
        some resource_type
        crud := weights[resource_type]
        del := crud.delete * num_deletes[resource_type]
        new := crud.create * num_creates[resource_type]
        mod := crud.modify * num_modifies[resource_type]
        x := (del + new) + mod
    ]
    s := sum(all)
}

This snippet calculates the overall score and checks if the score is below the defined blast_radius. If the score is below the threshold the autoexec "decision" is set to true and the Terraform plan could be executed automatically. We also have the score available as a "decision" when evaluating the policy.
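
The same calculation can be sketched in Python to play through a few scenarios without running OPA. This is an illustration only, not the policy itself. The "replace" scenario is hypothetical and ignores the entitlements that depend on the subaccount.

```python
# Values copied from the policy parameters.
BLAST_RADIUS = 30
WEIGHTS = {
    "btp_subaccount": {"delete": 100, "create": 10, "modify": 1},
    "btp_subaccount_entitlement": {"delete": 10, "create": 1, "modify": 5},
}


def score(counts):
    """Weighted sum over all tracked resource types.

    counts maps a resource type to {"delete": n, "create": n, "modify": n};
    missing entries count as zero.
    """
    total = 0
    for resource_type, crud in WEIGHTS.items():
        n = counts.get(resource_type, {})
        total += (
            crud["delete"] * n.get("delete", 0)
            + crud["create"] * n.get("create", 0)
            + crud["modify"] * n.get("modify", 0)
        )
    return total


def autoexec(counts):
    """True only if the score is strictly below the blast radius."""
    return score(counts) < BLAST_RADIUS


# Initial setup from this post: one subaccount and five entitlements are created.
initial = {"btp_subaccount": {"create": 1}, "btp_subaccount_entitlement": {"create": 5}}
# Hypothetical glitch: the subaccount is deleted and created again.
replace = {"btp_subaccount": {"delete": 1, "create": 1}}

for name, counts in [("initial", initial), ("replace", replace)]:
    print(name, score(counts), autoexec(counts))
```

The replacement of a single subaccount alone scores 110, far above the blast radius, so such a plan would be handed over to a human instead of being applied automatically.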

You find the code of the policy in the policy folder of the repository.

Integration with the CI/CD workflow

We must bring the bits and pieces together. We use a GitHub Actions workflow to run the OPA check. The configuration is as follows:

name: Evaluate Open Policy Agent for Terraform

on:
  workflow_dispatch:
    inputs:
      PROJECT_NAME:
        description: "Name of the project"
        required: true
        default: "sample-proj-opa"
      REGION:
        description: "Region for the sub account"
        required: true
        default: "eu10"
      COST_CENTER:
        description: "Cost center for the project"
        required: true
        default: "1234567890"
      STAGE:
        description: "Stage for the project"
        required: true
        default: "DEV"
      ORGANIZATION:
        description: "Organization for the project"
        required: true
        default: "B2B"

env:
  PATH_TO_TFSCRIPT: 'infra'

jobs:
  execute_base_setup:
    name: BTP Subaccount Setup
    runs-on: ubuntu-latest
    steps:
    - name: Check out Git repository
      id: checkout_repo
      uses: actions/checkout@v4

    - name: Setup Terraform
      id: setup_terraform
      uses: hashicorp/setup-terraform@v3
      with:
        terraform_wrapper: false
        terraform_version: latest

    - name: Setup Open Policy Agent
      id: setup_opa
      uses: open-policy-agent/setup-opa@v2
      with:
        version: latest

    - name: Terraform Init
      id: terraform_init
      shell: bash
      run: |
        terraform -chdir=${{ env.PATH_TO_TFSCRIPT }} init -no-color

    - name: Terraform plan
      id: terraform_plan
      shell: bash
      run: |
        export BTP_USERNAME=${{ secrets.BTP_USERNAME }}
        export BTP_PASSWORD=${{ secrets.BTP_PASSWORD }}
        terraform -chdir=${{ env.PATH_TO_TFSCRIPT }} plan -var globalaccount=${{ secrets.GLOBALACCOUNT }} -var region=${{ github.event.inputs.REGION }} -var project_name=${{ github.event.inputs.PROJECT_NAME }} -var stage=${{ github.event.inputs.STAGE }} -var costcenter=${{ github.event.inputs.COST_CENTER }} -var org_name=${{ github.event.inputs.ORGANIZATION }} -no-color --out tfplan.binary
        terraform -chdir=${{ env.PATH_TO_TFSCRIPT }} show -json tfplan.binary > tfplan.json

    - name: Execute OPA policy
      id: execute_opa
      shell: bash
      run: |
        autoexec=$(opa exec --decision terraform/analysis/autoexec --bundle policy/ tfplan.json | jq '.result[].result')
        score=$(opa exec --decision terraform/analysis/score --bundle policy/ tfplan.json | jq '.result[].result')
        echo "Automatic execution possible (true/false): ${autoexec}"
        echo "Score of change: ${score}"

For the sake of the demo, we use the workflow_dispatch event to trigger the workflow. The user can provide input parameters for the Terraform configuration. We use the predefined actions hashicorp/setup-terraform@v3 and open-policy-agent/setup-opa@v2 to set up Terraform as well as OPA. Make sure to set terraform_wrapper to false in the Terraform setup so that the terraform command is called directly rather than through the wrapper script the action otherwise installs.

After initializing the Terraform configuration via terraform init we run the terraform plan command and store the plan via the -out parameter as tfplan.binary. OPA would not be able to work with the plan in this format, so we must convert it to JSON. We do this via the terraform show -json command and store the plan as tfplan.json.

Now the stage is set for the OPA policy execution. We use the opa exec command to run the policy and provide the Terraform plan as input. To be specific, we evaluate two decisions, namely autoexec and score. The opa exec command returns a JSON object. For the sake of readability, we extract the result via jq and print it to the console.
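
The workflow above only prints the two decisions. If the pipeline should actually stop when a human needs to take a look, the decisions have to be turned into an exit code. The following Python sketch shows one possible way. The sample strings are hand-written and only reproduce the shape implied by the jq expression .result[].result; the real opa exec output may contain additional fields. The function names decision and gate are made up for this illustration.

```python
import json


def decision(opa_exec_output):
    """Extract the decision value, like jq '.result[].result' for a single input file."""
    results = json.loads(opa_exec_output)["result"]
    if len(results) != 1:
        raise ValueError(f"expected exactly one result, got {len(results)}")
    return results[0]["result"]


def gate(autoexec_output, score_output):
    """Return (exit_code, message): 0 if the plan may be applied automatically."""
    autoexec = decision(autoexec_output)
    score = decision(score_output)
    if autoexec is True:
        return 0, f"score {score}: automatic execution possible"
    return 1, f"score {score}: manual review required"


# Hand-written sample outputs (assumption, see the text above).
sample_autoexec = '{"result": [{"result": true}]}'
sample_score = '{"result": [{"result": 15}]}'

exit_code, message = gate(sample_autoexec, sample_score)
print(exit_code, message)
```

In a pipeline step, passing the exit code to sys.exit would fail the job whenever autoexec is false, so that a later apply step is not reached.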

That's it. What will the result be? Let's see it in action.

Let's see it in action

Running the GitHub Actions workflow reports a score of 15 for the plan.

So, the initial application of the configuration would be executed automatically, as the score of 15 is below the defined threshold of 30. Let us validate if this is correct:

We have the creation of the subaccount which counts as 10 points. In addition, we create 5 entitlements which count as 5 points. The total score is 15. The code is working as expected. Great!

The Conclusion

Terraform is a great tool for provisioning and managing your infrastructure. Automating the corresponding processes is technically quite easy, but the management side requires some additional considerations. The Open Policy Agent comes in quite handy here as a general-purpose policy engine that can help with overcoming challenges in the automation of the setup. The integration of the tools is quite straightforward, as the example in this blog post shows. I am not an expert in Rego: the language is obviously quite powerful, but it is another topic to learn. To be honest, I was quite happy to have a running example that I could take from the tutorials area of OPA and adapt with minor adjustments.

While the sample I presented here is very simple, it clearly shows how to deal with the challenge of safeguarding your automated infrastructure management.

With that ... happy Terraforming!
