Large organisations typically have many Azure subscriptions, sometimes hundreds. The challenge is to secure and manage them without slowing down the agility the cloud offers. What's needed is a process to stamp out subscriptions with security and monitoring baked in, using DevOps pipelines.
If you have an Azure problem, if no one else can help, and if you can engage them, maybe you can hire a dev crew (after the A-Team).
Our dev crew were recently tasked to make this a reality using our twin philosophies of "engineering excellence" and "code with". Code with means we work alongside our customers in a joint team, pair programming against an agreed backlog. Engineering excellence is all about details:
- use of pipelines to deploy any code
- use of PRs with at least two reviewers
- solid documentation
- testing rigor, in this case with Terratest and Go
... and many more!
We had the following design goals:
- If the rules change, the revised rules are applied to all subscriptions, not just new ones
- Everything can be deployed from a pipeline
- A mechanism to test that the rules work by applying them to a test subscription
- Default logging enabled for all resources
- A central management subscription with:
  - a key vault for tenant-based secrets
  - a central container registry
  - a central key vault for tenant-based secrets about any of the managed subscriptions
  - Security Center to show alerts across the tenant
- Each managed subscription has just a small set of default resources:
  - key vault
  - log storage
  - custom roles
  - a budget
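To make the test-subscription goal concrete, here is a minimal Go sketch of a check that a subscription carries the default resources listed above. It is only an illustration: the resource labels and the `MissingDefaults` helper are hypothetical, and the project itself tested real deployments with Terratest rather than with an in-memory check like this one.

```go
package main

import "fmt"

// defaultResources lists the per-subscription baseline from the design goals.
// The labels are hypothetical names, not real Azure resource types.
var defaultResources = []string{"key-vault", "log-storage", "custom-roles", "budget"}

// MissingDefaults returns the baseline resources that are absent from a
// subscription, in baseline order.
func MissingDefaults(present []string) []string {
	have := make(map[string]bool, len(present))
	for _, r := range present {
		have[r] = true
	}
	var missing []string
	for _, r := range defaultResources {
		if !have[r] {
			missing = append(missing, r)
		}
	}
	return missing
}

func main() {
	// Hypothetical inventory of two managed subscriptions.
	subs := []struct {
		Name      string
		Resources []string
	}{
		{"sub-test", []string{"key-vault", "log-storage", "custom-roles", "budget"}},
		{"sub-legacy", []string{"key-vault", "budget"}},
	}
	for _, s := range subs {
		fmt.Printf("%s missing: %v\n", s.Name, MissingDefaults(s.Resources))
	}
}
```

A check of this shape could run against every subscription, old or new, which is the point of the first design goal.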
We used several repos for this:
- Tenant. The Terraform to create the management subscription.
- Subscription. The Terraform to create a subscription.
- Policies. All policies were described in Terraform in a standard way and stored here.
- Custom Roles. Azure custom roles described in Terraform are stored here.
- Docs. We wanted to control the approval of designs and architecture, so we used a PR process to control docs going into the main branch of this repo.
We used a number of pipelines in the project:
- CI/CD to create the management subscription
- CI to create a managed subscription for testing
- CD to create a managed subscription in live, triggered from a Jira ticket via an Azure Function
- CI/CD for applying policy changes to the Management Group
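Applying policy changes at the Management Group is what lets a revised rule reach existing subscriptions as well as new ones, because policy assignments are inherited by everything beneath the scope they are assigned to. The Go sketch below models that inheritance in memory. All scope and policy names are hypothetical, and the hierarchy is assumed to be a simple tree.

```go
package main

import (
	"fmt"
	"sort"
)

// Scope is a node in a hypothetical hierarchy: a management group or a
// subscription. Parent is "" for the root.
type Scope struct {
	Name   string
	Parent string
}

// EffectivePolicies walks from a scope up to the root and collects every
// policy assigned along the way. The result is sorted for stable output.
func EffectivePolicies(scopes map[string]Scope, assignments map[string][]string, name string) []string {
	seen := map[string]bool{}
	visited := map[string]bool{} // guards against accidental cycles
	for cur := name; cur != "" && !visited[cur]; cur = scopes[cur].Parent {
		visited[cur] = true
		for _, p := range assignments[cur] {
			seen[p] = true
		}
	}
	out := make([]string, 0, len(seen))
	for p := range seen {
		out = append(out, p)
	}
	sort.Strings(out)
	return out
}

func main() {
	scopes := map[string]Scope{
		"mg-root":      {Name: "mg-root"},
		"mg-managed":   {Name: "mg-managed", Parent: "mg-root"},
		"sub-existing": {Name: "sub-existing", Parent: "mg-managed"},
		"sub-new":      {Name: "sub-new", Parent: "mg-managed"},
	}
	assignments := map[string][]string{"mg-managed": {"cis-benchmark"}}

	// A revised rule is assigned once, at the management group.
	assignments["mg-managed"] = append(assignments["mg-managed"], "require-diagnostics")

	for _, sub := range []string{"sub-existing", "sub-new"} {
		fmt.Println(sub, EffectivePolicies(scopes, assignments, sub))
	}
}
```

Both subscriptions pick up the new rule without being touched individually, which is why a single pipeline at the group level is enough.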
We meet the customer where they are and use whatever tooling they are comfortable with, the only constraint being that we only work on Azure projects. In this case we did our CI/CD in Azure DevOps, but in such a way that all the flow control was handled in bash, to make it as easy as possible to port the pipelines to other tooling. The customer was already using Terraform for IaC.
The key to securing Azure is Policies. These can be grouped into Blueprints (if you're using ARM) or Initiatives, which work for Terraform; for example, we created one Policy initiative for the CIS benchmarks. Subscriptions can also be grouped into Management Groups. Both kinds of grouping make all of this a lot simpler.
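As a rough picture of what an initiative is, the Go sketch below groups a list of policy definition IDs into one document and prints it as JSON. The field names follow the general shape of an Azure policy set definition as we understand it and should be treated as an assumption. The IDs and the `NewInitiative` helper are hypothetical, and in the project itself initiatives were described in Terraform rather than built this way.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// PolicyRef points at one policy definition included in an initiative.
type PolicyRef struct {
	PolicyDefinitionID string `json:"policyDefinitionId"`
}

// InitiativeProperties holds the grouped policies and display metadata.
type InitiativeProperties struct {
	DisplayName       string      `json:"displayName"`
	PolicyType        string      `json:"policyType"`
	PolicyDefinitions []PolicyRef `json:"policyDefinitions"`
}

// Initiative is a named group of policy definitions.
type Initiative struct {
	Name       string               `json:"name"`
	Properties InitiativeProperties `json:"properties"`
}

// NewInitiative groups the given policy IDs, preserving their order.
func NewInitiative(name, displayName string, policyIDs []string) Initiative {
	refs := make([]PolicyRef, 0, len(policyIDs))
	for _, id := range policyIDs {
		refs = append(refs, PolicyRef{PolicyDefinitionID: id})
	}
	return Initiative{
		Name: name,
		Properties: InitiativeProperties{
			DisplayName:       displayName,
			PolicyType:        "Custom",
			PolicyDefinitions: refs,
		},
	}
}

func main() {
	// Hypothetical policy definition IDs, for illustration only.
	ini := NewInitiative("cis-benchmark", "CIS benchmark (sketch)", []string{
		"/providers/Microsoft.Authorization/policyDefinitions/example-disk-encryption",
		"/providers/Microsoft.Authorization/policyDefinitions/example-diagnostic-logs",
	})
	b, err := json.MarshalIndent(ini, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(b))
}
```

Assigning one initiative at a Management Group then covers every policy it contains, instead of one assignment per policy per subscription.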
We also allowed users to create Policy Exceptions. For example, a contributor might want to create a VM without an encrypted disk. If they do this, it will be flagged in Security Center and they will have to justify it to their security team.
We took 10 sprints to get this done and documented, and then took two weeks out to share what we learnt more widely. To share it, we have contributed to the Terraform AzureRM provider and to the Cloud Adoption Framework, and have written up the more interesting parts of this project:
- Contributing to the Azure Terraform Provider
- Test your Azure policies in parallel
- Terraform module for custom Azure policies
- Auto-start collection of Azure diagnostic telemetry
- Terraform Bootstrap & Backend State in Azure
- Standardize resource names in Terraform scripts
- Bypassing policies in Azure
- Calling Azure APIs with the REST Extension for VS Code