Caleb Lemoine

Posted on Jan 23, 2021 • Edited on Jan 30

How to use feature toggles with Terraform

#terraform #devops #tutorial #cloud

Reinventing the wheel for the better

I've seen several articles on this subject, including the official HashiCorp doc, but since there wasn't one on dev.to, I took the typical "Well I need to rewrite this whole thing because it sucks" software engineer approach 😄.

What are feature toggles?

Feature toggles allow you to either enable or disable software functionality via a boolean value (true/false).

Why would I use a feature toggle?

The two biggest reasons in my opinion are:

Safety
Have options

By making new functionality toggleable, you can dark launch new features without impacting existing software. I work at a large SaaS company and we promote this practice extensively. Because features can be turned on or off selectively, we roll out experimental features to beta customers without impacting anyone else, all from the same codebase. Another benefit is cost optimization, especially when it comes to infrastructure software like Terraform. Maybe I want to enable load balancing or clustering only for customers who pay for it. This way I can offer tiered services and tailor deployments to customers' needs.

Diving in

The remainder of this article assumes you have prior knowledge of or experience with Terraform.

Here's the example repo we'll be following along with

Let's say we have a module to provision nginx web servers on DigitalOcean. It would be nice to have a toggle to enable/disable load balancing to control costs in certain environments. What would that look like?

Well, ideally I would like to have a module that has the ability to create a load balancer or not, like this:

module "web_servers" {
  source                 = "./modules/web_servers"
  instance_count         = var.instance_count
  load_balancing_enabled = var.load_balancing_enabled
}

A load_balancing_enabled flag would be pretty useful to give the consumer options.
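For that flag to exist, the module has to declare it as an input variable. Here's a minimal sketch of what the module's variables.tf could contain, assuming the variable names from the module call above. The descriptions and defaults are illustrative assumptions, and the example repo may declare them differently:

```hcl
# modules/web_servers/variables.tf (illustrative sketch, not copied from the repo)

variable "instance_count" {
  description = "Number of web server droplets to create."
  type        = number
  default     = 1 # assumed default
}

variable "load_balancing_enabled" {
  description = "Feature toggle: create a load balancer in front of the droplets."
  type        = bool
  default     = false # assumed default: off unless explicitly enabled
}
```

Defaulting the toggle to false keeps the extra cost opt-in.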

How to create the toggle

There are a few key components to load balancing: we need multiple servers, and we need a load balancer that is aware of all of the servers provisioned.

Let's create a digitalocean_droplet resource that has a variable for how many instances to create and a load balancer with all of the instances behind it.

resource "digitalocean_droplet" "web" {
  count     = var.instance_count
  name      = "web-${count.index}"
  size      = var.instance_size
  image     = var.image
  region    = var.region
  user_data = var.user_data == "" ? file("${path.module}/files/cloud-init.yaml") : var.user_data
}

resource "digitalocean_loadbalancer" "public" {
  count  = var.load_balancing_enabled ? 1 : 0
  name   = "web-servers-loadbalancer"
  region = var.region

  forwarding_rule {
    entry_port     = 80
    entry_protocol = "http"

    target_port     = 80
    target_protocol = "http"
  }

  healthcheck {
    port     = 22
    protocol = "tcp"
  }

  droplet_ids = digitalocean_droplet.web[*].id
}

Count

When we tell the Terraform module that we want multiple instances, how does it name them? Terraform supports a cool meta-argument called count. A meta-argument is simply a language feature that can be applied to any Terraform resource, independent of the provider (DO, AWS, GCP, Azure, etc.). The count meta-argument creates that many instances of the resource, much like a for loop, and exposes the current iteration as count.index, which starts at 0.

resource "digitalocean_droplet" "web" {
  count = var.instance_count
  name  = "web-${count.index}"
  # ... remaining arguments unchanged
}

So if instance_count were 2, 2 resources (servers) would be created and named like so:

web-0
web-1
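To see the naming pattern without creating any real infrastructure, the same index-based interpolation can be mimicked with a for expression over range(). This is only a sketch of the idea, not how count is implemented, and the droplet_names output is a hypothetical name:

```hcl
variable "instance_count" {
  type    = number
  default = 2
}

# range(2) yields [0, 1], mirroring the values count.index would take.
output "droplet_names" {
  value = [for i in range(var.instance_count) : "web-${i}"]
}
```

With the default of 2, droplet_names contains web-0 and web-1.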

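Any time the count meta-argument is supplied, Terraform tracks the resulting instances as a list in its state, addressed by index like so (the bracketed form is the one Terraform prints in the apply output later in this article):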

digitalocean_droplet.web[0] or digitalocean_droplet.web.0
digitalocean_droplet.web[1] or digitalocean_droplet.web.1

This is an important concept when it comes to feature toggling in Terraform: if we want to selectively turn things on and off, we need to use the count meta-argument on every resource we want to toggle so that we can set it to either 1 or 0, i.e. create the thing or not.

Let's look at the next use of count:

resource "digitalocean_loadbalancer" "public" {
  count  = var.load_balancing_enabled ? 1 : 0
  # ... remaining arguments unchanged
}

Here we are using a ternary operator as a conditional expression in Terraform. The above code reads as:

if (load_balancing_enabled) {
    create()
} else {
    doNothing()
}
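The condition in a conditional expression can be any boolean expression, so a toggle can be combined with other inputs. Here's a hedged sketch, assuming we only want a load balancer when there is more than one droplet to balance across. That rule and the load_balancer_count local are illustrative and not part of the example repo:

```hcl
variable "instance_count" {
  type    = number
  default = 1
}

variable "load_balancing_enabled" {
  type    = bool
  default = true
}

locals {
  # 1 only when the toggle is on AND there is more than one droplet; otherwise 0.
  load_balancer_count = (var.load_balancing_enabled && var.instance_count > 1) ? 1 : 0
}

output "load_balancer_count" {
  value = local.load_balancer_count
}
```

With these defaults the output is 0, because a single droplet leaves nothing to balance. A resource would then use count = local.load_balancer_count.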

Creating the resources

Let's play with some of these parameters and see how Terraform responds.

To follow along, you'll need a DigitalOcean account so you can create an API token.

# Set digitalocean token for authentication
export DIGITALOCEAN_TOKEN=XXXXX
# Clone the repo
git clone https://github.com/circa10a/terraform-feature-toggle-example.git/
# Change directory into the repo
cd terraform-feature-toggle-example/
# Install web_servers module and digitalocean provider
terraform init

If you look at the files in the repo, we have a default.auto.tfvars file which makes it easy to change configurations.

Here's the default:

instance_count = 1
load_balancing_enabled = false

This will create 1 droplet and no load balancer. Here's the output of terraform apply:

❯ terraform apply -auto-approve

module.web_servers.digitalocean_droplet.web[0]: Creating...
module.web_servers.digitalocean_droplet.web[0]: Still creating... [10s elapsed]
module.web_servers.digitalocean_droplet.web[0]: Still creating... [20s elapsed]
module.web_servers.digitalocean_droplet.web[0]: Still creating... [30s elapsed]
module.web_servers.digitalocean_droplet.web[0]: Creation complete after 34s [id=227920620]

Apply complete! Resources: 1 added, 0 changed, 0 destroyed.

Outputs:

droplet_ips = [
  "167.172.28.220",
]
load_balancer_ip = ""

Outputs

Now we have Terraform making decisions about what to create using our toggle variable. We told our web_servers module to create 1 instance and no load balancer. Let's change that and enable load balancing by setting load_balancing_enabled = true in our default.auto.tfvars file:

instance_count = 1
load_balancing_enabled = true

And now run terraform apply again:

❯ terraform apply -auto-approve

module.web_servers.digitalocean_droplet.web[0]: Refreshing state... [id=227920620]
module.web_servers.digitalocean_loadbalancer.public[0]: Creating...
module.web_servers.digitalocean_loadbalancer.public[0]: Still creating... [10s elapsed]
module.web_servers.digitalocean_loadbalancer.public[0]: Still creating... [20s elapsed]
module.web_servers.digitalocean_loadbalancer.public[0]: Still creating... [30s elapsed]
module.web_servers.digitalocean_loadbalancer.public[0]: Still creating... [40s elapsed]
module.web_servers.digitalocean_loadbalancer.public[0]: Still creating... [50s elapsed]
module.web_servers.digitalocean_loadbalancer.public[0]: Still creating... [1m0s elapsed]
module.web_servers.digitalocean_loadbalancer.public[0]: Still creating... [1m10s elapsed]
module.web_servers.digitalocean_loadbalancer.public[0]: Creation complete after 1m19s [id=fc36d0bf-12e5-4d7c-a9c2-06f3859588c5]

Apply complete! Resources: 1 added, 0 changed, 0 destroyed.

Outputs:

droplet_ips = [
  "167.172.28.220",
]
load_balancer_ip = "144.126.248.10"

Awesome! 🎉 🎉 🎉

Our load balancer is now being created because we set load_balancing_enabled to true. But wait, the outputs are different.

When load_balancing_enabled was false:

Outputs:

droplet_ips = [
  "167.172.28.220",
]
load_balancer_ip = ""

And when load_balancing_enabled was true:

Outputs:

droplet_ips = [
  "167.172.28.220",
]
load_balancer_ip = "144.126.248.10"

The load_balancer_ip output now has a value instead of an empty string. That's for two reasons.

1. If we look at modules/web_servers/outputs.tf:

output "droplet_ips" {
  value = digitalocean_droplet.web[*].ipv4_address
}

output "load_balancer_ip" {
  value = var.load_balancing_enabled ? digitalocean_loadbalancer.public[0].ip : ""
}

Our module conditionally outputs the load balancer's IP based on the var.load_balancing_enabled variable. If it's true, the output is the value of digitalocean_loadbalancer.public[0].ip; otherwise it's an empty string ("").

Back to count

So why does the resource reference in that output have a [0]? If you recall our overview of the count meta-argument, any resource that has count set is exposed as a list, so in this case we're forced to use [0] to reference the first (and only) instance of the digitalocean_loadbalancer.public resource and read its ip attribute.

2. Our primary outputs.tf in the root of the project outputs the values above from the module like so:

output "droplet_ips" {
  value = module.web_servers.droplet_ips
}

output "load_balancer_ip" {
  value = module.web_servers.load_balancer_ip
}
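Back to the [0] index in the module's load_balancer_ip output: it is only safe because the same toggle guards it. Two alternative ways to write that output are sketched below, assuming they live in the module next to the digitalocean_loadbalancer.public resource. The output names are hypothetical, try() needs Terraform 0.12.20 or later, and one() needs 0.15 or later:

```hcl
# Falls back to "" when public[0] does not exist (count = 0).
output "load_balancer_ip_try" {
  value = try(digitalocean_loadbalancer.public[0].ip, "")
}

# one() returns the single element of a list, or null when the list is empty.
output "load_balancer_ip_one" {
  value = one(digitalocean_loadbalancer.public[*].ip)
}
```

Note that the one() variant yields null rather than an empty string when the load balancer is disabled, so consumers of the output would need to handle that.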

Recap

Thanks to count and conditional expressions, we can make module configuration in Terraform pretty intuitive.

Let's not forget about all the other added benefits of feature toggling in Terraform:

Decouple deploy from release
Enable customization for operators as well as consumers
Save costs on resources you may or may not need
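As one way to put the customization and cost points into practice, each environment can carry its own variable file. Here's a sketch, assuming a hypothetical production.tfvars in the repo root:

```hcl
# production.tfvars (hypothetical file name)
# Production gets several droplets behind a load balancer.
instance_count         = 3
load_balancing_enabled = true
```

Passing it with terraform apply -var-file=production.tfvars takes precedence over the values in default.auto.tfvars, so the defaults can stay cheap.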

Don't forget to run terraform destroy to remove the resources we created! Otherwise you'll see the costs on your bill!

References

GitHub Repo

Additional Resources

Spacelift has an excellent article for more Terraform language features and their usage. Check out Terraform Functions, Expressions, and Loops by Spacelift!
