ericksoen

Posted on Feb 26, 2020

Teaching Terraform from the ground up...

#terraform #aws #infrastructure #operations

Overview

I'm going to write this ...getting started with Terraform guide the way I wish I had learned Terraform. A lot of the tutorials I first used emphasized "declarative definitions for the infrastructure we want to exist and Terraform will figure out how to create it" (that's an exact quote). When you understand the abstraction that Terraform manages, those are really powerful tools. But when you're first getting started, opaque AND abstract can be a significant barrier to entry; at least it was for me.

For the duration of this tutorial, we'll go through multiple iterations to create an AWS S3 bucket in the us-east-1 region that complies with business requirements regarding versioning (enabled) and cost-allocation (via tags). We'll also deploy all our infrastructure using the Grant Least Privilege model endorsed by AWS since this will help elucidate some of the interesting ways creating declarative infrastructure intersects with AWS permissions.

By the end of this demo, we'll have created this infrastructure twice using both the AWS CLI and Terraform. The AWS CLI will help ground the abstraction in more straightforward and obvious API calls. In a subsequent step, we use Terraform to create the same infrastructure. If you're a more visual learner, you might find that the AWS Console is an easier tool to manage than the AWS CLI, which is A-OK in my book—just make sure your Console user and Terraform user have the same IAM access levels.

Environment setup

This tutorial assumes some familiarity with AWS and that users already have an account configured if they want to follow along. If you're new to AWS, the introduction to Terraform guide published by Gruntwork has a helpful walkthrough for creating a new account on the AWS free tier.

If you haven't installed the AWS CLI or Terraform previously, go ahead and download them as appropriate.

On my workstation, I'm running the following versions of each tool:

$ aws --version
aws-cli/1.16.106 Python/3.6.0 Windows/10 botocore/1.12.96

$ terraform --version
Terraform v0.12.20
+ provider.aws v2.49.0

The S3 API used by both tools is well-established and mature, so you hopefully won't encounter any issues related to application version. If you do, make sure to rule out a version inconsistency as an underlying cause.

User Permission Setup

We'll simplify user creation by performing it exclusively using the AWS CLI. You can download the IAM user policy that we'll use as our starter set of permissions from a GitHub snippet. If you're curious what the same policy would look like in Terraform, I've included a second GitHub snippet with the starter policy.

aws iam create-user --user-name terraform-user

aws iam put-user-policy --user-name terraform-user --policy-name least-privilege --policy-document file://policy.json

We'll be periodically adding more permissions to the policy document and then re-running the aws iam put-user-policy ... CLI command, so make sure to download and keep the file someplace handy.

After creating the user and adding the policy, log in to the AWS console, navigate to Identity and Access Management (IAM), and then find the user that you just created. Switch to the Security credentials tab and click Create access key.

Use the following template to add a new profile named TfUser to your AWS credentials file (just make sure to replace the {{name}} template values with your own):

[TfUser]
aws_access_key_id = {{access_key_value}}
aws_secret_access_key = {{secret_access_key_value}}

Creating infrastructure using the AWS CLI

Let's get started and create an S3 bucket using the AWS CLI. Bucket names do need to be globally unique, so you may need to add some extra characters at the end of the bucket name in order to create it successfully.

To simplify things, I'm going to set my bucketName as a variable to keep things consistent (if you're not using a bash terminal, make sure to check what the appropriate syntax is to create and use variables).

bucketName=globally-unique-bucket-name-99999
aws s3api create-bucket --bucket $bucketName --region us-east-1 --profile TfUser

Now that we've created our first AWS resource, we can start work on complying with the second requirement: versioning should be enabled on the bucket. To do that, we can make a different S3 API call via the AWS CLI:

aws s3api put-bucket-versioning --bucket $bucketName --versioning-configuration Status=Enabled --profile TfUser

Of course! We initially get a permissions error since we haven't added s3:PutBucketVersioning to our IAM policy. Once we add that permission to our policy and re-deploy the policy, we are able to enable bucket versioning, allowing us to keep multiple variants of an object in the same bucket.

To comply with the final organizational mandate to add tags to all AWS resources to facilitate cost reporting, we'll go ahead and add a Dept/Engineering key-value pair to our bucket.

aws s3api put-bucket-tagging --bucket $bucketName --tagging 'TagSet=[{Key=Dept,Value=Engineering}]' --profile TfUser

You'll most likely see an s3:PutBucketTagging permission error, so go ahead and update the policy document with the missing permission and re-deploy.

And you're done! You made three different API calls to create an S3 bucket, enable versioning, and add tags. However, if you need to maintain that infrastructure over time (adding new bucket tags, for example), making individual API calls won't scale as your business does.

Terraform

Let's examine what it would take to manage infrastructure for the same S3 bucket using Terraform. We'll deploy infrastructure that satisfies all the same business requirements—go ahead and pick a new S3 bucket name since bucket names are globally unique.

One of the first things to do is to provide credentials for Terraform to use when it invokes the AWS API. In Terraform we do this using providers:

provider "aws" {
    region = "us-east-1"
    profile = "TfUser"
}

We'll add these credentials as well as the infrastructure definitions from later steps in a file named main.tf.

In the above code sample we pass credentials to Terraform using a profile name, but there are a multitude of other ways available, e.g., environment variables or passing them as variables. If you need to interact with more than one AWS account or use different permission sets to deploy your infrastructure, you can define multiple provider references, although that's outside the scope of this tutorial, so I'll provide a link to the Terraform guide instead.
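For a rough idea of what multiple provider references look like, here is a minimal sketch. The alias name west, the us-west-2 region, and the second bucket name are all assumptions made up for illustration; nothing later in this tutorial depends on them.

```hcl
# Default provider: used by any resource that doesn't name a provider.
provider "aws" {
    region  = "us-east-1"
    profile = "TfUser"
}

# A second configuration of the same provider, selected by its alias.
# The alias and region here are hypothetical.
provider "aws" {
    alias   = "west"
    region  = "us-west-2"
    profile = "TfUser"
}

# A resource opts in to the aliased provider explicitly.
resource "aws_s3_bucket" "west" {
    provider = aws.west
    bucket   = "globally-unique-bucket-name-west-999999" # hypothetical name
}
```

Resources that omit the provider argument keep using the default (un-aliased) configuration, so adding an alias doesn't change anything you've already defined.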

We've configured credentials for Terraform to use so it's finally time to create our first piece of declarative infrastructure, which you can do with the code sample below.

resource "aws_s3_bucket" "b" {
    bucket = "globally-unique-bucket-name-999999"
}

After we initialize Terraform (terraform init), review the planned infrastructure changes (terraform plan), and apply the changes (terraform apply), we receive a 403 Forbidden error when Terraform tries to read the S3 bucket we just created.

Debugging your first error

This is one of those times where the abstraction that Terraform manages can work against users, especially ones just learning the tool. This can be mitigated, however, with a little extra context. The recommended behavior for most Terraform resources is that the resource's Create and Update functions should return the result of its Read function. What this means, practically, is that for every Create or Put permission we provide, we will want a corresponding Get, List or Head permission depending on the vagaries of the AWS API.

And that's exactly what we find if we look at the S3 bucket resource implementation: the Read function invokes the HeadBucket endpoint, which requires an IAM permission we have not yet provided to our least-privileged user.

However, even after adding s3:HeadBucket to the permitted actions in our policy.json and re-deploying the policy, we still see the forbidden error. The AWS CLI behaves the same way (aws s3api head-bucket --bucket globally-unique-bucket-name --profile TfUser), because the HeadBucket API call is, oddly, authorized by the s3:ListBucket action. Add s3:ListBucket to the permitted actions and re-run terraform apply.

More errors?!?

We're past the initial error, but now we see an even more cryptic error: Error: error getting S3 Bucket CORS configuration: Access Denied: Access Denied. If you recall from our resource declaration, we don't have any reference to CORS configuration, so why is it failing? Unlike the AWS CLI, which more often than not has a one-to-one mapping between API calls and IAM permissions, Terraform resources frequently have a one-to-many mapping between resources and API calls (Terraform is abstracting away many of the underlying implementation details). For example, the Read function for the S3 bucket resource makes roughly 10 calls to different S3 API endpoints.

Rather than list out all the discrete actions in our IAM policy, we can add s3:GetBucket* and s3:Get*Configuration, which make use of the IAM wildcard syntax. To figure out the exact IAM actions to add, you can either add them one at a time as you debug each successive AccessDenied error you receive or sort through the Terraform resource source code (I did the latter and still managed to miss a few).
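If you'd rather keep the policy in Terraform than in policy.json, the actions collected so far could be sketched roughly as follows. This is not the starter policy from the snippet: s3:CreateBucket and the bucket ARN pattern are my assumptions, and the list may well be incomplete. s3:HeadBucket is left out because, as noted in the comments, the HeadBucket call is authorized by s3:ListBucket.

```hcl
# Sketch only: apply this with credentials that are allowed to manage IAM,
# not with the least-privilege TfUser profile.
data "aws_iam_policy_document" "least_privilege" {
    statement {
        effect = "Allow"

        actions = [
            "s3:CreateBucket",        # assumed to be in the starter policy
            "s3:PutBucketVersioning",
            "s3:PutBucketTagging",
            "s3:ListBucket",          # authorizes the HeadBucket API call
            "s3:GetBucket*",          # wildcard: GetBucketPolicy, GetBucketAcl, ...
            "s3:Get*Configuration",   # wildcard: GetLifecycleConfiguration, ...
        ]

        # Hypothetical ARN pattern; adjust it to match your own bucket names.
        resources = ["arn:aws:s3:::globally-unique-bucket-name-*"]
    }
}

# Attaches the rendered JSON to the user created earlier with the AWS CLI.
resource "aws_iam_user_policy" "least_privilege" {
    name   = "least-privilege"
    user   = "terraform-user"
    policy = data.aws_iam_policy_document.least_privilege.json
}
```

Because the inline policy name matches the one used with aws iam put-user-policy, applying this should replace that policy's contents rather than add a second policy to the user.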

With the updated policy in place, you should be able to successfully deploy your first infrastructure 🤞. If you still run into issues, I'd love to hear about them in the comments so I can warn future users.

Enable bucket versioning and tagging

The previous sections already teased this behavior, but many Terraform resources make multiple API calls to create the infrastructure for a single resource. That is to say, we don't create one resource for the S3 bucket, a different resource to enable bucket versioning, and a third to add bucket tags. Instead, we define a single resource that handles all three. Let's go ahead and update our Terraform to add bucket versioning.

resource "aws_s3_bucket" "b" {
    bucket = "globally-unique-bucket-name-999999"

    versioning {
        enabled = true
    }
}

Remember that Terraform is managing the abstraction over the API, so if you guessed that part of the Create and Update method behaviors for the S3 resource includes making an API call to the PutBucketVersioning endpoint, you're right! You should already have that permission in place from the CLI demo, but go ahead and add that action to your IAM policy if you don't for some reason.

In the Terraform plan output, we see that we are going to perform an in-place update of the existing resource (certain resource property changes will force you to destroy and re-create resources):

$ terraform apply
Plan: 0 to add, 1 to change, 0 to destroy
...
versioning {
    ~ enabled    = false -> true
    mfa_delete = false
}

With the updated policy in place, you should be able to deploy your infrastructure changes without any errors. Starting to get the hang of it? If not, we'll repeat the same process one more time with tags to reinforce the behavior.

Update your resource definition one more time to include the tags property along with the required value. Similarly, you'll want to make sure your IAM actions include the s3:PutBucketTagging permission before deploying your infrastructure changes.

resource "aws_s3_bucket" "b" {
    bucket = "globally-unique-bucket-name-999999"

    versioning {
        enabled = true
    }

    tags = {
        "Dept" = "Engineering"
    }
}

The plan output again shows that we'll be updating our resource in place to add our bucket tags. Approve the changes to modify your infrastructure one last time.

Terraform State

There's still one key feature of Terraform that most tutorials cover that we haven't discussed: Terraform state files. I'm going to provide a short-form version since you can likely defer a more complete understanding until you really need it (when you do, the official Terraform documentation is quite good, as is most of their documentation). For now, think of your state file as a JSON file that can be stored locally and is responsible for mapping your configuration to resources in the real world.

To demonstrate this, we're going to temporarily suspend bucket versioning on this bucket via the AWS Console. Once you've done that, run terraform plan. The first step in the plan life cycle reads the current configuration of your real world resources via the S3 API: in our case, it returns versioning=false since you just suspended bucket versioning. The next step compares that real world configuration against the infrastructure configuration where versioning=true that you defined using Terraform resources. Any differences between the real world and your resource configuration are displayed in your execution plan output.

Because we can version control our infrastructure configuration files (sometimes referred to as Infrastructure as Code or IaC), this becomes a powerful way to define repeatable, scalable processes to create and manage infrastructure.

Wrap Up

Although S3 buckets are one of the cheaper AWS resources to leave lying around, it's always prudent to clean things up and put them away when you're done using them. Run one final Terraform command, terraform destroy, to remove the resources managed by Terraform. Destroying the bucket calls the S3 DeleteBucket endpoint, so add s3:DeleteBucket to your permission statement before you run it; if you hit any further AccessDenied errors, add the actions they name.

There's a lot more complexity that we haven't covered here. We've already alluded to the fact that managing IAM resources like users, roles, and policies is hard (and perhaps more so using Terraform resource definitions).

We also haven't touched on any of the multi-cloud benefits of Terraform, although as you've seen, the resources you've defined so far are intimately connected to the AWS API. If you wanted to deploy the same serverless function code to AWS, Azure, and GCP, the resource definitions, credentials, and providers you use to manage them will differ substantially.

Finally, we haven't covered Terraform modules, a handy way to create reusable building blocks for your infrastructure, or the Terraform dependency graph.

Those are all beyond the scope of this getting started guide, but if you found this introduction helpful and wanted to learn more about any of those next-level topics, I'd love to hear about it in the comments.

Additional Resources

The following resources were either explicitly referenced in the tutorial above, e.g., the AWS CLI and Terraform resource documentation, or were valuable resources when I first started learning Terraform.

Acknowledgements

Special thanks to Clarissa Sobota and George Brauneis who tested and provided invaluable feedback on some early drafts of this post.

Top comments (2)

Iain Samuel McLean Elder • Feb 3 '21

Hi, Erick, thank you so much for this. I'd like to see more articles like this about Terraform!

Adding s3:HeadBucket alone would not have helped you because the IAM action s3:HeadBucket does not exist :-)

I analyzed the source code of the S3 bucket resource implementation, commit 798ac2f8fad69fe661373d8b4ce1d3117e78cd01.

The function resourceAwsS3BucketRead makes these function calls. They are called directly from the function body except where noted:

HeadBucket (also via GetBucketRegionWithClient)
GetBucketPolicy
GetBucketAcl
GetBucketCors
GetBucketWebsite
GetBucketVersioning
GetBucketLifecycleConfiguration
GetBucketReplication
GetObjectLockConfiguration (via readS3ObjectLockConfiguration)
GetBucketTagging (via S3BucketListTags)

Note that these are Go function names. They correspond to the S3 API methods. The corresponding IAM actions are sometimes named differently.

Most of the API methods use IAM actions with the same name, but there are exceptions:

HeadBucket: s3:ListBucket
GetBucketLifecycleConfiguration: s3:GetLifecycleConfiguration
GetBucketReplication: s3:GetReplicationConfiguration
GetBucketCors: s3:GetBucketCORS (spelling differs only in case, and IAM is case-insensitive)
GetObjectLockConfiguration: s3:GetBucketObjectLockConfiguration

You can figure out the mapping between the S3 API methods and the IAM actions from these pages in the AWS documentation.

S3 API Method list: docs.aws.amazon.com/AmazonS3/lates...
S3 IAM action list: docs.aws.amazon.com/service-author...

It's a shame this information isn't in the Terraform documentation to save us having to reverse engineer it!

Iain Samuel McLean Elder • Feb 3 '21

I've created a new issue for the AWS provider to document these permissions so we don't have to reverse engineer them any more.
github.com/hashicorp/terraform-pro...
