Panchanan Panigrahi

Introduction to AWS S3 Remote Backend with Terraform

Infrastructure as Code (IaC) has changed how teams provision and manage infrastructure, and Terraform is one of the most widely used tools for it. Using Terraform well depends on managing its state file, which records the resources Terraform manages and serves as its source of truth about them.

In this guide, we’ll explore the importance of the Terraform state file, why local storage can be risky, and how using AWS S3 as a remote backend offers a scalable, secure, and collaborative solution.

What is the Terraform State File?

The Terraform state file, terraform.tfstate, is an essential JSON file that maintains the mapping between your Terraform configuration and the actual resources in your infrastructure. It contains:

  • Resource IDs: Unique identifiers of resources.
  • Attributes: Resource properties (e.g., IP addresses, configurations).
  • Dependencies: Relationships between resources, ensuring proper order during changes.

This file enables Terraform to manage your infrastructure efficiently, understanding what exists and what needs updating, creating, or deleting.

Why Is the State File So Important?

  1. Single Source of Truth: It provides Terraform with a clear and consistent view of your infrastructure.
  2. Change Detection: When you run terraform plan or terraform apply, Terraform compares your configuration with the state (refreshed against the real resources) to work out what must be created, updated, or deleted.
  3. Dependency Management: Terraform uses the state file to maintain resource dependencies, ensuring changes happen in the correct sequence.

Without an accurate state file, Terraform wouldn’t know how to manage your infrastructure, leading to potential inconsistencies and misconfigurations.


Backends for Storing Terraform State

Terraform offers two main ways to store the state file:

  1. Local Backend: Stores the state file on your local machine or a shared file system.
  2. Remote Backend: Stores the state file on a remote service, such as AWS S3, ensuring centralized access and better collaboration.

Let's explore why using the remote backend is usually a better choice.


Limitations of Local Backend

The local backend, while simple, introduces several challenges:

  1. Collaboration Issues: The state file lives on one machine, so teammates cannot see or update it unless the file is copied around by hand.
  2. Risk of Data Loss: If your local machine crashes or the state file is accidentally deleted, you lose your infrastructure’s state, leading to potential rework and errors.
  3. No Shared Locking: The local backend only locks the file on the machine where Terraform runs. When teammates each work from their own copy, nothing stops conflicting changes, and the copies drift apart or overwrite one another.
  4. Manual Backups: Regular backups are critical, but manually handling them is error-prone and inconvenient.
  5. Security Concerns: State is stored as plain-text JSON on disk, without the access controls, encryption, and audit logging that a remote store can provide.
  6. Scalability Issues: As the team and the infrastructure grow, passing state files between machines becomes unworkable.

Because of these limitations, the local backend isn’t ideal for production or collaborative environments.


Why Not Store State Files in Version Control Systems (VCS)?

It might seem convenient to use Git or another VCS for tracking state files, but this practice introduces serious problems:

  1. Sensitive Data Exposure: The state file often contains sensitive information, such as access keys or credentials, which can be compromised if stored in a VCS.
  2. Concurrency Issues: Git doesn’t offer a locking mechanism, so two people can run Terraform from stale copies of the state and then push conflicting versions of it.
  3. Inefficient Handling: State files can be large and change on every apply, making them a poor fit for version control tracking.

For these reasons, using a dedicated remote backend is a more secure and efficient option.


The Benefits of Using AWS S3 as a Remote Backend

AWS S3 is one of the most popular choices for Terraform's remote backend due to its:

  1. Reliability: S3 is designed for very high durability (99.999999999%) and high availability, so the state file is unlikely to be lost or unreachable.
  2. Access Control: You can use AWS Identity and Access Management (IAM) to control access to the state file.
  3. Versioning and Recovery: S3’s versioning capability allows you to recover previous versions of your state file if needed.
  4. State Locking: By integrating with AWS DynamoDB, you can enable state locking to prevent simultaneous modifications.
  5. Scalability and Security: AWS S3 scales effortlessly and offers server-side encryption, making it ideal for managing your infrastructure state securely.

Step-by-Step Guide to Setting Up AWS S3 as a Remote Backend

We'll configure the S3 backend in two steps:

  1. Step 1: Create and configure an S3 bucket and a DynamoDB table using a local backend.

    • This ensures the necessary infrastructure (S3 for state storage and DynamoDB for state locking) is set up before migrating, providing a secure and reliable environment.
  2. Step 2: Migrate to using the S3 backend in our Terraform configuration.

    • This step transfers state management to the S3 bucket, enabling centralized state storage and collaboration, along with state locking through DynamoDB.

Directory Structure

Terraform-AWS-S3-Backend/
├── global/
│   └── state.tf
└── README.md

Feel free to check out my GitHub repository: Terraform-AWS-S3-Backend. It contains the complete code and configurations for setting up an AWS S3 backend with Terraform. If you encounter any difficulties or have questions, this resource may help you troubleshoot or understand the process better!


Step 1: Create and Configure the S3 Bucket and DynamoDB Table with Local State

In this first step, we define the S3 bucket and DynamoDB table while Terraform still keeps its state on the local machine. Here's the state.tf file:

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.56"
    }
  }
}

provider "aws" {
  region = "us-east-1"
}

resource "aws_s3_bucket" "terraform_state_bucket" {
  bucket        = "panchanandevops-tf-state"
  force_destroy = true

  lifecycle {
    prevent_destroy = true
  }
}

resource "aws_s3_bucket_versioning" "terraform_bucket_versioning" {
  bucket = aws_s3_bucket.terraform_state_bucket.id

  versioning_configuration {
    status = "Enabled"
  }
}

resource "aws_s3_bucket_server_side_encryption_configuration" "terraform_state_crypto_conf" {
  bucket = aws_s3_bucket.terraform_state_bucket.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "AES256"
    }
  }
}

resource "aws_s3_bucket_public_access_block" "terraform_state" {
  bucket                  = aws_s3_bucket.terraform_state_bucket.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

resource "aws_dynamodb_table" "terraform_state_lock_table" {
  name         = "terraform-state-lock-table"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }
}

Explanation of the Terraform Configuration

1. Main S3 Bucket Configuration: aws_s3_bucket

Purpose:

The aws_s3_bucket resource creates the S3 bucket that will store your Terraform state file. This is the core resource that defines the actual storage location for your state.

Arguments:

  • Bucket Name (bucket):

    • bucket = "panchanandevops-tf-state" defines the name of the S3 bucket that will store your Terraform state file. Bucket names must be globally unique across all AWS accounts, so replace panchanandevops-tf-state with a name of your own.
  • Force Destroy (force_destroy):

    • force_destroy = true enables the bucket to be deleted even if it contains objects. This setting can be useful if you want to quickly remove the bucket during the development or testing phase, as it allows deletion without manually clearing out objects.
  • Lifecycle Configuration (lifecycle):

    • prevent_destroy = true adds a layer of protection against accidental deletion through Terraform. Any plan that would destroy the bucket, including terraform destroy, fails with an error until this setting is changed. While it is in place, force_destroy has no practical effect; it only matters later, during cleanup.

Code Reference:

resource "aws_s3_bucket" "terraform_state_bucket" {
  bucket        = "panchanandevops-tf-state"
  force_destroy = true

  lifecycle {
    prevent_destroy = true
  }
}
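
Because the bucket name has to change before this configuration works in another account, one option is to lift it into an input variable. The sketch below is a variant of the resource above, not the author's configuration: the variable name state_bucket_name and the validation pattern are assumptions made for illustration. Keep in mind that a Terraform backend block cannot reference variables, so the name still has to be written literally there.

```hcl
# Hypothetical input variable; not part of the original state.tf.
variable "state_bucket_name" {
  description = "Globally unique name for the Terraform state bucket."
  type        = string

  validation {
    # Rough check of the S3 naming rules: 3-63 characters, lowercase letters,
    # digits, dots, or hyphens, starting and ending with a letter or digit.
    condition     = can(regex("^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$", var.state_bucket_name))
    error_message = "Use 3-63 lowercase letters, digits, dots, or hyphens."
  }
}

# Variant of the bucket resource that takes its name from the variable.
resource "aws_s3_bucket" "terraform_state_bucket" {
  bucket        = var.state_bucket_name
  force_destroy = true

  lifecycle {
    prevent_destroy = true
  }
}
```

The value could then be supplied at apply time, for example through a terraform.tfvars file or the -var option.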

2. Versioning: aws_s3_bucket_versioning

  • Purpose: This resource enables versioning for the S3 bucket, meaning that every time the Terraform state file changes, a new version is stored instead of overwriting the existing one.
  • Benefit:
    • Accidental Deletion Recovery: If you accidentally delete or corrupt the state file, you can easily restore a previous version.
    • Tracking Changes: Allows you to track changes made to the state file over time, providing a history of your infrastructure's evolution.

Code Reference:

resource "aws_s3_bucket_versioning" "terraform_bucket_versioning" {
  bucket = aws_s3_bucket.terraform_state_bucket.id

  versioning_configuration {
    status = "Enabled"
  }
}
  • Here, the status = "Enabled" line activates versioning, ensuring that every modification to the state file results in a new version being saved.
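
With versioning enabled, old versions of the state file accumulate until something removes them. If that becomes a concern, a lifecycle rule can expire noncurrent versions after a retention period. The following is a minimal sketch, assuming a 90-day retention; the resource name terraform_state_versions, the rule id, and the 90-day figure are illustrative choices, not part of the original configuration.

```hcl
# Hypothetical addition: expire old state versions after a retention period.
resource "aws_s3_bucket_lifecycle_configuration" "terraform_state_versions" {
  bucket = aws_s3_bucket.terraform_state_bucket.id

  # Versioning should be in place before the rule is applied.
  depends_on = [aws_s3_bucket_versioning.terraform_bucket_versioning]

  rule {
    id     = "expire-old-state-versions"
    status = "Enabled"

    # An empty filter applies the rule to every object in the bucket.
    filter {}

    # Only noncurrent (superseded) versions expire; the latest state is kept.
    noncurrent_version_expiration {
      noncurrent_days = 90 # assumed retention period
    }
  }
}
```

Choose a retention period long enough to cover how far back you might realistically need to roll the state file.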

3. Encryption: aws_s3_bucket_server_side_encryption_configuration

  • Purpose: This resource configures server-side encryption for the S3 bucket using the AES256 encryption algorithm. It ensures that all data stored in the bucket, including your Terraform state file, is encrypted at rest.
  • Benefit:
    • Data Security: Protects sensitive data in your state file (such as passwords, keys, and configurations) from unauthorized access.
    • Compliance: Helps meet security and compliance requirements, as many organizations mandate encryption for data stored in the cloud.

Code Reference:

resource "aws_s3_bucket_server_side_encryption_configuration" "terraform_state_crypto_conf" {
  bucket = aws_s3_bucket.terraform_state_bucket.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "AES256"
    }
  }
}
  • The sse_algorithm = "AES256" setting selects server-side encryption with S3-managed keys (SSE-S3), which uses the Advanced Encryption Standard (AES) with 256-bit keys. This is a widely accepted, strong encryption standard.
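
Teams that need to control or audit the encryption key themselves sometimes use a KMS key instead of S3-managed keys. The sketch below shows how the same resource might look in that case. It is an alternative to the configuration above, not something the article's setup requires, and the aws_kms_key resource and its settings are assumptions.

```hcl
# Hypothetical customer-managed key for state encryption.
resource "aws_kms_key" "terraform_state" {
  description             = "Encrypts Terraform state objects"
  enable_key_rotation     = true
  deletion_window_in_days = 30
}

# Alternative to the AES256 rule: default encryption with the KMS key.
resource "aws_s3_bucket_server_side_encryption_configuration" "terraform_state_crypto_conf" {
  bucket = aws_s3_bucket.terraform_state_bucket.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm     = "aws:kms"
      kms_master_key_id = aws_kms_key.terraform_state.arn
    }

    # S3 Bucket Keys reduce the number of requests made to KMS.
    bucket_key_enabled = true
  }
}
```

With this variant, anyone who runs Terraform also needs permission to use the key, so plan the key policy alongside the bucket permissions.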

4. Access Control: aws_s3_bucket_public_access_block

  • Purpose: This resource blocks public access to your S3 bucket, ensuring that the Terraform state file is not exposed to the internet.
  • Benefit:
    • Enhanced Security: Prevents anonymous or public access, so the state file cannot be read or modified by anyone who has not been granted access explicitly.
    • Data Privacy: Ensures sensitive infrastructure details remain private, preventing leaks or breaches.

Code Reference:

resource "aws_s3_bucket_public_access_block" "terraform_state" {
  bucket                  = aws_s3_bucket.terraform_state_bucket.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}
  • Explanation of Each Parameter:
    • block_public_acls = true: Rejects requests that try to set a public access control list (ACL) on the bucket or its objects.
    • block_public_policy = true: Rejects any bucket policy that would grant public access.
    • ignore_public_acls = true: Makes S3 ignore any public ACLs that already exist on the bucket or its objects.
    • restrict_public_buckets = true: If the bucket ever has a public policy, limits access to AWS service principals and authorized users in the bucket owner's account.

Together, these settings prevent the bucket from being exposed publicly through ACLs or bucket policies.
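
Blocking public access does not decide which identities inside your account may read or write the state; that is the job of IAM, mentioned earlier as one of S3's benefits. The following is a sketch of a least-privilege policy for people or pipelines that run Terraform against this backend. The policy name and statement ids are made up for illustration, the object path assumes the key global/s3/terraform.tfstate used in the backend configuration, and the action list follows what the S3 backend is documented to need, so verify it against the documentation for your Terraform version.

```hcl
# Hypothetical policy document granting only what the S3 backend needs.
data "aws_iam_policy_document" "terraform_state_access" {
  statement {
    sid       = "ListStateBucket"
    actions   = ["s3:ListBucket"]
    resources = [aws_s3_bucket.terraform_state_bucket.arn]
  }

  statement {
    sid       = "ReadWriteStateObject"
    actions   = ["s3:GetObject", "s3:PutObject"]
    resources = ["${aws_s3_bucket.terraform_state_bucket.arn}/global/s3/terraform.tfstate"]
  }

  statement {
    sid = "ManageStateLock"
    actions = [
      "dynamodb:DescribeTable",
      "dynamodb:GetItem",
      "dynamodb:PutItem",
      "dynamodb:DeleteItem",
    ]
    resources = [aws_dynamodb_table.terraform_state_lock_table.arn]
  }
}

# Managed policy that can be attached to the users or roles running Terraform.
resource "aws_iam_policy" "terraform_state_access" {
  name   = "terraform-state-access" # assumed name
  policy = data.aws_iam_policy_document.terraform_state_access.json
}
```

Attaching this policy to a dedicated role, rather than to individual users, keeps access to the state file easy to review.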


5. State Locking: aws_dynamodb_table

  • Purpose: The DynamoDB table is used for state locking, which prevents multiple users or processes from making changes to the state file at the same time.
  • Benefit:
    • Prevents Conflicts: State locking ensures that only one terraform apply or terraform plan command can run at a time, avoiding race conditions and potential corruption of the state file.
    • Collaborative Safety: Enables multiple team members to work on Terraform configurations simultaneously without overwriting each other's changes.

Code Reference:

resource "aws_dynamodb_table" "terraform_state_lock_table" {
  name         = "terraform-state-lock-table"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }
}
  • Explanation of Each Component:
    • name = "terraform-state-lock-table": Specifies the name of the DynamoDB table.
    • billing_mode = "PAY_PER_REQUEST": Charges based on the actual usage of the table, making it cost-effective since the state locking feature doesn’t require constant access.
    • hash_key = "LockID": Defines LockID as the table's partition key. The S3 backend expects this exact attribute name and uses it to identify the lock for each state file.
    • attribute block: Declares the LockID attribute of type S (string), which is necessary for identifying the lock record.

By integrating this DynamoDB table, you ensure that your state file remains consistent and protected, even in a collaborative environment.

Initializing and Applying Terraform Infrastructure Locally

  1. Initialize Terraform:

    • Run terraform init in the directory that contains state.tf (global/ in the structure above). This downloads the AWS provider plugin and prepares your working directory.
  2. Plan Infrastructure:

    • Execute terraform plan to preview the changes Terraform will make. This step ensures that your configuration is correct before applying any changes.
  3. Apply Infrastructure:

    • If the plan looks good, run terraform apply to create the infrastructure. When prompted, type yes to confirm and apply the changes.

With this, Step 1 is complete: the bucket and table now exist in AWS, while the state that describes them is still stored locally. Next, we'll move to Step 2.
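
Since the backend block in Step 2 must repeat the bucket and table names exactly, it can help to have Terraform print them after the apply. Here is a small sketch using output blocks; the output names are illustrative and not part of the original state.tf.

```hcl
# Hypothetical outputs that echo the names needed by the backend block.
output "state_bucket_name" {
  description = "Name of the S3 bucket that will hold the state file."
  value       = aws_s3_bucket.terraform_state_bucket.bucket
}

output "lock_table_name" {
  description = "Name of the DynamoDB table used for state locking."
  value       = aws_dynamodb_table.terraform_state_lock_table.name
}
```

After terraform apply finishes, these values appear under Outputs and can be shown again at any time with terraform output.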


Step 2: Add the S3 Backend Configuration

After creating the bucket and DynamoDB table, update your Terraform configuration to include the remote backend settings:

terraform {
  # Remote backend: state lives in S3, with locking through DynamoDB.
  backend "s3" {
    bucket         = "panchanandevops-tf-state"
    key            = "global/s3/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "terraform-state-lock-table"
    encrypt        = true
  }

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.56"
    }
  }
}

provider "aws" {
  region = "us-east-1"
}

resource "aws_s3_bucket" "terraform_state_bucket" {
  bucket        = "panchanandevops-tf-state"
  force_destroy = true

  lifecycle {
    prevent_destroy = true
  }
}

resource "aws_s3_bucket_versioning" "terraform_bucket_versioning" {
  bucket = aws_s3_bucket.terraform_state_bucket.id

  versioning_configuration {
    status = "Enabled"
  }
}

resource "aws_s3_bucket_server_side_encryption_configuration" "terraform_state_crypto_conf" {
  bucket = aws_s3_bucket.terraform_state_bucket.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "AES256"
    }
  }
}

resource "aws_s3_bucket_public_access_block" "terraform_state" {
  bucket                  = aws_s3_bucket.terraform_state_bucket.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}


resource "aws_dynamodb_table" "terraform_state_lock_table" {
  name         = "terraform-state-lock-table"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"
  attribute {
    name = "LockID"
    type = "S"
  }
}

Explanation of the Backend Block:

terraform {

  backend "s3" {
    bucket         = "panchanandevops-tf-state" 
    key            = "global/s3/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "terraform-state-lock-table"
    encrypt        = true
  }

}
  • bucket: Name of the S3 bucket to store the state file.
  • key: Path to the state file within the bucket.
  • region: AWS region where the bucket is located.
  • dynamodb_table: Table used for state locking.
  • encrypt: Enables server-side encryption of the state file at rest in S3.
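
Because key is just a path inside the bucket, the same bucket and lock table can serve several Terraform configurations, as long as each one uses its own key. The sketch below shows what the backend block of a second, separate configuration might look like. The key services/web/terraform.tfstate is a hypothetical example, not something from the original repository.

```hcl
# Backend block for a different Terraform configuration (hypothetical project).
terraform {
  backend "s3" {
    bucket         = "panchanandevops-tf-state"
    key            = "services/web/terraform.tfstate" # unique per configuration
    region         = "us-east-1"
    dynamodb_table = "terraform-state-lock-table"
    encrypt        = true
  }
}
```

Each configuration gets its own state object and its own lock entry, so work on one does not block the other. Newer Terraform releases also document S3-native locking through a use_lockfile argument as an alternative to the DynamoDB table; check the S3 backend documentation for the version you run before relying on it.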

Migrating the State File from Local to Remote S3 Bucket

Step 1 created the S3 bucket and DynamoDB table, and the backend block above tells Terraform to use them. Now, it's time to migrate our state file from the local machine to the S3 bucket.

  1. Initiate Migration:

    • Run terraform init -migrate-state. This command reinitializes Terraform and detects that you want to move the state file to the remote S3 backend.
  2. Confirm Migration:

    • When prompted, type yes to confirm the migration. This action will transfer your local state file to the configured S3 bucket, ensuring that your infrastructure's state is now managed remotely.

By completing this step, you have successfully migrated your state management to a secure and centralized S3 backend.

Check the S3 Bucket and State File in the AWS Console

[Screenshots: the S3 bucket, and the terraform.tfstate object stored in it]


Cleanup Our Infrastructure

To effectively clean up our infrastructure after migrating to the S3 backend, we need to follow a reverse process of the steps we took during the setup. This ensures that we remove the resources without leaving any orphaned components.

Remove the S3 Backend Configuration

  1. Edit Configuration:

    • Open your Terraform configuration file and remove the block that defines the S3 backend. This will prevent Terraform from attempting to manage the state in the S3 bucket.
  2. Reinitialize Terraform:

    • Run the command terraform init -migrate-state. Terraform reinitializes the working directory and, once you confirm, copies the state from the S3 bucket back to your local machine.

Adjust S3 Bucket Lifecycle Settings

  1. Modify Lifecycle Block:
    • Change the lifecycle setting in the aws_s3_bucket resource. Update the prevent_destroy attribute from true to false. This adjustment allows the bucket to be destroyed when you run the terraform destroy command.

Your updated aws_s3_bucket resource should look like this:

   resource "aws_s3_bucket" "terraform_state_bucket" {
     bucket        = "panchanandevops-tf-state"
     force_destroy = true

     lifecycle {
       prevent_destroy = false
     }
   }

Destroy the Infrastructure

  1. Execute Destruction:
    • Now that the configuration is ready, run terraform destroy -auto-approve. This command will initiate the destruction of all resources managed by Terraform, including the S3 bucket and any other associated infrastructure. The -auto-approve flag skips the confirmation prompt, allowing the destruction to proceed without additional input.

By following these steps, you ensure that all resources are cleaned up properly without any leftover components in your AWS account.

Check the S3 Bucket and DynamoDB Table in the AWS Console

[Screenshots: the S3 bucket list and the DynamoDB table list]


Conclusion

Using AWS S3 as a Terraform remote backend offers a robust, secure, and scalable solution for managing your infrastructure state files. By combining S3 with DynamoDB for state locking, you ensure a reliable and collaborative environment that is ideal for production-grade infrastructure.

This guide demonstrated step-by-step instructions on how to set up AWS S3 as a remote backend, emphasizing why it's a superior alternative to local storage or version control systems. Adopting a remote backend like AWS S3 is a crucial step in building resilient and scalable infrastructure with Terraform.

Top comments (2)

Jack

Thanks for writing this article! I quickly scanned it to see how you approached the "chicken and egg" problem when managing infrastructure with Terraform—specifically, the challenge of needing an S3 bucket and DynamoDB table to manage state, but also wanting everything in Terraform from the start.

I know there are solutions like using Terragrunt, and at one company, we used CloudFormation to set up these initial resources. Another possible approach is to manually create the S3 bucket and DynamoDB table, then import them into Terraform afterward. It seems like most companies do what you detailed in your article, though, where they just have the S3 bucket and DynamoDB table outside of Terraform.

Panchanan Panigrahi

Thank you for your comment! You're absolutely right about the different ways to tackle the "chicken and egg" problem with Terraform state management. Many teams opt to set up the S3 bucket and DynamoDB table outside of Terraform or use tools like Terragrunt, but I prefer having everything managed within Terraform from the start for consistency and full control.

And yes, I’m a big fan of Terragrunt as well! I plan to write a blog on how Terragrunt approaches this issue, offering a different perspective on solving the "chicken and egg" challenge. Stay tuned!
