DEV Community

Donald Sebastian Leung

Introduction to Infrastructure as Code with Terraform and Packer

Assets for this article can be found on GitHub.

In this post, we will learn what Infrastructure as Code is, followed by a hands-on session configuring AWS EC2 instances with Terraform and Packer.

Prerequisites

It is assumed that you:

  • Are comfortable with the Linux command line and troubleshooting
  • Have an AWS account
  • Are familiar with basic AWS services such as Amazon EC2
  • Are familiar with the basics of the AWS CLI
  • Have the AWS CLI configured with an IAM user that has sufficient permissions to create and manage the resources involved in this tutorial
  • Are fully aware that following this tutorial may incur monetary costs and you are solely responsible for any such costs

An understanding of core cloud computing concepts (IaaS, PaaS, SaaS etc.) and the underlying technology (hypervisors, hardware virtualization, OS virtualization etc.) would be helpful. You may follow this tutorial with another cloud provider of your choice, in which case the key concepts presented in this article still apply, but note that non-trivial modifications to the instructions outlined in this article may be required.

With the prerequisites addressed, let's get started!

Infrastructure as Code (IaC)

Reference: Introduction to Infrastructure as Code with Terraform

Infrastructure as Code, often abbreviated IaC, refers to a declarative approach to managing infrastructure, such as physical or virtual machines, containers and cloud instances, by describing the desired components and state in configuration files, which are then passed to an automation tool or suite of tools to be applied to the target infrastructure.

"Declarative" means that the configuration files describe what we want the infrastructure to look like, rather than how to achieve it. When we then apply these configuration files to our infrastructure using automated tool(s), it is the tool's responsibility to figure out how to achieve the desired state, such that the DevOps practitioner(s) do not have to worry about such details.

Another benefit of describing the desired state of infrastructure in configuration files is that we can apply to them the same techniques typically used for software (i.e. code), such as version control. This was not practical when infrastructure was managed by hand, hence the name IaC.

Terraform

Reference: Introduction to Infrastructure as Code with Terraform

Terraform is an IaC solution by HashiCorp which works with multiple cloud platforms, as well as popular tools such as Docker and Kubernetes. The version used in this article (v1.1.2) is open source under the Mozilla Public License (MPL) 2.0; HashiCorp has since relicensed newer releases under the Business Source License.

Installing Terraform

Reference: Install Terraform

In this and subsequent sections, we'll assume you are working in a Linux environment. If not, you may have to adapt the instructions accordingly.

The latest Terraform release at the time of writing is v1.1.2. Download the compressed archive containing the Terraform binary for the latest version, unzip and delete the archive:

$ wget https://releases.hashicorp.com/terraform/1.1.2/terraform_1.1.2_linux_amd64.zip
$ unzip terraform_1.1.2_linux_amd64.zip
$ rm terraform_1.1.2_linux_amd64.zip

If on ARM, replace amd64 above with arm64 instead.

Now move the binary somewhere in your PATH, e.g. /usr/local/bin:

$ sudo mv terraform /usr/local/bin

Confirm it is properly installed by querying the version:

$ terraform --version
Terraform v1.1.2
on linux_amd64

If you see output similar to the above, you're good to go.

Print usage instructions:

$ terraform --help
Usage: terraform [global options] <subcommand> [args]

The available commands for execution are listed below.
The primary workflow commands are given first, followed by
less common or more advanced commands.
...

Creating an EC2 instance and associated resources with Terraform, and connecting to it

Reference: AWS Provider

Remember our last tutorial, where we manually created an EC2 instance plus associated resources by executing AWS CLI commands and connected to it? We're going to do the same thing here, except this time we'll do it declaratively with Terraform.

Here, I have my AWS CLI configured to use an IAM administrator, and my default region is us-east-1. If your configuration differs from mine, e.g. you use a different default region, you might not be able to use the exact same AMI as I do, since AMI IDs are specific to a region, so adapt the instructions accordingly. Remember that you can discover AMIs in your region by running the aws ec2 describe-images command with appropriate filters.

First create a dedicated directory ec2-basic and cd into it:

$ mkdir ec2-basic
$ cd ec2-basic

Terraform configuration files use their own format known as HashiCorp Configuration Language (HCL) and the file extension is .tf. Create a Terraform configuration file main.tf with the following contents:

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 3.0"
    }
  }
}

provider "aws" {
  region = "us-east-1"
}

How does Terraform know how to work with multiple cloud providers and open-source platforms? It does so through providers, which could be provided by:

  • HashiCorp themselves (official)
  • Verified HashiCorp partners such as Amazon (verified)
  • The larger Terraform community (community)

Some older providers may also have been archived for various reasons.

In the file above, we specify in the terraform { ... } block that we need to install the aws provider from hashicorp/aws with an appropriate version. The next block provider "aws" { ... } contains configuration for our AWS provider. Here, we specify the region to use: us-east-1.

Now initialize our project with terraform init:

$ terraform init

If successful, one of the lines should read:

Terraform has been successfully initialized!

Now save the following in a new file default-vpc.tf:

resource "aws_default_vpc" "default" {}

This just specifies that Terraform should adopt the default VPC for management. The second label, default, is the local name by which we refer to this resource elsewhere in our configuration; it is not a name assigned to the VPC itself. Normally, Terraform creates the specified resources if they do not already exist, but as mentioned in the documentation for default VPCs, the resource type aws_default_vpc is special: all AWS accounts created after 2013-12-04 already have a default VPC, so Terraform just detects the existing default VPC and manages it instead of trying to create a new one (which wouldn't make sense anyway).

To actually have Terraform manage it, we need to apply the configuration:

$ terraform apply

terraform apply applies all configuration files in the current directory whose filenames end in .tf. Hence we created a dedicated directory to store our .tf files.

The above command prints out a summary of the resources it will create, modify and/or delete to reach the desired configuration, and asks for confirmation before applying the changes. Type yes (all lowercase) and press Enter to confirm. In the end, you should see output similar to the following:

Apply complete! Resources: 1 added, 0 changed, 0 destroyed.

Now save the following in a file security-group.tf:

resource "aws_security_group" "ec2_basic" {
  name        = "ec2_basic"
  description = "Security group for basic EC2 instance"
  vpc_id      = aws_default_vpc.default.id

  ingress {
    description      = "SSH from anywhere"
    from_port        = 22
    to_port          = 22
    protocol         = "tcp"
    cidr_blocks      = ["0.0.0.0/0"]
  }

  egress {
    from_port        = 0
    to_port          = 0
    protocol         = "-1"
    cidr_blocks      = ["0.0.0.0/0"]
    ipv6_cidr_blocks = ["::/0"]
  }
}

This instructs Terraform (once applied) to create an aws_security_group with name ec2_basic if it does not exist already. Within the resource "aws_security_group" "ec2_basic" { ... } block, we have ingress { ... } and egress { ... } sub-blocks for specifying the rules for inbound and outbound network traffic respectively. Here, we allow inbound TCP connections on port 22 (the default port for SSH) from anywhere, and allow all outbound traffic. Also note this particular line:

vpc_id      = aws_default_vpc.default.id

aws_default_vpc.default references the default VPC resource we "created" earlier. It exports the id attribute, so aws_default_vpc.default.id refers to the ID of the default VPC resource. We use this ID to associate our security group with the default VPC. This has the benefit that we don't have to hardcode the VPC ID, so the configuration is portable across different AWS accounts, whose default VPCs likely have different IDs.
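
The same reference syntax extends to input variables. As a minimal sketch, and not part of the original configuration, the SSH rule could read its source range from a variable instead of a literal. The variable name ssh_allowed_cidr is an assumption made up for illustration; the block would replace the contents of security-group.tf rather than sit alongside it:

```hcl
# Hypothetical variable: not part of the original article's configuration.
variable "ssh_allowed_cidr" {
  description = "CIDR block allowed to connect over SSH"
  type        = string
  default     = "0.0.0.0/0" # same behavior as the hardcoded rule
}

resource "aws_security_group" "ec2_basic" {
  name        = "ec2_basic"
  description = "Security group for basic EC2 instance"
  vpc_id      = aws_default_vpc.default.id # reference to another resource

  ingress {
    description = "SSH from the configured range"
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = [var.ssh_allowed_cidr] # reference to the variable
  }

  egress {
    from_port        = 0
    to_port          = 0
    protocol         = "-1"
    cidr_blocks      = ["0.0.0.0/0"]
    ipv6_cidr_blocks = ["::/0"]
  }
}
```

With this in place, running terraform apply -var 'ssh_allowed_cidr=203.0.113.7/32' would restrict SSH to a single address (203.0.113.7 is a documentation placeholder, so substitute your own public IP).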

Before we apply, let's try another command plan, which computes what changes are needed to reach the desired configuration without actually applying them (basically, a dry run):

$ terraform plan
...
Terraform will perform the following actions:

  # aws_security_group.ec2_basic will be created
...

Here, it says that the aws_security_group.ec2_basic resource will be created.

Now that we've confirmed everything looks right, let's apply the changes. Remember to answer yes and press Enter:

$ terraform apply

Next, save the following in a file key-pair.tf defining the public key for our key pair. Remember to replace the public key material with your public key:

resource "aws_key_pair" "macbook_air_ubuntu" {
  key_name   = "macbook-air-ubuntu"
  public_key = "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQCzSI7uKPqNJTCO9uaqFvSdk+A/vmkwJ/ef9hjku8BEau+NTd89K54g4sZwwVsnEkM9ZkgQ7TqaHs6qhG0xJoc82h8w34uzVglHsSAsY1D2FtuhT4dtwUq7+E2ZFj454rH23rrb3haFdROBizOIMCs01WH0muNKhU9a1O9EQZ42oG6boJqbhw3jWErUFZn4MhtN1FZDx1U2YY6B2VfnfeJvJG819WDor+MypScoD+/Z5Vjt/H4pgMxJSlLk6HFSuop7HHnSH3UzsT+VamxVCOvuN0mx6xm7kOvYp4HfKGKTQMyb+KIOWmr7XG4zG6H0j7STiQVjnVYzfgxnlUg607+Sti02N2dU73SnSn4A7v2Fht+6UjWxMqbugjT+QHH30QV3mwGbjdxwvEqifOGeXK/Khn2FLcW4pbinHce7vOOUBb3+mY/8qRvJHn7aRnHSKTj0jAuE5dRgG1UcYmDVUhyZD+6/Jf7/vF4zHI7B6MA8GSxx9+QeGhSDSrr5rEgBh/0= donaldsebleung@donaldsebleung-MacBookAir"
}

This defines a public key to be imported to AWS so we can connect to our instance later.
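
Rather than pasting the key material into the configuration, the public key can also be read from disk. The following is a sketch under the assumption that your public key lives at ~/.ssh/id_rsa.pub (adjust the path to match your setup); it would replace the block in key-pair.tf:

```hcl
resource "aws_key_pair" "macbook_air_ubuntu" {
  key_name = "macbook-air-ubuntu"

  # Assumed path: change it to wherever your public key actually lives.
  # pathexpand() resolves the leading "~" to your home directory,
  # file() reads the file contents as a string, and trimspace()
  # drops the trailing newline that key files usually end with.
  public_key = trimspace(file(pathexpand("~/.ssh/id_rsa.pub")))
}
```

This keeps the configuration identical for every user of the repository, since each person's own key is picked up from their machine at plan time.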

Check everything is fine and apply:

$ terraform plan
$ terraform apply

Now we are ready to define our instance with the given security group and key pair. We'll use the AMI with ID ami-00056a28d6c5e916b which is an Ubuntu 20.04 LTS AMI. Save the following in a file instance.tf:

resource "aws_instance" "ec2_basic" {
  ami                    = "ami-00056a28d6c5e916b"
  instance_type          = "t2.micro"
  vpc_security_group_ids = [aws_security_group.ec2_basic.id]
  key_name               = aws_key_pair.macbook_air_ubuntu.key_name
}

Notice again that we get the security group ID and key name from resources we defined earlier rather than hardcoding them, which allows the configuration to be reused across different AWS accounts.
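
The AMI ID is the one value in instance.tf that is still hardcoded, and it is only valid in us-east-1. One possible way around this, sketched here as an alternative to the configuration above rather than what the article itself uses, is the aws_ami data source, which looks up an image at plan time. The data source name ubuntu_focal is an arbitrary choice, and the image it finds will likely differ from ami-00056a28d6c5e916b:

```hcl
# Look up the most recent Ubuntu 20.04 LTS (focal) amd64 image
# published by Canonical in the region configured for the provider.
data "aws_ami" "ubuntu_focal" {
  most_recent = true
  owners      = ["099720109477"] # Canonical's AWS account ID

  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-*"]
  }

  filter {
    name   = "virtualization-type"
    values = ["hvm"]
  }
}

resource "aws_instance" "ec2_basic" {
  ami                    = data.aws_ami.ubuntu_focal.id # looked up, not hardcoded
  instance_type          = "t2.micro"
  vpc_security_group_ids = [aws_security_group.ec2_basic.id]
  key_name               = aws_key_pair.macbook_air_ubuntu.key_name
}
```

One trade-off to be aware of: because most_recent = true follows new image releases, a later terraform apply may find a newer AMI and propose replacing the instance.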

Make sure everything is okay and apply:

$ terraform plan
$ terraform apply

The last few lines of output should look like this:

aws_instance.ec2_basic: Creating...
aws_instance.ec2_basic: Still creating... [10s elapsed]
aws_instance.ec2_basic: Still creating... [20s elapsed]
aws_instance.ec2_basic: Still creating... [30s elapsed]
aws_instance.ec2_basic: Still creating... [40s elapsed]
aws_instance.ec2_basic: Still creating... [50s elapsed]
aws_instance.ec2_basic: Creation complete after 55s [id=i-04509d0d13f076b03]

Apply complete! Resources: 1 added, 0 changed, 0 destroyed.

Good, our instance is created. But what is its public IP address? We'll need that to connect to our instance, but the output of terraform apply doesn't tell us what it is.

Of course, we can get information on our instance with the AWS CLI, but can we do it with Terraform alone? The answer is a resounding "yes" - enter outputs!

Save the following in a file output.tf:

output "ec2_basic_public_ip" {
  description = "Public IP of our created instance"
  value       = aws_instance.ec2_basic.public_ip
}

This defines an output value ec2_basic_public_ip that is printed to the console whenever we run terraform apply. The description field is just for documentation purposes. The important field is value, which tells Terraform to report the public IP of our EC2 instance.
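
Output values are not limited to raw attributes; they can be any expression. As a small illustration (the output name ec2_basic_ssh_command is made up for this sketch and is not in the original output.tf), string interpolation can assemble the full SSH command:

```hcl
# Hypothetical extra output: not part of the original output.tf.
output "ec2_basic_ssh_command" {
  description = "Ready-to-paste SSH command for the instance"
  value       = "ssh ubuntu@${aws_instance.ec2_basic.public_ip}"
}
```

Outputs can also be read back after the apply has finished: terraform output -raw ec2_basic_public_ip prints just the value without quotes, which is convenient in shell scripts.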

Apply our configuration. Note that nothing should actually change since Terraform already created all desired resources on the last apply, but we should see the public IP of our instance printed to the console:

$ terraform apply
...
Apply complete! Resources: 0 added, 0 changed, 0 destroyed.

Outputs:

ec2_basic_public_ip = "18.206.126.92"

In my case, the public IP of my instance is 18.206.126.92 according to Terraform. Yours is likely different.

Now SSH to your instance, replacing 18.206.126.92 with the public IP of your instance:

$ ssh ubuntu@18.206.126.92

Congratulations! You've provisioned an EC2 instance plus associated resources declaratively with Terraform, and connected to it successfully.

To clean up, simply run terraform destroy and enter yes when prompted. This is the inverse of apply and deletes all resources defined in the current directory*:

$ terraform destroy

* Not quite. Terraform does not delete the default VPC on destroy. Instead, it simply stops managing it on your behalf.

Contrast this approach with the AWS CLI, where you have to remember what resources you created and delete them one by one. When the number of resources is large, it is easy to forget to delete something, which could incur unexpected costs in the long run.

Now you should understand why a declarative approach is favorable for managing infrastructure. It is these powerful automated tools that enable the practice of DevOps, where everything happens at a rapid pace.

Automatically customize EC2 instance with Terraform by running a script on first boot

Reference: Run commands on your Linux instance at launch

While being able to provision a fresh EC2 instance declaratively with Terraform is useful in its own right, we often want to perform additional setup on the EC2 instance once it is created. Doing this manually can be cumbersome, especially if the setup is complex, e.g. we might want to set up a full-fledged LAMP stack on the EC2 instance in a uniform manner.

Fortunately, we can combine Terraform with Amazon EC2 user data to achieve automated setup of our newly created EC2 instance.

Assuming you're still in the ec2-basic directory, first move up to the parent directory:

$ cd ..

Then make a copy of this directory as ec2-website. We'll see how to specify user data in Terraform to instruct it to automatically execute a bash script on the newly created EC2 instance on first boot, which will install an HTTPS web server:

$ cp -r ec2-basic ec2-website

Enter the ec2-website directory:

$ cd ec2-website

Now modify two files:

  • security-group.tf: we'll add an inbound rule to allow HTTPS traffic from anywhere
  • instance.tf: we'll add a bash script to be executed on first boot as user data

security-group.tf:

resource "aws_security_group" "ec2_basic" {
  name        = "ec2_basic"
  description = "Security group for basic EC2 instance"
  vpc_id      = aws_default_vpc.default.id

  ingress {
    description      = "SSH from anywhere"
    from_port        = 22
    to_port          = 22
    protocol         = "tcp"
    cidr_blocks      = ["0.0.0.0/0"]
  }

  ingress {
    description = "HTTPS from anywhere"
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    from_port        = 0
    to_port          = 0
    protocol         = "-1"
    cidr_blocks      = ["0.0.0.0/0"]
    ipv6_cidr_blocks = ["::/0"]
  }
}

instance.tf:

resource "aws_instance" "ec2_basic" {
  ami                    = "ami-00056a28d6c5e916b"
  instance_type          = "t2.micro"
  vpc_security_group_ids = [aws_security_group.ec2_basic.id]
  key_name               = aws_key_pair.macbook_air_ubuntu.key_name
  user_data = <<EOT
#!/bin/bash

apt-get update
apt-get install -y software-properties-common
yes "" | add-apt-repository ppa:donaldsebleung/misc
apt-get update
apt-get install -y donaldsebleung-com
yes "" | openssl req -x509 -newkey rsa:4096 -keyout key.pem -out cert.pem -sha256 -days 365 -nodes
mv key.pem /etc/donaldsebleung-com
mv cert.pem /etc/donaldsebleung-com
systemctl enable --now donaldsebleung-com.service
EOT
}
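
As the script grows, embedding it in a heredoc can become unwieldy. A possible variation, assuming the same bash script is saved next to the .tf files under the hypothetical name setup.sh, is to load it with the file() function:

```hcl
resource "aws_instance" "ec2_basic" {
  ami                    = "ami-00056a28d6c5e916b"
  instance_type          = "t2.micro"
  vpc_security_group_ids = [aws_security_group.ec2_basic.id]
  key_name               = aws_key_pair.macbook_air_ubuntu.key_name

  # setup.sh is an assumed file name. It would hold the same bash
  # script shown inline above, starting with the #!/bin/bash line.
  # path.module is the directory containing this configuration.
  user_data = file("${path.module}/setup.sh")
}
```

Keeping the script in its own file lets you lint and test it like any other shell script, while the behavior on first boot should be the same as with the inline version.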

Initialize the project just to be safe, then apply the config, entering yes when prompted:

$ terraform init
$ terraform apply

After the apply step completes, wait for a few extra minutes, then visit https://<EC2_INSTANCE_PUBLIC_IP> where <EC2_INSTANCE_PUBLIC_IP> is the public IP of the instance reported by Terraform, ignoring warnings from your browser about a self-signed certificate. You should see the following page:

[Screenshot: My personal website on EC2 instance with Terraform]

Feel free to poke around to learn more about me (shameless promotion here :-P). When you're done, tear down the infrastructure:

$ terraform destroy

Packer

Reference: Packer by HashiCorp

We saw how Terraform can be used to provision an EC2 instance and customize it on first boot by executing a bash script from user data, which saves the administrator a lot of manual work. But consider a scenario where the initial setup takes a long time to run to completion. Now suppose this EC2 instance encounters a kernel panic one day, and no amount of rebooting will fix it. The instance would have to be destroyed and re-created, and the new instance would take a long time to reach a functional state, i.e. whatever services that instance offers would be down for a long time, causing a prolonged negative impact on the business.

Instead of spending time (automatically) customizing each EC2 instance after it has been created, what if we could customize the EC2 instance up front, save that new state and launch all subsequent instances from it? Then, whenever our instance(s) go down for whatever reason, we can launch a new instance with everything already set up and running.

Enter Packer. Packer is another offering by HashiCorp that deals with the automated building of customized images, also released under the MPL 2.0 at the time of writing. With Packer, we can take a base image and bake our customized setup into it to form a new image, which can then be used to provision EC2 instances with Terraform, such that we only have to run the setup once instead of on every instance launch.

Installing Packer

Reference: Download Packer

$ wget https://releases.hashicorp.com/packer/1.7.8/packer_1.7.8_linux_amd64.zip
$ unzip packer_1.7.8_linux_amd64.zip
$ rm packer_1.7.8_linux_amd64.zip
$ sudo mv packer /usr/local/bin

Check the version:

$ packer --version

Print a help message:

$ packer --help

Baking a customized AMI with Packer and using it to launch a customized EC2 instance with Terraform

Reference: Getting Started with AWS

Like Terraform, Packer also uses HCL as its configuration language.

Assuming you're still in the ec2-website directory, first move up to the parent directory:

$ cd ..

Now create a directory ec2-custom-packer and cd into it:

$ mkdir ec2-custom-packer
$ cd ec2-custom-packer

Save the following into a file ec2-custom-packer.pkr.hcl:

packer {
  required_plugins {
    amazon = {
      version = ">= 0.0.2"
      source  = "github.com/hashicorp/amazon"
    }
  }
}

variable "ami_prefix" {
  type    = string
  default = "donaldsebleung-com"
}

locals {
  timestamp = regex_replace(timestamp(), "[- TZ:]", "")
}

source "amazon-ebs" "donaldsebleung-com" {
  ami_name      = "${var.ami_prefix}-${local.timestamp}"
  instance_type = "t2.micro"
  region        = "us-east-1"
  source_ami    = "ami-008569888adb8f3e8"
  ssh_username  = "ubuntu"
}

build {
  sources = ["source.amazon-ebs.donaldsebleung-com"]
  provisioner "shell" {
    inline = [
      "sudo apt-get update",
      "sudo apt-get install -y software-properties-common",
      "yes \"\" | sudo add-apt-repository ppa:donaldsebleung/misc",
      "sudo apt-get update",
      "sudo apt-get install -y donaldsebleung-com",
      "yes \"\" | openssl req -x509 -newkey rsa:4096 -keyout key.pem -out cert.pem -sha256 -days 365 -nodes",
      "sudo mv key.pem /etc/donaldsebleung-com",
      "sudo mv cert.pem /etc/donaldsebleung-com",
      "sudo systemctl enable --now donaldsebleung-com.service"
    ]
  }
}

A brief explanation of the blocks involved:

  • The packer { ... } block specifies the plugin to be installed for creating AWS AMIs, similar to how Terraform requires providers for provisioning AWS infrastructure. Here, we use the amazon plugin from github.com/hashicorp/amazon, i.e. an official plugin provided by HashiCorp
  • We define an ami_prefix variable and generate a timestamp to construct a unique name for our new AMI, since each of our AMIs must have a unique name. You can learn more about timestamp generation through the docs
  • The source { ... } block specifies the base AMI and instance type used to build our new customized AMI, as well as what user to log in as through SSH to run the build script. Note that we don't have to provide our SSH public key since Packer auto-generates (and deletes afterwards) an SSH key pair for this purpose
  • The build { ... } block defines the actual build. The source block above doesn't actually do anything on its own, but is used for the build when referenced. Here, we also define a shell provisioner which specifies the shell commands to run, one by one
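
The source block above pins the base image by ID, which only works in us-east-1 and goes stale as Canonical publishes updates. As a sketch of one alternative (not what the article's template uses), the amazon-ebs builder also accepts a source_ami_filter block that selects the base image by name pattern and owner. The filter below assumes Ubuntu 20.04 is the intended base and may pick a different image than ami-008569888adb8f3e8:

```hcl
source "amazon-ebs" "donaldsebleung-com" {
  ami_name      = "${var.ami_prefix}-${local.timestamp}"
  instance_type = "t2.micro"
  region        = "us-east-1"
  ssh_username  = "ubuntu"

  # Used in place of the hardcoded source_ami: pick the newest
  # Ubuntu 20.04 (focal) amd64 image published by Canonical.
  source_ami_filter {
    filters = {
      name                = "ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-*"
      root-device-type    = "ebs"
      virtualization-type = "hvm"
    }
    most_recent = true
    owners      = ["099720109477"] # Canonical's AWS account ID
  }
}
```

The rest of the template (the packer, variable, locals and build blocks) would stay as shown earlier, since the build block refers to the source only by its name.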

Now initialize the current directory, which installs the required plugin(s):

$ packer init .

Before we perform the build, let's also validate our Packer config:

$ packer validate .
The configuration is valid.

Now let's build our image. This may take around 10 minutes or so:

$ packer build .

If successful, you should see the following output near the end:

==> Wait completed after 9 minutes 48 seconds

==> Builds finished. The artifacts of successful builds are:
--> amazon-ebs.donaldsebleung-com: AMIs were created:
us-east-1: ami-0cf4e6ec23e69dd66

Note down the AMI ID, in my case ami-0cf4e6ec23e69dd66. Also note that unlike Terraform, Packer doesn't manage the AMI once it's built, so you have to delete the AMI yourself once you're done with it.

Move back up to the parent directory:

$ cd ..

Make a copy of ec2-website as ec2-custom-terraform and cd into it:

$ cp -r ec2-website ec2-custom-terraform
$ cd ec2-custom-terraform

Now modify instance.tf by replacing the AMI ID with that of your custom image built with Packer, and remove the user_data.
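
If you would rather not copy the AMI ID by hand after every Packer build, one option is to let Terraform find the image. The sketch below is an alternative to pasting the ID, not a required step. It assumes the AMI was built in the same account and region with the default ami_prefix of donaldsebleung-com, and the data source name custom is arbitrary:

```hcl
# Find the newest AMI owned by this account whose name starts with
# the prefix used in the Packer template ("donaldsebleung-com").
data "aws_ami" "custom" {
  most_recent = true
  owners      = ["self"]

  filter {
    name   = "name"
    values = ["donaldsebleung-com-*"]
  }
}

resource "aws_instance" "ec2_basic" {
  ami                    = data.aws_ami.custom.id # image baked by Packer
  instance_type          = "t2.micro"
  vpc_security_group_ids = [aws_security_group.ec2_basic.id]
  key_name               = aws_key_pair.macbook_air_ubuntu.key_name
}
```

Note that the lookup fails at plan time if no matching AMI exists, so the Packer build has to succeed before this configuration can be applied.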

Initialize and apply:

$ terraform init
$ terraform apply

Once the last step completes, visit https://<EC2_PUBLIC_IP> immediately. The site should already be up and running.

Tear down the infrastructure:

$ terraform destroy

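Finally, unless you intend to keep the custom AMI around, de-register it with the AWS CLI, replacing the image ID as appropriate (the EBS snapshot backing the AMI is not removed by this command and must be deleted separately if you no longer need it):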

$ aws ec2 deregister-image --image-id ami-0cf4e6ec23e69dd66

Conclusion

We learnt:

  • What IaC is, its declarative approach to managing infrastructure and its advantages over traditional techniques
  • What Terraform is and how to use it to automatically set up and tear down AWS infrastructure
  • How to customize EC2 instances with setup scripts through user data, and apply them with Terraform
  • What Packer is, how it addresses the long setup time of EC2 instances customized through Terraform alone, and how to build a custom AMI with Packer so that instances provisioned with Terraform are customized from the get-go

The interested reader is encouraged to consult further resources:

I hope you enjoyed the article :-)

Discussion (1)

Donald Sebastian Leung Author

For a more detailed exposition on IaC, its advantages, drawbacks and considerations, one may wish to consult the article What Is Infrastructure as Code? Examples, Best Practices & Tools by Spacelift.