DEV Community

Andy Tran

Creating an Auto-Scaling Web Server Architecture

Since completing the AWS Cloud Resume Challenge, I've been more curious about Terraform. Today, I'll use Terraform to create an AWS architecture containing public subnets, private subnets, an Application Load Balancer (ALB), and an Auto Scaling Group (ASG) for EC2 instances. The ASG scales instances up or down based on specific CPU usage thresholds.

Automatic scaling like this is crucial for cutting costs: a business pays for extra capacity only while it is actually needed.

To start the project, I created another repository on GitHub and cloned it to my local computer.

I created a main.tf file:

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}
provider "aws" {
  region     = "us-east-1"
}

I made sure to define my environment variables in the .bashrc file.

Run:

  • nano ~/.bashrc

and define your variables (note that bash does not allow spaces around the `=`):

export AWS_ACCESS_KEY_ID="<your aws user access key>"
export AWS_SECRET_ACCESS_KEY="<your aws user secret key>"

After saving the file, it needs to be reloaded for the variables to become accessible in the current session.

To re-load run:

  • source ~/.bashrc

Because ~/.bashrc is sourced automatically by every new interactive bash session, the variables will be available in each new terminal you open.
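The reason this works at all: exported variables are inherited by child processes, and Terraform runs as a child process of your shell. A quick demonstration with hypothetical DEMO_ variables:

```shell
# Exported variables are visible to child processes; plain assignments are not.
DEMO_PLAIN="local only"                      # not exported
export DEMO_EXPORTED="visible to children"   # exported

sh -c 'echo "plain:    [$DEMO_PLAIN]"'       # → plain:    []
sh -c 'echo "exported: [$DEMO_EXPORTED]"'    # → exported: [visible to children]
```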

Defining the variables in the .bashrc script means we can remove these lines from our provider block:

access_key = "AWS_ACCESS_KEY_ID"
secret_key = "AWS_SECRET_ACCESS_KEY"

because Terraform reads your AWS credentials directly from the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables.

To create a VPC, add this to main.tf:

# Create a VPC
resource "aws_vpc" "example" {
  cidr_block = "10.0.0.0/16"
}

After running commands:

  • terraform init
  • terraform apply

I see that Terraform has completed creating my VPC.


I check my console to make sure it was created.


The IDs match up, so Terraform is configured correctly. One thing to note: the name "example" is just Terraform's identifier for the resource. If we want to name the VPC itself, we have to add a Name tag to the resource:

resource "aws_vpc" "example" {
  cidr_block = "10.0.0.0/16"

  tags = {
    Name = "example-vpc"
  }
}

We can see here that we don't have any subnets yet. We want to make three public and three private subnets, one of each per availability zone.

Here is how to implement them:

# Subnets
resource "aws_subnet" "public_1" {
  vpc_id            = aws_vpc.example.id
  cidr_block        = "10.0.1.0/24"
  availability_zone = "us-east-1a"
  map_public_ip_on_launch = true
}

resource "aws_subnet" "public_2" {
  vpc_id            = aws_vpc.example.id
  cidr_block        = "10.0.2.0/24"
  availability_zone = "us-east-1b"
  map_public_ip_on_launch = true
}

resource "aws_subnet" "public_3" {
  vpc_id            = aws_vpc.example.id
  cidr_block        = "10.0.3.0/24"
  availability_zone = "us-east-1c"
  map_public_ip_on_launch = true
}

resource "aws_subnet" "private_1" {
  vpc_id            = aws_vpc.example.id
  cidr_block        = "10.0.4.0/24"
  availability_zone = "us-east-1a"
}

resource "aws_subnet" "private_2" {
  vpc_id            = aws_vpc.example.id
  cidr_block        = "10.0.5.0/24"
  availability_zone = "us-east-1b"
}

resource "aws_subnet" "private_3" {
  vpc_id            = aws_vpc.example.id
  cidr_block        = "10.0.6.0/24"
  availability_zone = "us-east-1c"
}


Spreading subnets across multiple availability zones provides high availability: if EC2 instances in one zone are shut down for any reason, traffic can still be served from another.

Note that the subnets are created in the correct VPC with this line:

vpc_id            = aws_vpc.example.id

"example" is just the resource name we gave our VPC earlier.
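As an aside (not part of this project's config), the six near-identical subnet resources could also be generated with `count` and Terraform's built-in `cidrsubnet()` function. A sketch for the three public subnets, assuming the same VPC and availability zones:

```hcl
# cidrsubnet("10.0.0.0/16", 8, n) yields "10.0.n.0/24", so count.index + 1
# reproduces the 10.0.1.0/24 .. 10.0.3.0/24 blocks used above.
resource "aws_subnet" "public" {
  count                   = 3
  vpc_id                  = aws_vpc.example.id
  cidr_block              = cidrsubnet(aws_vpc.example.cidr_block, 8, count.index + 1)
  availability_zone       = ["us-east-1a", "us-east-1b", "us-east-1c"][count.index]
  map_public_ip_on_launch = true
}
```

Individual subnets are then referenced as `aws_subnet.public[0].id`, `aws_subnet.public[1].id`, and so on.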

Next, I created an internet gateway

# Internet Gateway
resource "aws_internet_gateway" "main" {
  vpc_id = aws_vpc.example.id
}


Next, I create a route table and configure all outbound traffic (0.0.0.0/0) to be directed to the internet gateway that was just created.

# Route Table for Public Subnets
resource "aws_route_table" "public" {
  vpc_id = aws_vpc.example.id

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.main.id
  }
}

# Route Table Associations for Public Subnets
resource "aws_route_table_association" "public_1" {
  subnet_id      = aws_subnet.public_1.id
  route_table_id = aws_route_table.public.id
}

resource "aws_route_table_association" "public_2" {
  subnet_id      = aws_subnet.public_2.id
  route_table_id = aws_route_table.public.id
}

resource "aws_route_table_association" "public_3" {
  subnet_id      = aws_subnet.public_3.id
  route_table_id = aws_route_table.public.id
}


The route table association resources attach the route table to the three public subnets.

So to summarize:

  • An internet gateway was created to connect the VPC to the internet.

  • The route table was created to direct all outbound traffic toward the internet gateway.

  • The aws_route_table_association resources link the public subnets to the route table, ensuring that traffic from instances within those subnets is routed through the internet gateway.

Now, we have to create a security group

# Security Group
resource "aws_security_group" "web" {
  vpc_id = aws_vpc.example.id

  ingress {
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}


The security group is specified as "web", and configured to the "example" vpc.

The ingress rule allows incoming TCP traffic on port 80. The CIDR block is set to "0.0.0.0/0", so incoming HTTP traffic is accepted from anywhere.

The egress rule allows all outbound traffic from the instances associated with this security group. This is a common default setting that permits instances to initiate connections to any destination.
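If you ever want to SSH into these instances (something I struggled with later), the security group would also need a port 22 rule. A hypothetical extra ingress block, with <your-ip> as a placeholder for your own address:

```hcl
# Hypothetical addition inside aws_security_group "web":
# allow SSH only from a single admin address, never 0.0.0.0/0.
ingress {
  from_port   = 22
  to_port     = 22
  protocol    = "tcp"
  cidr_blocks = ["<your-ip>/32"]
}
```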

Next, we specify a user data script:

# EC2 User Data Script
data "template_file" "userdata" {
  template = <<-EOF
              #!/bin/bash
              yum update -y
              yum install -y httpd
              systemctl start httpd
              systemctl enable httpd
              echo "Hello World from $(hostname -f)" > /var/www/html/index.html
            EOF
}

The user data script bootstraps each EC2 instance with the necessary configuration and software when it first starts. In this case, it installs and configures an Apache web server and sets up a simple "Hello World" web page. (Note that the template_file data source comes from the deprecated hashicorp/template provider; since this template has no variables, a plain heredoc string assigned to user_data, or the built-in templatefile() function, works just as well.)

# Launch Configuration
resource "aws_launch_configuration" "web" {
  name          = "web-launch-configuration"
  image_id      = "ami-0b72821e2f351e396" # Amazon Linux 2 AMI
  instance_type = "t2.micro"
  security_groups = [aws_security_group.web.id]

  user_data = data.template_file.userdata.rendered

  lifecycle {
    create_before_destroy = true
  }
}

This Terraform configuration defines an AWS Launch Configuration named "web-launch-configuration" for creating EC2 instances. It uses the Amazon Linux 2 AMI (image_id "ami-0b72821e2f351e396") and sets the instance type to "t2.micro". EC2 instances launched with this configuration use the security group referenced by aws_security_group.web.id, and the user data script defined above is executed on launch to install and start the web server. The lifecycle block ensures that new instances are created before old ones are destroyed during updates, minimizing downtime.
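Worth noting: AWS has deprecated launch configurations in favor of launch templates. The same settings could be expressed as a template, sketched here under the assumption that the rest of the config stays unchanged (the ASG would then use a `launch_template` block instead of `launch_configuration`):

```hcl
resource "aws_launch_template" "web" {
  name_prefix            = "web-"
  image_id               = "ami-0b72821e2f351e396" # Amazon Linux 2 AMI
  instance_type          = "t2.micro"
  vpc_security_group_ids = [aws_security_group.web.id]

  # Launch templates expect user_data to be base64-encoded.
  user_data = base64encode(data.template_file.userdata.rendered)
}
```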

# Auto Scaling Group
resource "aws_autoscaling_group" "web" {
  vpc_zone_identifier = [aws_subnet.private_1.id, aws_subnet.private_2.id, aws_subnet.private_3.id]
  launch_configuration = aws_launch_configuration.web.id
  min_size             = 1
  max_size             = 3
  desired_capacity     = 1

  tag {
    key                 = "Name"
    value               = "web"
    propagate_at_launch = true
  }
}

This Auto Scaling Group specifies that EC2 instances should be launched in the identified three private subnets. It maintains a minimum of 1 instance, scales up to a maximum of 3 instances based on scaling policies, and starts with a desired capacity of 1 instance. The instances are launched using the specified launch configuration.

# Application Load Balancer
resource "aws_lb" "web" {
  name               = "web-alb"
  internal           = false
  load_balancer_type = "application"
  security_groups    = [aws_security_group.web.id]
  subnets            = [aws_subnet.public_1.id, aws_subnet.public_2.id, aws_subnet.public_3.id]
}

resource "aws_lb_target_group" "web" {
  name        = "web-tg"
  port        = 80
  protocol    = "HTTP"
  vpc_id      = aws_vpc.example.id
  target_type = "instance"
}

resource "aws_lb_listener" "web" {
  load_balancer_arn = aws_lb.web.arn
  port              = "80"
  protocol          = "HTTP"

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.web.arn
  }
}

This Terraform configuration sets up an Application Load Balancer (ALB) named "web-alb" that is publicly accessible (internal = false) and uses the specified security group and public subnets. It also creates a target group named "web-tg" to route HTTP traffic on port 80 to instances within the specified VPC, and an ALB listener that listens for HTTP traffic on port 80, forwarding it to the target group. This configuration ensures that incoming HTTP traffic is balanced across the EC2 instances registered in the target group.
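To avoid digging through the console for the ALB's address after deployment, an output can be added to main.tf (a small addition, not in the original config):

```hcl
output "alb_dns_name" {
  description = "Public DNS name of the web ALB"
  value       = aws_lb.web.dns_name
}
```

After `terraform apply`, `terraform output alb_dns_name` prints the hostname to hit with curl or a browser.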

resource "aws_autoscaling_attachment" "asg_attachment" {
  autoscaling_group_name = aws_autoscaling_group.web.name
  lb_target_group_arn   = aws_lb_target_group.web.arn
}

The above resource attaches the ASG to the ALB's target group, so that instances managed by the ASG are automatically registered with (and deregistered from) the load balancer.
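With the target group attached, the ASG can optionally be told to replace instances that fail the ALB's HTTP health check instead of relying only on EC2 status checks. Two assumed additions inside the aws_autoscaling_group "web" block (not in the original config):

```hcl
# Assumed additions to the existing aws_autoscaling_group "web" resource:
health_check_type         = "ELB" # use the target group's HTTP health check
health_check_grace_period = 120   # give user data time to start httpd first
```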


Next are two CloudWatch Alarms. With period = 30 and evaluation_periods = 2, they trigger when average CPU usage stays above 75% or below 20% for two consecutive 30-second periods. (Note: EC2 publishes CPUUtilization every 5 minutes by default, or every 1 minute with detailed monitoring enabled, so a 30-second period is finer than the metric's actual resolution; a 60-second period with detailed monitoring is a more realistic setting.)


# CloudWatch Alarms
resource "aws_cloudwatch_metric_alarm" "high_cpu" {
  alarm_name                = "high-cpu-utilization"
  comparison_operator       = "GreaterThanThreshold"
  evaluation_periods        = "2"
  metric_name               = "CPUUtilization"
  namespace                 = "AWS/EC2"
  period                    = "30"
  statistic                 = "Average"
  threshold                 = "75"
  alarm_actions             = [aws_autoscaling_policy.scale_out.arn]
  dimensions = {
    AutoScalingGroupName = aws_autoscaling_group.web.name
  }
}

resource "aws_cloudwatch_metric_alarm" "low_cpu" {
  alarm_name                = "low-cpu-utilization"
  comparison_operator       = "LessThanThreshold"
  evaluation_periods        = "2"
  metric_name               = "CPUUtilization"
  namespace                 = "AWS/EC2"
  period                    = "30"
  statistic                 = "Average"
  threshold                 = "20"
  alarm_actions             = [aws_autoscaling_policy.scale_in.arn]
  dimensions = {
    AutoScalingGroupName = aws_autoscaling_group.web.name
  }
}

Note the dimensions block at the end of each alarm:

dimensions = {
    AutoScalingGroupName = aws_autoscaling_group.web.name
  }

Here we tell each alarm to monitor the CPU utilization of the instances in this specific ASG.

Notice that we specified alarm_actions here to specific Auto Scaling Policies:

alarm_actions             = [aws_autoscaling_policy.scale_in.arn]

and here

alarm_actions             = [aws_autoscaling_policy.scale_out.arn]

These policies are created below; each one is executed when its associated CloudWatch alarm fires.

# Auto Scaling Policies
resource "aws_autoscaling_policy" "scale_out" {
  name                   = "scale_out"
  scaling_adjustment     = 1
  adjustment_type        = "ChangeInCapacity"
  cooldown               = 30
  autoscaling_group_name = aws_autoscaling_group.web.name
}

resource "aws_autoscaling_policy" "scale_in" {
  name                   = "scale_in"
  scaling_adjustment     = -1
  adjustment_type        = "ChangeInCapacity"
  cooldown               = 30
  autoscaling_group_name = aws_autoscaling_group.web.name
}

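As an alternative to hand-rolled alarms plus simple scaling policies (and not part of this project), a single target-tracking policy lets AWS create and manage the CloudWatch alarms itself. A sketch targeting 50% average CPU across the group:

```hcl
resource "aws_autoscaling_policy" "cpu_target" {
  name                   = "cpu-target-tracking"
  policy_type            = "TargetTrackingScaling"
  autoscaling_group_name = aws_autoscaling_group.web.name

  target_tracking_configuration {
    predefined_metric_specification {
      predefined_metric_type = "ASGAverageCPUUtilization"
    }
    target_value = 50.0
  }
}
```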

Launching

To launch we perform:

  • terraform init
  • terraform plan
  • terraform apply

Now checking the VPC, we see that it has the public and private subnets with the route tables.


Navigating to EC2, we see that the ASG is correctly configured


And an EC2 instance is live


Testing

I edited the EC2 user data script to install "stress" so that once an instance launches, it automatically drives up CPU usage for a minute and then stops, letting me test the ASG.

# EC2 User Data Script
data "template_file" "userdata" {
  template = <<-EOF
              #!/bin/bash
              yum update -y
              # EPEL via amazon-linux-extras ("yum install epel-release" fails on Amazon Linux 2)
              amazon-linux-extras install epel -y
              yum install -y stress
              yum install -y httpd
              systemctl start httpd
              systemctl enable httpd
              systemctl enable amazon-ssm-agent
              systemctl start amazon-ssm-agent
              echo "Hello World from $(hostname -f)" > /var/www/html/index.html
              # Run stress for 1 minute to simulate high CPU usage
              stress --cpu 1 --timeout 60
            EOF
}

Another way to do this is to SSH directly into an EC2 instance and run stress manually. To do that, we would first have to give the instances in the private subnets outbound access to the internet.

# Elastic IP for NAT Gateway
resource "aws_eip" "nat_eip" {
  domain = "vpc" # "vpc = true" is deprecated in AWS provider v5
}

# NAT Gateway in Public Subnet
resource "aws_nat_gateway" "nat_gw" {
  allocation_id = aws_eip.nat_eip.id
  subnet_id     = aws_subnet.public_1.id
}

# Route Table for Private Subnets
resource "aws_route_table" "private" {
  vpc_id = aws_vpc.example.id

  route {
    cidr_block     = "0.0.0.0/0"
    # NAT gateways use nat_gateway_id, not gateway_id (which is for IGWs)
    nat_gateway_id = aws_nat_gateway.nat_gw.id
  }
}

# Route Table Associations for Private Subnets
resource "aws_route_table_association" "private_1" {
  subnet_id      = aws_subnet.private_1.id
  route_table_id = aws_route_table.private.id
}

resource "aws_route_table_association" "private_2" {
  subnet_id      = aws_subnet.private_2.id
  route_table_id = aws_route_table.private.id
}

resource "aws_route_table_association" "private_3" {
  subnet_id      = aws_subnet.private_3.id
  route_table_id = aws_route_table.private.id
}


By adding a NAT Gateway and updating the route table for the private subnets, we enable instances in the private subnets to access the internet for outbound traffic while remaining protected from inbound internet traffic.

Now, running terraform apply again will update our resources.

Monitoring the CloudWatch alarms, we see that CPU usage shoots up right away, triggering the "high-cpu-utilization" alarm because of the stress command in the user data script.


And here we see that a second EC2 instance is created by the ASG


Once the stress command times out, the CPU usage drops below 20% and triggers the "low-cpu-utilization" alarm.


The ASG then terminates the EC2 instance in us-east-1c, leaving only the instance in us-east-1a.


And that's it for this project! We successfully used Terraform to create an entire auto-scaling web server architecture on AWS and test it ourselves.

Here is the GitHub repo if you want to try it out for yourself.

Note: one thing I wasn't able to do yet was SSH into the EC2 instances to test them manually; my connections kept timing out. (Likely because the security group only opens port 80 and the instances sit in private subnets without public IPs.) This is why I scripted the instances to run "stress" automatically on creation.
