Introduction
As businesses and applications grow, the demand for computing power fluctuates. Traditionally, managing this involved investing in expensive hardware and requiring manual interventions to scale. However, with cloud computing, you now have the ability to dynamically scale resources based on real-time needs. Auto Scaling ensures that your applications remain available and responsive, optimizing both cost and performance by automatically adjusting compute resources to handle varying traffic loads.
A key feature of AWS’s elasticity is the ability to use Auto Scaling Groups (ASGs) to automatically add or remove compute resources based on demand. This allows your applications to stay highly available even during traffic spikes while also scaling down during periods of low activity to save costs.
In this tutorial, you'll set up the foundation for a scalable architecture by creating a Virtual Private Cloud (VPC) with three public subnets spread across different availability zones (AZs). This ensures high availability by distributing resources across isolated locations, reducing the risk of downtime due to failures in a single zone. After configuring the VPC, you'll introduce an Auto Scaling Group and an Application Load Balancer (ALB) to distribute traffic across multiple EC2 instances, ensuring that your infrastructure dynamically adapts to changes in traffic demand.
By the end of this guide, you will learn how to:
- Create a VPC with public subnets in different availability zones for high availability.
- Launch EC2 instances within those subnets running a web application (Apache web server).
- Set up an Application Load Balancer (ALB) to distribute traffic across instances.
- Configure Auto Scaling Groups (ASG) to automatically scale instances based on CPU utilization.
- Test Auto Scaling by simulating high traffic to observe how your infrastructure adapts.
Requirements
Before getting started, make sure the following prerequisites are in place. First, you'll need an active AWS account, which you can set up by creating a free-tier account. The free tier provides limited, cost-free access to the AWS services used in this guide. It also helps to have a basic understanding of the AWS console so you can follow the configuration steps. Finally, to visualize your infrastructure, you can use draw.io, an easy-to-use online tool for creating architecture diagrams that can help you map out your environment.
Architecture Overview
Before diving into the setup, here’s a high-level diagram of the architecture you’ll be building:
The components of the architecture include:
- A VPC with a CIDR block of 10.0.0.0/16: This provides an isolated network environment for your resources in AWS.
- Three Public Subnets: Each subnet resides in a different availability zone (AZ) to ensure high availability and fault tolerance.
  - Public Subnet 1: 10.0.10.0/24
  - Public Subnet 2: 10.0.20.0/24
  - Public Subnet 3: 10.0.30.0/24
- An Internet Gateway (IGW): Allows instances within the public subnets to access the internet.
- A Route Table: Configures routing for internet-bound traffic from the subnets to the Internet Gateway.
- EC2 Instances: These instances will run an Apache web server, deployed across the public subnets to handle web traffic.
- An Auto Scaling Group (ASG): Automatically adjusts the number of EC2 instances based on traffic demand, using CPU utilization as a scaling metric.
- An Application Load Balancer (ALB): Distributes incoming traffic across the EC2 instances to ensure load balancing and even request distribution.
- A Target Group: Registers the EC2 instances behind the ALB and defines health check parameters, ensuring that only healthy instances receive traffic.
Step 1: Creating the VPC
Accessing the VPC Dashboard
To begin, you’ll create a VPC based on the architecture diagram above.
- Log into AWS and navigate to the VPC Dashboard by selecting Services > VPC.
- On the left-hand menu, click Create VPC.
Step 2: Configure the VPC
You’ll now configure the VPC to form the core network environment for your resources.
- On the Create VPC page, select the VPC and more option.
- Under VPC settings:
  - Name Tag: Enter a name for your VPC (e.g., WebAppVPC).
  - IPv4 CIDR Block: Enter 10.0.0.0/16 to give the VPC 65,536 IP addresses.
  - IPv6 CIDR Block: Leave this option disabled unless IPv6 is required.
  - Tenancy: Select Default unless dedicated hardware is needed.
Step 3: Configure Subnets
Next, you will configure three public subnets, each in a different availability zone for high availability.
- Number of Availability Zones (AZs): Choose 3 to ensure high availability.
- Number of Public Subnets: Enter 3.
- Customize Subnet CIDR Blocks:
  - Public Subnet 1: 10.0.10.0/24 in us-east-1a
  - Public Subnet 2: 10.0.20.0/24 in us-east-1b
  - Public Subnet 3: 10.0.30.0/24 in us-east-1c
These subnets will reside in different availability zones to ensure high availability across your infrastructure.
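The address arithmetic behind these CIDR blocks can be checked with a few lines of shell. The following is a minimal sketch rather than part of the setup: the function names are hypothetical, and the subtraction of five addresses reflects the addresses AWS reserves in every VPC subnet.

```shell
#!/bin/bash
# Illustrative helpers (hypothetical names, not part of any AWS tool).

# Total number of IPv4 addresses in a block with the given prefix length.
addresses_in_prefix() {
  local prefix=$1
  echo $(( 1 << (32 - prefix) ))
}

# Addresses available to your instances in a VPC subnet.
# Assumption: AWS reserves 5 addresses in each subnet.
usable_in_subnet() {
  local prefix=$1
  echo $(( $(addresses_in_prefix "$prefix") - 5 ))
}

# How many subnets of size $2 fit into a VPC block of size $1.
subnets_in_vpc() {
  local vpc_prefix=$1 subnet_prefix=$2
  echo $(( 1 << (subnet_prefix - vpc_prefix) ))
}

echo "VPC /16 total addresses:     $(addresses_in_prefix 16)"
echo "Subnet /24 total addresses:  $(addresses_in_prefix 24)"
echo "Subnet /24 usable addresses: $(usable_in_subnet 24)"
echo "/24 subnets that fit in /16: $(subnets_in_vpc 16 24)"
```

The three /24 subnets used here occupy only a small part of the /16 block, which leaves room to add more subnets later.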
Step 4: Configure Additional Resources
Route Table and Internet Gateway
When you use the VPC wizard (the VPC and more option), AWS automatically creates the route table and the Internet Gateway for your VPC and adds the routes needed for internet-bound traffic. This ensures that instances in the public subnets can communicate with the internet.
Step 5: Review and Launch
After configuring the VPC, subnets, Internet Gateway, and route table, you’re ready to launch the VPC.
- Review all the configurations on the Summary page.
- Click Create VPC to launch the network environment.
Your VPC is now created and ready for use, laying the foundation for a highly available, scalable web application architecture.
Section 2: Deploying an EC2 Instance
With the VPC setup complete, it's time to launch an EC2 instance into one of the public subnets. This instance will serve as a web server for your application, allowing you to test connectivity and functionality within your newly created VPC.
Step 1: Launch an EC2 Instance
To launch an EC2 instance, follow these steps:
1. Navigate to the EC2 Dashboard
- Log into the AWS Management Console.
- Select EC2 from the list of services.
- Click on Launch Instance.
2. Configure the Instance Details
- Name and Tags: Assign a meaningful name to your instance (e.g., WebServer).
- AMI Selection: Choose the Amazon Linux 2023 AMI. Amazon Linux is free-tier eligible and supports package management through yum (on Amazon Linux 2023, yum is an alias for dnf).
- Instance Type: Select t2.micro, which is free-tier eligible and suitable for testing and small applications.
Step 2: Key Pair Setup
A key pair is required to securely access your instance via SSH. If you don’t have an existing key pair, create a new one:
- Key Pair: Choose an existing key pair or create a new one.
- If creating a new key pair:
  - Enter a name for the key pair.
  - Download the .pem file and store it securely. You will need this file to SSH into your EC2 instance later.
Step 3: Configure Network Settings
- VPC: Select the VPC you created in the previous section (e.g., WebAppVPC).
- Subnet: Choose Public Subnet 1 (e.g., 10.0.10.0/24). This ensures that your instance has internet access via the Internet Gateway.
- Auto-assign Public IP: Ensure that this option is enabled. This automatically assigns a public IP to the instance, which is essential for testing your web server over the internet.
Step 4: Configure Security Groups
Security groups control inbound and outbound traffic to your instance. Here's how to configure them:
- Create or Choose a Security Group:
  - Under the Network Settings section, select Create security group.
  - Name it webapp-security-group.
- Inbound Rules:
  - Allow SSH (Port 22): Enables SSH access to your instance.
    - Source: For enhanced security, restrict to your IP address. For learning purposes, you can set it to 0.0.0.0/0 to allow access from anywhere.
  - Allow HTTP (Port 80): Permits web traffic to your Apache server.
    - Source: Set to 0.0.0.0/0 to allow access from the internet.
- Outbound Rules: By default, the security group allows all outbound traffic. No changes are necessary unless specific outbound restrictions are required.
Step 5: User Data Configuration
Add the following User Data script under Advanced Details. This script will automatically install and start the Apache web server when the instance launches:
#!/bin/bash
yum update -y
yum install -y httpd
systemctl start httpd
systemctl enable httpd
echo "<h1>Hello world! This is $(hostname -f)</h1>" > /var/www/html/index.html
This script:
- Updates the instance's packages.
- Installs the Apache web server (httpd).
- Starts the Apache service and enables it to start automatically on boot.
- Creates a simple HTML page that displays the instance's hostname.
Step 6: Launch the Instance
After configuring the above settings, click Launch Instance.
Step 7: Test the EC2 Instance
Once the instance has launched, you can test the setup by selecting the WebServer instance and copying its public IP address.
- Open a web browser and navigate to the instance's public IP over HTTP (only port 80 is open in the security group).
- You should see the message Hello world! This is <hostname> displayed on the webpage, confirming that Apache is running and serving the HTML file created by the User Data script.
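The same check can be scripted instead of using a browser. The sketch below is only an illustration: page_matches is a hypothetical helper name, the sample hostname is made up, and the curl line is left as a comment because it needs your instance's real public IP.

```shell
#!/bin/bash
# Hypothetical helper: succeeds if the HTML read from stdin contains the
# greeting written by the User Data script.
page_matches() {
  grep -q "Hello world! This is "
}

# Usage against a live instance (replace the placeholder with your public IP):
#   curl -s --max-time 5 "http://<Public-IP-of-EC2-Instance>/" | page_matches && echo "Apache OK"

# Offline demonstration with a sample page (the hostname is invented).
sample='<h1>Hello world! This is ip-10-0-10-25.ec2.internal</h1>'
if echo "$sample" | page_matches; then
  echo "Apache OK"
else
  echo "Unexpected page content"
fi
```

If the live check fails, the usual suspects are a missing inbound HTTP rule in the security group or a User Data script that has not finished running yet.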
Summary of What We've Done So Far
At this stage, we've successfully set up a single EC2 instance running an Apache web server within a public subnet of our VPC. By accessing this instance through its public IP, we confirmed that the web server is running and serving content. However, all incoming traffic currently hits just this one instance, meaning that if demand increases beyond its capacity, performance could degrade, or the server might even become unresponsive.
To address this, the next step is to set up Auto Scaling. This will allow us to automatically launch new EC2 instances as demand increases and terminate them when traffic reduces, ensuring that our application remains responsive and cost-efficient. We'll also configure an Application Load Balancer (ALB) to distribute traffic across multiple instances, ensuring even distribution and redundancy.
Section 3: Setting Up Auto Scaling and Load Balancing
Now that you’ve successfully deployed an EC2 instance, the next step is to implement Auto Scaling and an Application Load Balancer (ALB). This will ensure that your architecture dynamically scales in response to traffic demands, maintaining optimal performance and fault tolerance.
We’ll start by creating the Application Load Balancer (ALB), which will distribute incoming traffic across multiple instances. Then, we’ll configure the Auto Scaling Group (ASG) to automatically add or remove instances based on CPU utilization. For this tutorial, we'll use the default region us-east-1.
Creating a Target Group
1. Create a New Target Group
The ALB uses a Target Group to route traffic to specific EC2 instances. Let’s set one up:
- Target Group Name: Enter a name for the target group (e.g., WebApp-TG).
- Target Type: Select Instance.
- Protocol: Choose HTTP.
- Port: Enter 80 (for HTTP traffic).
- VPC: Choose the same VPC (WebAppVPC).
- Health Checks: Leave the default settings. The ALB will regularly check the health of your instances to ensure that only healthy instances receive traffic.
Click Next and proceed to add targets.
2. Register Targets
- Register Existing Instances: Manually register your existing WebServer EC2 instance. The Auto Scaling Group (ASG) will register new instances automatically in the future.
Then continue to review and create the Target Group.
3. Review and Create
- Review your Target Group configuration.
- Click Create Target Group to finalize the setup.
- Now that your Target Group is created, you can proceed to create your load balancer, which will route traffic to the instances in your Target Group.
Setting Up an Application Load Balancer (ALB)
An ALB is essential for distributing traffic across multiple instances, ensuring that no single instance becomes overwhelmed. Here’s how to set it up:
1. Access the EC2 Dashboard
- In the AWS Management Console, go to the EC2 Dashboard.
- On the left-hand side, under Load Balancing, select Load Balancers.
2. Create a New Load Balancer
- Click on Create Load Balancer and select Application Load Balancer.
3. Configure Load Balancer Settings
- Name: Enter a name for your load balancer (e.g., WebApp-ALB).
- Scheme: Choose Internet-facing to make the ALB accessible over the internet.
- IP Address Type: Select IPv4.
- VPC: Choose the VPC you created earlier (e.g., WebAppVPC).
- Availability Zones: Select the availability zones and subnets where your EC2 instances are deployed (in this case, the three public subnets you created: 10.0.10.0/24, 10.0.20.0/24, and 10.0.30.0/24).
4. Configure Security Settings
Since you’re working with HTTP (port 80), there’s no need to configure an SSL certificate. You can skip this section for now. However, in production environments, you should consider enabling SSL for secure connections (HTTPS).
5. Configure Security Groups
- Select or create a security group for your ALB. Ensure that it allows inbound HTTP (port 80) traffic from anywhere (0.0.0.0/0).
6. Configure Listeners and Routing
- For the HTTP listener on port 80, select the Target Group you created (WebApp-TG) as the default action, then click Create load balancer.
Once your ALB is created, it will start listening for HTTP requests and forwarding them to the instances registered in the Target Group. Before setting up the Auto Scaling Group, however, you need to create an AMI for the launch template that the group will use.
Creating an AMI from Your Running EC2 Instance
- Go to EC2 Dashboard: Navigate to Instances and select your running instance.
- Actions: Click on Actions > Image and templates > Create Image.
- Image Name: Provide a name for your AMI (e.g., webapp-server-image).
- Instance Volumes: Review the instance storage configuration.
- Create Image: Click Create Image to start the process.
- Find AMI ID: Once the image is created, go to AMIs in the EC2 Dashboard to find your new AMI ID. You'll use this ID when configuring your launch template.
This AMI will be the base image for your Auto Scaling instances. Now that you have the AMI, you can continue to create the Launch Template for your Auto Scaling Group.
Creating a Launch Template for Auto Scaling
- Go to EC2 Dashboard: Navigate to Launch Templates and click Create Launch Template.
- Launch Template Name: Enter a name (e.g., WebApp-LT).
- AMI: Go to My AMIs and select the AMI you created earlier (webapp-server-image).
- Instance Type: Select an instance type (e.g., t2.micro).
- Key Pair: Choose your existing key pair for SSH access (e.g., webapp-key-pair).
- Security Groups: Attach the security group created earlier for your EC2 instances (webapp-security-group).
- Auto-assign Public IP: Under Advanced Network Configurations, enable public IP assignment.
- User Data: Add the user data for the instances that the Auto Scaling Group will create. In this demo, we reuse the user data from the existing web server, so each new instance regenerates the HTML page with its own hostname.
- Create Template: Click Create Launch Template.
This template will serve as the blueprint for launching instances in your Auto Scaling Group.
Step 2: Configuring Auto Scaling Groups (ASG)
Next, we’ll set up the Auto Scaling Group using our launch template to automatically adjust the number of EC2 instances based on CPU utilization, ensuring the application can handle traffic surges.
1. Navigate to Auto Scaling
- In the AWS Management Console, go to EC2 Dashboard, and on the left-hand menu, under Auto Scaling, click on Auto Scaling Groups.
2. Create Auto Scaling Group
- Click on Create Auto Scaling Group.
3. Configure Auto Scaling Group Settings
- Auto Scaling Group Name: Enter a name for the group (e.g., WebApp-ASG).
- Launch Template: Select the launch template you created earlier (WebApp-LT).
4. Configure Network Settings
- VPC: Select your VPC (WebAppVPC).
- Subnets: Choose the three public subnets you created (10.0.10.0/24, 10.0.20.0/24, and 10.0.30.0/24).
5. Attach the Load Balancer
- Select Attach to an existing load balancer.
- Choose the Application Load Balancer created earlier (WebApp-ALB).
- Select the Target Group (WebApp-TG) created for the ALB.
- Turn on Elastic Load Balancing health checks.
- Under Additional settings, enable default instance warmup and set it to 60 seconds, then click Next to continue.
6. Configure Instance Numbers
- Desired Capacity: Set to 1 (the number of instances the ASG launches initially and tries to maintain until a scaling policy changes it).
- Minimum Capacity: Set to 1 (this means the ASG will never terminate the last running instance).
- Maximum Capacity: Set to 4 (this allows Auto Scaling to scale up to 4 instances if needed).
7. Configure Scaling Policies
Now we’ll configure the scaling policies that will trigger instance creation or termination based on CPU utilization.
- Select Target Tracking Scaling Policy.
  - Policy Type: Choose Target tracking scaling policy.
  - Metric type: Average CPU utilization
  - Target Value: Enter 25 (Auto Scaling tries to keep average CPU utilization close to 25%: it launches instances when utilization rises above the target and terminates them, more gradually, when utilization stays below it).
Leave any other setting at default and continue.
8. Review and Create
- Review the configuration, then click Create Auto Scaling Group.
The ASG will now automatically adjust the number of instances based on CPU utilization and distribute traffic across them using the ALB. We can evaluate this by testing the auto scaling using CPU stress.
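To build intuition for how a 25% CPU target translates into instance counts, the sketch below uses a simplified proportional model. This is an assumption made for illustration, not the algorithm AWS runs: the real service relies on CloudWatch alarms and its own internal logic, and estimate_capacity is a hypothetical name.

```shell
#!/bin/bash
# Simplified model of target tracking (an assumption for illustration only).
# Arguments: current instance count, average CPU %, target CPU %, min, max.
estimate_capacity() {
  local current=$1 cpu=$2 target=$3 min=$4 max=$5
  # Ceiling of (current * cpu / target) using integer arithmetic.
  local desired=$(( (current * cpu + target - 1) / target ))
  # Clamp the result to the ASG's minimum and maximum capacity.
  if (( desired < min )); then desired=$min; fi
  if (( desired > max )); then desired=$max; fi
  echo "$desired"
}

# Values from this tutorial: target 25%, minimum 1, maximum 4.
echo "1 instance at 41% CPU:   $(estimate_capacity 1 41 25 1 4) instance(s)"
echo "2 instances at 100% CPU: $(estimate_capacity 2 100 25 1 4) instance(s)"
echo "4 instances at 10% CPU:  $(estimate_capacity 4 10 25 1 4) instance(s)"
```

The clamp at the end is the part that maps directly onto the settings above: whatever the metric suggests, the group never goes below the minimum of 1 or above the maximum of 4.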
Section 4: Testing Auto Scaling
To verify that your Auto Scaling setup is functioning correctly, you can simulate high traffic or load on the instances to trigger the creation of new instances.
Step 1: Connect to an EC2 Instance
- SSH into your EC2 instance using EC2 Instance Connect or an SSH client with the following command:
ssh -i /path/to/your-key.pem ec2-user@<Public-IP-of-EC2-Instance>
Step 2: Install the Stress Tool
- Once connected, install the stress tool to simulate high CPU load:
sudo yum install -y stress
Step 3: Simulate High CPU Load
- Run the stress command to simulate high CPU load for a set period (e.g., 100 seconds):
stress -c 1 -i 1 -m 1 --vm-bytes 128M -t 100s
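This command starts one CPU worker (-c 1, matching the single vCPU of a t2.micro), one I/O worker (-i 1), and one memory worker (-m 1) that allocates 128 MB (--vm-bytes 128M), and it stops after 100 seconds (-t 100s). As a result, the CPU utilization should exceed the 25% target set in the Auto Scaling Group (ASG), triggering the ASG to launch new instances to handle the increased load. If no new instance appears, rerun the command with a longer timeout, since the scaling policy reacts to utilization sustained over a few minutes rather than to a brief spike.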
Step 4: Verify Auto Scaling
In this section, we will verify that Auto Scaling is functioning as expected by simulating a CPU load, monitoring the Auto Scaling Group (ASG) activity, and checking the Load Balancer's traffic distribution across instances.
1. Monitor Auto Scaling Group Activity
Once you've triggered a high CPU load using the stress tool, Auto Scaling should kick in when the CPU utilization exceeds the threshold of 25%. You can monitor the ASG activity to confirm that new instances are being launched.
- Navigate to Auto Scaling Groups in the AWS Management Console.
- Select your Auto Scaling Group (WebApp-ASG) and go to the Activity tab.
- You should observe scaling activities showing that new instances are being created because the CPU utilization exceeded the threshold.
This activity confirms that Auto Scaling is working as intended, scaling the number of instances up in response to increased CPU usage.
2. Monitor Instance CPU Usage
Next, we can validate the specific CPU spike caused by the stress tool. The CPU usage of the instance will show a spike above 25%, which is what triggered the Auto Scaling.
- Navigate to the EC2 Dashboard and select the instance running the stress test.
- Go to the Monitoring tab and view the CPU utilization metrics.
- You should observe a CPU spike above the 25% target (it reached 40.9% in the test run for this tutorial), which caused Auto Scaling to add instances.
This shows that the instance was overloaded, which triggered the ASG to create additional instances.
3. Verify Load Balancer Traffic Distribution
Once multiple instances are running in the ASG, the Load Balancer should distribute traffic across them. You can confirm this by checking which instance answers each request.
- Open a web browser and navigate to the DNS name of your Application Load Balancer (ALB).
- Refresh the page multiple times and observe the hostname shown on the page, which by default contains the private IP address of the responding EC2 instance.
- You should see the hostname change between two or more instances, confirming that the Load Balancer is properly distributing traffic across multiple servers in the target group.
This demonstrates that the Load Balancer is effectively balancing incoming requests across the instances, ensuring high availability.
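Refreshing the browser can also be replaced with a small script that counts how many responses each instance returned. This is a minimal sketch: tally_backends is a hypothetical helper, the sample hostnames are invented, and the curl loop is shown as a comment because it needs your ALB's DNS name.

```shell
#!/bin/bash
# Hypothetical helper: reads one response body per line on stdin and prints
# how many times each distinct response (that is, each backend) appeared.
tally_backends() {
  sort | uniq -c
}

# Usage against a live load balancer (replace the placeholder with your ALB DNS name):
#   for i in $(seq 1 20); do curl -s "http://<ALB-DNS-name>/"; done | tally_backends

# Offline demonstration with sample responses (hostnames are invented).
printf '%s\n' \
  '<h1>Hello world! This is ip-10-0-10-25.ec2.internal</h1>' \
  '<h1>Hello world! This is ip-10-0-20-17.ec2.internal</h1>' \
  '<h1>Hello world! This is ip-10-0-10-25.ec2.internal</h1>' | tally_backends
```

More than one distinct line in the output means more than one instance is answering requests behind the ALB.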
4. Terminate Instances and Verify Auto Scaling
To further test the ASG's functionality, you can terminate or stop all running instances and observe how the ASG automatically replaces them to maintain the desired number of instances.
- Navigate to the EC2 Dashboard and select all instances.
- Terminate or stop all the instances.
Once terminated, the Auto Scaling Group will detect that no healthy instances are running and automatically launch a new one to restore the desired capacity.
- Go back to the Auto Scaling Group Activity in the EC2 dashboard to confirm that the ASG is performing checks and launching new instances.
This confirms that the ASG is functioning correctly by ensuring that there is always at least one instance running, even if instances are terminated or fail.
This concludes the verification of your Auto Scaling Group, Load Balancer, and EC2 instances. You've successfully demonstrated the dynamic scaling of instances based on CPU usage, proper traffic distribution across multiple instances, and the resilience of your setup in automatically recovering from instance terminations.
Conclusion
In this tutorial, you've successfully set up a highly available and scalable web application architecture on AWS. By creating a Virtual Private Cloud (VPC) with three public subnets across different availability zones, deploying EC2 instances running an Apache web server, and configuring an Application Load Balancer (ALB) and Auto Scaling Group (ASG), you've ensured that your application can dynamically scale to handle varying traffic loads while optimizing costs and maintaining performance.
Key Takeaways
- High Availability: Distributing resources across multiple availability zones ensures that your application remains available even if one zone experiences failures.
- Scalability: Auto Scaling automatically adjusts the number of EC2 instances based on demand, ensuring your application can handle traffic spikes and scale down during low usage periods.
- Load Balancing: The ALB efficiently distributes incoming traffic across multiple instances, preventing any single instance from becoming a bottleneck.
- Cost Optimization: By only using the necessary resources, you optimize costs while maintaining performance and availability.
Important: Clean Up Your Environment
To avoid incurring unnecessary charges, it's essential to clean up the resources you created for this tutorial once you've completed the setup and testing. Follow these steps to terminate resources:
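- Delete Auto Scaling Group (do this first; otherwise the ASG will replace any instances you terminate):
  - Navigate to Auto Scaling Groups in the EC2 Dashboard.
  - Select the ASG (WebApp-ASG).
  - Click Delete. This also terminates the instances that the ASG launched.
- Terminate EC2 Instances:
  - Navigate to the EC2 Dashboard.
  - Select any remaining instances created for this tutorial (such as WebServer).
  - Click Actions > Instance State > Terminate.
- Delete Load Balancer and Target Group:
  - Navigate to Load Balancers.
  - Select the ALB (WebApp-ALB).
  - Click Actions > Delete.
  - Navigate to Target Groups.
  - Select the Target Group (WebApp-TG).
  - Click Actions > Delete.
- Delete Launch Template and AMI:
  - Navigate to Launch Templates, select the template (WebApp-LT), and delete it.
  - Navigate to AMIs, select the image (webapp-server-image), and deregister it. Then delete its associated snapshot under Snapshots.
- Delete VPC:
  - Navigate to the VPC Dashboard.
  - Select the VPC (WebAppVPC).
  - Ensure all associated resources (subnets, gateways, route tables) are deleted.
  - Click Actions > Delete VPC.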
By cleaning up your environment, you ensure that you won't be billed for resources you no longer need, making your learning experience both effective and cost-efficient.
This completes the setup of an Auto Scaling Group with an Application Load Balancer to dynamically handle traffic spikes for a web application.