Yevhen Bondar for Daiquiri Team

Deploying Django Application on AWS with Terraform. ECS Autoscaling

This is the 7th part of the "Deploying Django Application on AWS with Terraform" guide. You can check out the previous steps here:

In this part, we'll make our Django web application scalable using ECS Autoscaling.

Autoscaling is the ability to automatically increase or decrease the number of running instances. It lets you handle traffic spikes and save money during periods of low load.

When you enable autoscaling for an ECS service, AWS creates CloudWatch alarms that determine whether to add a new instance or remove a redundant one.
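Conceptually, target tracking is proportional scaling: the service scales the task count by the ratio of the observed metric to the target, clamped to the configured bounds. A minimal sketch of that idea (a simplified model, not AWS's exact algorithm):

```python
import math

def desired_task_count(current_tasks: int, current_cpu: float,
                       target_cpu: float,
                       min_capacity: int = 1, max_capacity: int = 5) -> int:
    """Approximate target tracking: scale the task count proportionally to
    the ratio of the observed metric to the target, then clamp the result
    to the configured min/max capacity."""
    desired = math.ceil(current_tasks * current_cpu / target_cpu)
    return max(min_capacity, min(max_capacity, desired))

# Two tasks running hot at 100% CPU against an 80% target:
# ceil(2 * 100 / 80) = 3 tasks
print(desired_task_count(2, 100.0, 80.0))  # 3
```

With the load gone, the same formula shrinks the fleet again, which is exactly the behavior we'll observe in the stress test below.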

Let's see how it works in practice.

ECS Autoscaling configuration

First, create a new file autoscale.tf with the following content:

resource "aws_appautoscaling_target" "prod_backend_web" {
  max_capacity       = 5
  min_capacity       = 1
  resource_id        = "service/${aws_ecs_cluster.prod.name}/${aws_ecs_service.prod_backend_web.name}"
  scalable_dimension = "ecs:service:DesiredCount"
  service_namespace  = "ecs"
}

resource "aws_appautoscaling_policy" "prod_backend_web_cpu" {
  name               = "prod-backend-web-cpu"
  policy_type        = "TargetTrackingScaling"
  resource_id        = aws_appautoscaling_target.prod_backend_web.resource_id
  scalable_dimension = aws_appautoscaling_target.prod_backend_web.scalable_dimension
  service_namespace  = aws_appautoscaling_target.prod_backend_web.service_namespace

  target_tracking_scaling_policy_configuration {
    predefined_metric_specification {
      predefined_metric_type = "ECSServiceAverageCPUUtilization"
    }
    target_value = 80
  }

  depends_on = [aws_appautoscaling_target.prod_backend_web]
}

Here we defined:

- an aws_appautoscaling_target that lets Application Auto Scaling change the DesiredCount of the prod-backend-web service between 1 and 5 tasks;
- an aws_appautoscaling_policy of type TargetTrackingScaling that keeps the average CPU utilization of the service around 80%.

We are ready to apply changes, but first, let's think about load balancer health checks.

Load Balancer Health Checks

Right now we have quite aggressive health checks: if a container fails to respond twice with a timeout of 2 seconds, the Load Balancer considers it unhealthy and removes it.

That may be fine at low traffic volumes. But if many requests reach the container and CPU usage climbs to 100%, the container will fail to respond to health checks. The Load Balancer will then kill it, and we will face an even worse situation: there will be no containers left to handle the traffic at all.

A possible solution is to increase the health check timeout and unhealthy_threshold. That gives overloaded containers a better chance to survive.

I think it's not a perfect solution, but it will work for this test. If you know a more elegant way to keep overloaded containers running, feel free to leave a comment.

Go to load_balancer.tf and increase the unhealthy_threshold, timeout, and interval parameters. Note that AWS requires the health check timeout to be smaller than the interval, hence the 29 and 30 seconds below.

# Target group for backend web application
resource "aws_lb_target_group" "prod_backend" {
  ...

  health_check {
    ...
    unhealthy_threshold = 5
    timeout             = 29
    interval            = 30
    ...
  }
}

Let's apply our changes with terraform apply and check them in the AWS console.

CloudWatch Alarms

First, go to the ECS console and check the autoscaling policy for the prod-backend-web ECS service. Select the prod ECS cluster, select the prod-backend-web service, and click "Update". Go to the "Set Auto Scaling" step and click on the prod-backend-web-cpu autoscaling policy.

Autoscaling Policy

Here we see that autoscaling kicks in when average CPU utilization reaches 80%. But what is the condition for scaling down? Let's check the CloudWatch alarms associated with this autoscaling policy.

Go to the CloudWatch console and look at the alarms.

Cloudwatch Alarms

Here we see that we scale up when the average CPU load exceeds 80% for 3 consecutive minutes, and we scale down when it stays below 72% for 15 minutes.
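Where does 72 come from? In my setups the scale-in alarm threshold lands at about 90% of the target value. A quick sanity check of that relationship (an observation of how AWS generates the alarms, not a documented guarantee):

```python
TARGET_CPU = 80.0       # target_value from the scaling policy
SCALE_IN_FACTOR = 0.9   # observed: AlarmLow sits ~10% below the target

alarm_high = TARGET_CPU                    # scale out above this (3 minutes)
alarm_low = TARGET_CPU * SCALE_IN_FACTOR   # scale in below this (15 minutes)

print(alarm_high, alarm_low)  # 80.0 72.0
```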

These are quite specific numbers; how can we adjust them to our needs? One option is to replace the predefined metric with a custom one via the customized_metric_specification block in aws_appautoscaling_policy.
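For example, a policy tracking a custom CloudWatch metric could look roughly like this (a sketch only: the metric name, namespace, and target value here are placeholders, not resources from this project):

```hcl
resource "aws_appautoscaling_policy" "prod_backend_web_custom" {
  name               = "prod-backend-web-custom"
  policy_type        = "TargetTrackingScaling"
  resource_id        = aws_appautoscaling_target.prod_backend_web.resource_id
  scalable_dimension = aws_appautoscaling_target.prod_backend_web.scalable_dimension
  service_namespace  = aws_appautoscaling_target.prod_backend_web.service_namespace

  target_tracking_scaling_policy_configuration {
    customized_metric_specification {
      metric_name = "RequestsPerTask" # placeholder custom metric
      namespace   = "MyApp"           # placeholder namespace
      statistic   = "Average"

      dimensions {
        name  = "ServiceName"
        value = aws_ecs_service.prod_backend_web.name
      }
    }
    target_value       = 1000 # placeholder target
    scale_in_cooldown  = 300  # seconds to wait between scale-in steps
    scale_out_cooldown = 60   # seconds to wait between scale-out steps
  }
}
```

You would still need to publish that metric to CloudWatch yourself, e.g. from the application or a sidecar.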

Also, you can change the AlarmHigh and AlarmLow alarms manually in the console. It's not the preferred way to build a repeatable setup, but it's okay for our test. So, I'll change the AlarmLow threshold to 50% and the period to 10 minutes.

alarm low updating

Stress Testing

Let's move on to the tests. I'll use ApacheBench (ab) for stress testing. This tool can send a lot of requests to our service, driving the CPU load up.

First, ensure that the web service currently has only one container running.

only one web container

Also, increase the limit of open files with ulimit -n 10000; otherwise ab may run out of file descriptors at 1000 concurrent connections.

Now we are ready to run the benchmark. We'll use the health-check URL for this test:

$ ab -n 100000 -c 1000 https://api.example53.xyz/health/

Here -c 1000 is the number of concurrent requests and -n 100000 is the total number of requests.

Watch the CloudWatch metrics and the ECS service over the next 10-15 minutes.

CPU Burst

First, you should see a CPU spike in the charts. After 3 minutes, ECS autoscaling starts to spawn new instances.

Then the average CPU drops below 80%. There were 3 ECS tasks running at that point.

After some time, the CPU load exceeds 80% again, and ECS autoscaling creates a 4th instance. You can see them in the ECS console.

ecs web scale up

So, scale-up works; let's check scale-down. Stop ApacheBench and wait 10-15 minutes for the service to scale down.

You'll see the CPU load drop to zero and ECS scale the web service back down to 1 instance.

cloudwatch cpu goes down

Recheck the ECS console to ensure that only one web task is running:

ecs web post test

So, scale-down works too. Let's commit and push our changes to the infrastructure repository.

The end

Congratulations! In this part, we added ECS autoscaling for the web service. We increased the health check timeout and interval to prevent killing overloaded containers. Then we ran a stress test and verified that the number of instances increases when the CPU load goes up and decreases when it goes down.

You can find the source code of backend and infrastructure projects here and here.

If you need technical consulting on your project, check out our website or connect with me directly on LinkedIn.
