AWS Auto Scaling is divided into two categories: Fleet Management and Dynamic Scaling.
Fleet Management takes care of:
- Replacing unhealthy instances;
- Distributing instances across Availability Zones to maximize resilience. E.g.: you're running instances in us-east-1, so Auto Scaling can provision instances in AZs such as us-east-1a, us-east-1b, us-east-1c, us-east-1d, and us-east-1e;
Dynamic Scaling takes care of:
- Scaling based on CloudWatch alarm metrics or on a metric type (more on that later): either a single action is taken when a threshold is breached, or different actions are taken depending on how far the CloudWatch alarm threshold is breached.
Dynamic Scaling offers three policy types:
- Simple scaling: scales based on a single CloudWatch alarm and applies the action you define;
- Step scaling: scales based on different breach levels of a CloudWatch alarm metric and applies the action you define for each level;
- Target tracking scaling: scales based on a metric type, but delegates the action to be taken to AWS;
There's no need to pick one over the other. You get Fleet Management out of the box, and you can optionally configure Dynamic Scaling to take custom actions;
To get to the auto-scaling configuration, go to the EC2 dashboard and find Auto Scaling Groups in the sidebar. Select an Auto Scaling group and, in the tabs below the list of groups, find the one called
Fleet management: An application running on an EC2 instance stops responding to health checks. The instance is marked unhealthy and taken out of service so it no longer receives traffic, and Auto Scaling spins up another instance to replace it (by default, the unhealthy instance is terminated);
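
To make the fleet management behaviour more concrete, here is a minimal in-memory sketch of "replace unhealthy instances and keep them spread across AZs". It does not call AWS at all: the `Instance` class, the `launch` and `replace_unhealthy` helpers, the zone list, and the instance ids are all made up for illustration, and the real service is certainly more involved than this.

```python
from collections import Counter
from dataclasses import dataclass
from itertools import count

# Hypothetical AZ names and instance ids, used only for illustration.
ZONES = ["us-east-1a", "us-east-1b", "us-east-1c"]
_ids = count(1)


@dataclass
class Instance:
    instance_id: str
    zone: str
    healthy: bool = True


def launch(fleet):
    """Launch one instance in the zone that currently has the fewest instances."""
    per_zone = Counter({zone: 0 for zone in ZONES})
    per_zone.update(instance.zone for instance in fleet)
    zone = min(ZONES, key=lambda z: per_zone[z])
    instance = Instance(f"i-{next(_ids):04d}", zone)
    fleet.append(instance)
    return instance


def replace_unhealthy(fleet):
    """Remove every unhealthy instance and launch one replacement for each."""
    unhealthy = [instance for instance in fleet if not instance.healthy]
    for instance in unhealthy:
        fleet.remove(instance)
    return [launch(fleet) for _ in unhealthy]


fleet = []
for _ in range(3):
    launch(fleet)  # one instance lands in each zone

fleet[1].healthy = False  # the app on this instance stops answering health checks
replaced = replace_unhealthy(fleet)
print([(i.instance_id, i.zone) for i in fleet])
```

After the replacement the fleet is back to three healthy instances, and the new one lands in the zone that lost its instance, which is the "fleet stays the same size and stays balanced" idea in miniature.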
Dynamic scaling - Simple scaling: You have a CloudWatch alarm that monitors EC2 instances for CPU utilization and fires whenever it goes beyond 80% for 300 seconds (5 min). Your simple scaling policy defines that the action to be taken is to spin up 1 more instance.
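
As a rough sketch, the simple scaling example above could be described by two sets of parameters: one for the scaling policy and one for the CloudWatch alarm that triggers it. The keys follow the `PutScalingPolicy` and `PutMetricAlarm` API parameters, but the group name, policy name, alarm name, and cooldown value are assumptions made up for this example. The code only builds dictionaries and does not talk to AWS.

```python
ASG_NAME = "my-asg"  # hypothetical Auto Scaling group name


def simple_scaling_policy(asg_name):
    """Parameters for a simple scaling policy that adds one instance."""
    return {
        "AutoScalingGroupName": asg_name,
        "PolicyName": "cpu-above-80-add-1",  # hypothetical name
        "PolicyType": "SimpleScaling",
        "AdjustmentType": "ChangeInCapacity",
        "ScalingAdjustment": 1,  # spin up 1 more instance
        "Cooldown": 300,  # assumed cooldown, in seconds
    }


def cpu_alarm(asg_name, policy_arn):
    """Parameters for an alarm on average CPU > 80% over one 300 s period."""
    return {
        "AlarmName": "cpu-above-80",  # hypothetical name
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "AutoScalingGroupName", "Value": asg_name}],
        "Statistic": "Average",
        "Period": 300,
        "EvaluationPeriods": 1,
        "Threshold": 80.0,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [policy_arn],  # the policy to run when the alarm fires
    }


policy = simple_scaling_policy(ASG_NAME)
alarm = cpu_alarm(ASG_NAME, "<policy-arn>")  # placeholder, not a real ARN
print(policy["ScalingAdjustment"], alarm["Threshold"])
```

If you use boto3, dictionaries like these could be passed as keyword arguments to `put_scaling_policy` on the `autoscaling` client and `put_metric_alarm` on the `cloudwatch` client, with the placeholder ARN replaced by the `PolicyARN` returned when the policy is created.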
Dynamic scaling - Step scaling:
You have a CloudWatch alarm that monitors EC2 instances for CPU utilization and fires whenever it goes beyond 50% for 300 seconds (5 min). Your step scaling policy defines the actions to be taken:
- Spin up 1 more instance when CPU utilization is >= 50% and < 60%;
- Spin up 3 more instances when CPU utilization is >= 60% and < 70%;
- Spin up 5 more instances when CPU utilization is >= 70% (no upper bound);
Keep in mind that these adjustments add up: if CPU utilization climbs progressively past 70%, triggering each step along the way, you can end up with up to 9 additional EC2 instances (1 + 3 + 5);
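
The step logic above can be sketched in a few lines of plain Python. The step bounds are written the way the `StepAdjustments` parameter expresses them, as offsets from the alarm threshold (so 0 to 10 means 50% to 60% CPU). The `instances_to_add` helper is a made-up name for this illustration, and the sketch deliberately ignores cooldowns and instance warm-up, which affect how adjustments combine in practice.

```python
ALARM_THRESHOLD = 50.0  # the CloudWatch alarm threshold, in % CPU

# Offsets from the alarm threshold; lower bound inclusive, upper bound exclusive.
STEP_ADJUSTMENTS = [
    {"MetricIntervalLowerBound": 0.0, "MetricIntervalUpperBound": 10.0, "ScalingAdjustment": 1},
    {"MetricIntervalLowerBound": 10.0, "MetricIntervalUpperBound": 20.0, "ScalingAdjustment": 3},
    {"MetricIntervalLowerBound": 20.0, "ScalingAdjustment": 5},  # no upper bound
]


def instances_to_add(cpu_percent):
    """Return how many instances the matching step would add (0 if none matches)."""
    offset = cpu_percent - ALARM_THRESHOLD
    for step in STEP_ADJUSTMENTS:
        lower = step["MetricIntervalLowerBound"]
        upper = step.get("MetricIntervalUpperBound", float("inf"))
        if lower <= offset < upper:
            return step["ScalingAdjustment"]
    return 0


# CPU climbing progressively through every step: 1 + 3 + 5 instances.
total_added = sum(instances_to_add(cpu) for cpu in (55, 65, 75))
print(total_added)
```

Summing the three steps reproduces the 9 extra instances mentioned above, while a reading below the threshold (say 40%) matches no step and adds nothing.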
Dynamic scaling - Target tracking scaling:
You want to keep the CPU utilization of your fleet at 50%, but let AWS handle how many instances should be launched or terminated in order to hold that target.
Good to know: AWS runs its own algorithms to decide how best to scale your EC2 instances out or in based on the incoming demand.
The metric types available for target tracking are:
- Application Load Balancer Request Count Per Target;
- Average CPU Utilization;
- Average Network In (Bytes);
- Average Network Out (Bytes);
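
As an illustration, a target tracking policy for the "keep CPU at 50%" example could be described like this. The four metric identifiers are the predefined metric type names that correspond to the list above; the `target_tracking_policy` helper, the group name, and the generated policy name are assumptions invented for this sketch. It only builds the parameters and does not call AWS.

```python
# Predefined metric type identifiers matching the four metric types listed above.
PREDEFINED_METRICS = {
    "ALBRequestCountPerTarget",
    "ASGAverageCPUUtilization",
    "ASGAverageNetworkIn",
    "ASGAverageNetworkOut",
}


def target_tracking_policy(asg_name, metric_type, target_value, resource_label=None):
    """Build the parameters for a target tracking scaling policy."""
    if metric_type not in PREDEFINED_METRICS:
        raise ValueError(f"unknown predefined metric type: {metric_type}")
    if metric_type == "ALBRequestCountPerTarget" and resource_label is None:
        # This metric needs a label identifying the load balancer target group.
        raise ValueError("ALBRequestCountPerTarget requires a resource_label")

    spec = {"PredefinedMetricType": metric_type}
    if resource_label is not None:
        spec["ResourceLabel"] = resource_label

    return {
        "AutoScalingGroupName": asg_name,
        "PolicyName": f"keep-{metric_type}-at-{target_value:g}",  # hypothetical naming
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            "PredefinedMetricSpecification": spec,
            "TargetValue": float(target_value),
        },
    }


# Keep the fleet's average CPU utilization at 50%; "my-asg" is a made-up name.
policy = target_tracking_policy("my-asg", "ASGAverageCPUUtilization", 50)
print(policy["PolicyName"])
```

Notice that there is no adjustment and no alarm in these parameters: you only state the metric and the target value, and AWS decides how many instances to add or remove.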
Let me know if it was all clear in the comments below or with a reaction.