Originally posted at: https://www.sensedeep.com/blog/posts/stories/lower-cloud-costs-with-spot-instances.html
Lower cloud costs with spots. Not the Tour de France jersey kind of spots; I mean AWS EC2 Spot instances. Learn how to use Spot instances to save up to 80% on your cloud spend.
Compute Spot servers are typically up to 80% cheaper than regular On-Demand servers, but they come with limitations such as termination with little warning and restrictions on how they are used in Auto Scale Groups. So while Spot instances have great promise, they are not as widely used for general computing as they could otherwise be.
Amazon EC2 Spot Instances offer spare compute capacity in the cloud at steep discounts compared to On-Demand instances, often up to 80%. This allows AWS to sell its excess compute capacity, but with a major restriction: Spot Instances can be terminated with only 2 minutes of warning when AWS needs the capacity for On-Demand servers.
There are also operational restrictions for Spot instances. One of the most prevalent ways to provide scale and high availability is to use AWS Auto Scale groups to organize EC2 instances behind a load balancer. However, with both AWS Spot Fleet and AWS Auto Scale groups, you have significant restrictions.
With Auto Scale groups, you can populate the group with Spot instances but you cannot use both Spot and On-Demand instances in a single Auto Scale group. With Spot Fleets you can use On-Demand servers, but you cannot dynamically replace your Spot servers with On-Demand servers if AWS reclaims your Spot instances. There are partial workarounds, but they require extra configuration and complexity. With the standard offerings, using Spot instances comes with significant high availability compromises.
PowerDown helps overcome these restrictions and enables you to use Spot instances for production, development and test workloads. PowerDown allows you to mix Spot and On-Demand instances in a single Auto Scale group without compromising availability.
The PowerDown Spot Optimizer extends the AWS Spot facility by directly managing the lifecycle of Spot instances. Users define a desired number of Spot and On-Demand instances for their Auto Scale groups and PowerDown then monitors the group and ensures the group composition matches the desired state. PowerDown will create Spot instances and migrate workloads from On-Demand instances without downtime or impacting availability. If AWS warns that it will soon reclaim your Spot servers, PowerDown proactively launches replacement On-Demand servers before the Spot servers are terminated.
To configure an Auto Scale group to use Spot instances, select the Auto Scale group to edit from the Cloud Visualizer.
The resource edit page then displays a Spot Optimization panel to define the desired Spot configuration.
You can specify either the desired Spot capacity or the desired On-Demand capacity. If you specify the Spot capacity, On-Demand servers will be used for the remainder up to the Auto Scale Group desired number of instances and vice-versa. You can specify the desired number or percentage of servers. Using a percentage is preferred as it will scale automatically if the total desired number of instances is changed.
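As a rough illustration of how a count or a percentage might translate into a group composition, here is a minimal sketch. The function name `split_capacity`, the string format of the setting, and the choice to round percentages down are all assumptions made for this example; they are not taken from PowerDown itself.

```python
import math
from typing import Tuple


def split_capacity(desired: int, spot: str) -> Tuple[int, int]:
    """Return (spot_count, on_demand_count) for a group of `desired` instances.

    `spot` is either an absolute count such as "3" or a percentage such as "75%".
    """
    if spot.endswith("%"):
        # Assumption: round down, which errs toward keeping On-Demand capacity.
        spot_count = math.floor(desired * float(spot[:-1]) / 100)
    else:
        spot_count = int(spot)
    # Never ask for more Spot instances than the group holds, nor fewer than zero.
    spot_count = max(0, min(spot_count, desired))
    # On-Demand instances make up the remainder.
    return spot_count, desired - spot_count
```

With a percentage, the split follows the group size: `split_capacity(4, "75%")` and `split_capacity(8, "75%")` keep the same ratio, whereas a fixed count such as `"3"` stays at three Spot instances regardless of how large the group grows.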
Before terminating an instance, it may be necessary to complete (drain) current requests first. The Drain Delay field is used to specify a time in seconds to wait before terminating an instance.
PowerDown schedules provide the ability to power down idle or unused cloud resources according to a dynamic schedule. As resources are billed by the second, powering down idle resources can significantly lower your cloud costs.
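To put a number on that, the following sketch compares an always-on group with one that only runs during scheduled hours. The hourly rate and the schedule are hypothetical figures chosen for illustration, not AWS prices.

```python
HOURS_PER_WEEK = 7 * 24  # 168


def weekly_saving(hourly_rate: float, instances: int, powered_up_hours: float) -> float:
    """Cost avoided per week by running only `powered_up_hours` instead of 24x7."""
    powered_down_hours = HOURS_PER_WEEK - powered_up_hours
    return hourly_rate * instances * powered_down_hours


# Hypothetical: 4 instances at $0.10/hour, powered up 10 hours a day, 5 days a week.
saving = weekly_saving(0.10, 4, 50)
fraction = (HOURS_PER_WEEK - 50) / HOURS_PER_WEEK
```

In this example the group is powered down for 118 of the 168 hours in a week, so roughly 70% of its always-on cost is avoided before any Spot discount is applied.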
When powering down an Auto Scale group, you can specify how many servers to run for the powered up and powered down states. This includes the number or percentage of Spot or On-Demand servers to use. This is useful when you wish to maintain some level of service when "powered down".
PowerDown employs two overriding algorithms:
- Failsafe
- Make then Break
The "Failsafe" strategy means that if anything goes wrong when allocating a Spot instance, the recovery path is for the Auto Scale group to continue normal operation by allocating an On-Demand instance. For example: if there are insufficient Spot instances available in the Spot market, the Auto Scale group will simply populate the group with On-Demand instances. When the market recovers and Spot instances are available again, PowerDown will resume migrating workloads onto the Spot instances.
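The idea can be sketched in a few lines. This is only an illustration of the failsafe pattern, not PowerDown's code: `migrate_or_keep` and the `launch_spot` callback are hypothetical names, and a real launcher would call the EC2 API.

```python
from typing import Callable, Tuple


def migrate_or_keep(on_demand_id: str, launch_spot: Callable[[], str]) -> Tuple[str, bool]:
    """Try to obtain a Spot replacement for one On-Demand instance.

    Returns (instance_id_to_serve, migrated). If anything goes wrong while
    allocating the Spot instance, the On-Demand instance simply stays in place.
    """
    try:
        spot_id = launch_spot()
    except Exception:
        # Failsafe: keep the On-Demand instance and retry on a later pass.
        return on_demand_id, False
    return spot_id, True


def no_capacity() -> str:
    # Stand-in for a Spot request that fails because the market has no capacity.
    raise RuntimeError("insufficient Spot capacity")


kept = migrate_or_keep("i-ondemand", no_capacity)
moved = migrate_or_keep("i-ondemand", lambda: "i-spot")
```

The important property is that the failure path does nothing destructive: the group keeps running on On-Demand capacity until a later attempt succeeds.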
The "Make then Break" strategy means that PowerDown will create (make) a new Spot instance, wait for it to be healthy and add it to a load balancer, before detaching and terminating (breaking) the existing On-Demand instance. In this manner, availability and redundancy are fully maintained during the migration.
Using both strategies, PowerDown ensures the desired number and health of instances are maintained. Throughout, you have the confidence that the price you pay for instances will never be more than the On-Demand price and will typically be a steep discount (up to 80%) off the On-Demand price.
In reality, the Spot markets are now quite stable and you often see Spot instances lasting several weeks at a time without termination by AWS.
The Spot optimizer algorithm uses the following steps:
- Check the Auto Scale group's current number of instances against the desired number.
- If more Spot instances are desired, identify an On-Demand instance to replace.
- Boost the Auto Scale group maximum parameter temporarily if required.
- Spin up a Spot instance using the group's Launch Template or Launch Configuration and the configuration of the On-Demand instance.
- Attach the Spot instance to the Auto Scale group.
- Detach the On-Demand instance and wait for current requests to drain.
- Terminate the On-Demand instance.
- Readjust the Auto Scale group maximum parameter if required.
- Repeat after a 2 minute delay.
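The steps above can be sketched as a single pass over an in-memory model of the group. This is a simplified simulation under stated assumptions, not PowerDown's implementation: the `Instance` and `Group` classes and `optimize_once` are hypothetical, health checks and draining are reduced to comments, and a real version would issue EC2 and Auto Scaling API calls at each step.

```python
from dataclasses import dataclass
from itertools import count
from typing import List


@dataclass
class Instance:
    id: str
    lifecycle: str  # "spot" or "on-demand"


@dataclass
class Group:
    instances: List[Instance]
    max_size: int
    desired_spot: int


_ids = count(1)  # generates fake instance ids for the simulation


def optimize_once(group: Group) -> bool:
    """Run one make-then-break pass. Returns True if an instance was migrated."""
    spot = [i for i in group.instances if i.lifecycle == "spot"]
    on_demand = [i for i in group.instances if i.lifecycle == "on-demand"]
    # Check the current composition against the desired number of Spot instances.
    if len(spot) >= group.desired_spot or not on_demand:
        return False
    victim = on_demand[0]  # identify an On-Demand instance to replace
    original_max = group.max_size
    # Temporarily boost the maximum so the group can hold one extra instance.
    if len(group.instances) + 1 > group.max_size:
        group.max_size = len(group.instances) + 1
    # Make: launch the Spot instance and attach it (assumed healthy here).
    group.instances.append(Instance(f"spot-{next(_ids)}", "spot"))
    # Break: detach the On-Demand instance, let requests drain, then terminate.
    group.instances.remove(victim)
    # Readjust the maximum to its original value.
    group.max_size = original_max
    return True


group = Group([Instance(f"od-{n}", "on-demand") for n in range(3)], max_size=3, desired_spot=2)
results = [optimize_once(group) for _ in range(3)]  # a real loop would wait between passes
```

Each pass migrates at most one instance, and the group never drops below its original size because the replacement is attached before the On-Demand instance is removed.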
When AWS needs more servers for On-Demand customers, it may reclaim Spot servers to meet the need. Spot servers will be given a 2 minute warning so they can save state before they are terminated. PowerDown listens for this warning and proactively creates and attaches a replacement On-Demand server so that it will be ready and serving requests before the original Spot server is terminated. PowerDown does this to ensure availability during the transition.
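One way to listen for the warning is through the event that AWS publishes on Amazon EventBridge with the detail type "EC2 Spot Instance Interruption Warning". The article does not say which mechanism PowerDown uses, so treat the following as a sketch of one possible approach; the `handle` function and the `launch_on_demand` callback are hypothetical.

```python
from typing import Callable, Optional

WARNING_TYPE = "EC2 Spot Instance Interruption Warning"


def interrupted_instance(event: dict) -> Optional[str]:
    """Return the instance id from a Spot interruption warning, else None."""
    if event.get("source") != "aws.ec2" or event.get("detail-type") != WARNING_TYPE:
        return None
    return event.get("detail", {}).get("instance-id")


def handle(event: dict, launch_on_demand: Callable[[str], str]) -> Optional[str]:
    """Launch an On-Demand replacement as soon as a warning arrives."""
    instance_id = interrupted_instance(event)
    if instance_id is None:
        return None  # not a Spot interruption warning; nothing to do
    # Hypothetical callback: start and attach a replacement before the
    # 2 minute window expires.
    return launch_on_demand(instance_id)


sample = {
    "source": "aws.ec2",
    "detail-type": WARNING_TYPE,
    "detail": {"instance-id": "i-1234567890abcdef0", "instance-action": "terminate"},
}
replacement = handle(sample, lambda spot_id: "replacement-for-" + spot_id)
```

Because the warning window is short, the handler does as little as possible before starting the replacement; draining and cleanup of the Spot instance can follow afterwards.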
The PowerDown Spot Optimizer enables you to effectively and easily use Spot instances for your production workloads and reduce your EC2 compute costs by up to 80%. Coupled with the ability to schedule powering down non-production loads when they are not required, you can achieve big reductions in your cloud spend.