Michael O'Brien

Posted on Apr 18, 2018 • Edited on Dec 31, 2024 • Originally published at sensedeep.com

How to lower AWS cloud costs: a checklist

#aws #webdev #cloud #devops

Originally posted at: https://www.sensedeep.com/blog/posts/stories/checklist-to-lower-aws-cloud-costs.html

For many companies cloud computing is transformational. The advantages are compelling: improved flexibility, increased responsiveness, and let’s not forget, reduced capital expenditure.

However, the ease and speed of creating servers, databases, load balancers and containers in the cloud often lead to a loss of control and increased costs, sometimes with rude sticker shock.

This checklist is a simple set of items to help reduce your cloud bill.

Please comment if you have anything I can add to the list.

Take Control — only run what you need

[ ] It is easy to start things in the cloud and then lose track. Monitor the number of resources you have running by type and track back to the owning team. Use resource tags liberally to categorize. HowTo.

[ ] Periodically audit your resources. Take inventory and check if you need all the resources and services you are running or have created. This includes: instances, RDS databases, ELBs, snapshots, ECS tasks, VPCs, security groups, etc.

[ ] Enable billing alerts at 25%, 50% and 75% of your expected monthly budget. That way you’ll quickly be alerted when something gets out of control. HowTo.

[ ] Run AWS Trusted Advisor regularly (perhaps quarterly) for excess capacity and security issues. HowTo.
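As a rough sketch of the first and third items above, the snippet below counts resources by owning team and type, and reports which budget alert levels have been reached. The record layout, the `team` tag name and the dollar figures are all hypothetical; in practice the records would come from your own inventory export rather than this hand-written list.

```python
from collections import Counter

def inventory_by_owner(resources, owner_tag="team"):
    """Count resources per (owner, type); anything lacking the tag is UNTAGGED."""
    counts = Counter()
    for res in resources:
        owner = res.get("tags", {}).get(owner_tag, "UNTAGGED")
        counts[(owner, res["type"])] += 1
    return counts

def crossed_thresholds(spend_to_date, monthly_budget, thresholds=(0.25, 0.50, 0.75)):
    """Return the budget fractions that month-to-date spend has reached."""
    if monthly_budget <= 0:
        raise ValueError("monthly_budget must be positive")
    used = spend_to_date / monthly_budget
    return [t for t in thresholds if used >= t]

# Hypothetical inventory records (assumed layout, not an AWS response format).
resources = [
    {"type": "ec2", "tags": {"team": "web"}},
    {"type": "ec2", "tags": {"team": "web"}},
    {"type": "rds", "tags": {"team": "data"}},
    {"type": "elb", "tags": {}},  # no owner tag: shows up as UNTAGGED
]

for (owner, kind), count in sorted(inventory_by_owner(resources).items()):
    print(owner, kind, count)

# Hypothetical figures: $1,000 monthly budget, $620 spent so far.
print(crossed_thresholds(620.0, 1000.0))
```

The UNTAGGED bucket is usually the most useful line of the report, since those are the resources nobody can be asked about.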

Pick the right region

[ ] AWS prices vary considerably across regions. For example: on-demand M4.large is $73/month in us-east-1 and $91/month in ap-southeast-2. Choose the cheapest region that is still close enough to your customers. HowTo.
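To put the regional difference in perspective, here is a small sketch that expresses each region's price as a percentage premium over a baseline region. The two prices are the monthly figures quoted above; the function name is just for illustration.

```python
# Monthly on-demand prices for the same instance type, as quoted in the checklist.
PRICES = {"us-east-1": 73.0, "ap-southeast-2": 91.0}

def region_premium(prices, baseline):
    """Percent extra each region costs relative to the baseline region."""
    base = prices[baseline]
    return {
        region: round((price - base) / base * 100, 1)
        for region, price in prices.items()
    }

premiums = region_premium(PRICES, "us-east-1")
for region, pct in premiums.items():
    print(f"{region}: +{pct}%")
```

With these figures the more expensive region costs roughly a quarter more per instance, which adds up quickly across a fleet.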

Choose the right instance type

[ ] Choose and re-evaluate the instance type for each application. Instance types vary in price by orders of magnitude. Choose carefully. Monitor your application performance by CPU, memory and disk to spot excess capacity and the opportunity to downsize the instance type. HowTo.

[ ] Migrate to the newer instance types. AWS sometimes encourages movement to newer instance types by price. For example: M5.large is $70/month in us-east-1 whereas M4.large is $73/month in the same region.
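One possible way to spot downsizing opportunities is to flag instances whose peak CPU and memory both stay well under capacity. The sketch below assumes you already have peak utilization figures from your monitoring; the 40% limits, the field names and the sample data are illustrative assumptions, not recommendations from AWS.

```python
def downsize_candidates(instances, cpu_limit=40.0, mem_limit=40.0):
    """Return names of instances whose peak CPU and memory are both under the limits."""
    return [
        inst["name"]
        for inst in instances
        if inst["peak_cpu_pct"] < cpu_limit and inst["peak_mem_pct"] < mem_limit
    ]

# Hypothetical peak utilization over the review period.
fleet = [
    {"name": "api-1", "peak_cpu_pct": 22.0, "peak_mem_pct": 31.0},
    {"name": "api-2", "peak_cpu_pct": 85.0, "peak_mem_pct": 35.0},
    {"name": "batch-1", "peak_cpu_pct": 18.0, "peak_mem_pct": 72.0},
]

print(downsize_candidates(fleet))
```

Using peaks rather than averages is deliberately conservative: an instance is only flagged if it never came close to its limits.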

Use reserved instances for base production capacity

[ ] Your unvarying production base capacity should be on reserved instances. Pre-pay if possible to lock in the lowest price. Check your bill to make sure you are using all your purchased reserved instance capacity. HowTo.
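A quick way to check whether purchased reserved capacity is actually being used is to compare reserved hours paid for against reserved hours matched to running instances. This is a minimal sketch with hypothetical numbers; the real figures would come from your bill.

```python
def reserved_utilization(purchased_hours, used_hours):
    """Fraction of purchased reserved-instance hours that were actually used."""
    if purchased_hours <= 0:
        raise ValueError("purchased_hours must be positive")
    if not 0 <= used_hours <= purchased_hours:
        raise ValueError("used_hours must be between 0 and purchased_hours")
    return used_hours / purchased_hours

# Hypothetical month: 3 reserved instances x 720 hours, 1,800 hours matched.
util = reserved_utilization(3 * 720, 1800)
print(f"{util:.0%} of reserved hours used")
```

Anything much below 100% means you are paying for reservations that no running instance is consuming.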

Use spot instances

[ ] Spot instances are usually the cheapest instances available and can be up to 90% less than the on-demand price. But spots are ephemeral. Use spot instances for variable, non-base capacity. Spot pricing is cheapest after hours and on weekends in most regions. Be prepared for AWS to reclaim all your spots. HowTo, HowTo.

[ ] Consider an automatic spot replacement service that will transparently convert on-demand instances to spot (autospotting). This is especially useful in Auto Scaling groups.
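To decide how much capacity belongs on spot, one simple approach is to treat the lowest observed demand as the base (a candidate for reserved instances) and everything above it as variable (a candidate for spot). The sketch below assumes demand is measured in instance counts; the sample numbers are hypothetical.

```python
def split_capacity(hourly_demand):
    """Return (base, max_variable): the demand floor and the extra needed at peak."""
    if not hourly_demand:
        raise ValueError("hourly_demand must not be empty")
    base = min(hourly_demand)
    return base, max(hourly_demand) - base

# Hypothetical instance counts needed at different hours of the day.
demand = [4, 4, 5, 9, 12, 8, 4]
base, variable = split_capacity(demand)
print(f"base capacity: {base}, peak variable capacity: {variable}")
```

If AWS reclaims every spot instance at once, only the base remains, so the base should be sized to keep the service alive on its own.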

Power down idle resources

[ ] Power down all idle resources. Evaluate when your dev, test, QA and staging environments are not required. You can save up to 70% off your DevOps bill via this step alone.

[ ] Power down unused ELBs. Use Terraform to destroy and re-create as required.
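The 70% figure is easy to sanity-check with a little arithmetic. The sketch below assumes an environment that is only needed 10 hours a day on 5 days a week (an assumed schedule for illustration) and computes the fraction of the week it can be powered down.

```python
HOURS_PER_WEEK = 7 * 24

def idle_savings_fraction(hours_per_day, days_per_week):
    """Fraction of the week a resource can be powered down, given its needed hours."""
    if not 0 <= hours_per_day <= 24 or not 0 <= days_per_week <= 7:
        raise ValueError("hours_per_day must be 0-24 and days_per_week 0-7")
    running = hours_per_day * days_per_week
    return 1 - running / HOURS_PER_WEEK

# Assumed schedule: 10 hours a day, Monday to Friday.
savings = idle_savings_fraction(10, 5)
print(f"{savings:.0%}")  # 70%
```

This only covers charges that stop when the resource is off; storage attached to stopped instances is still billed.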

Scale up and scale down

[ ] Scale your Auto Scaling groups and database replicas based on load. Consider scaling up if CPU is above 60% for 5 minutes and scaling down if it is below 30% for 20 minutes. HowTo, HowTo.
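The scaling rule above can be sketched as a small decision function. This is only a model of the logic, assuming one average CPU sample per minute; in AWS the equivalent would normally be expressed as alarms and scaling policies rather than code you run yourself.

```python
def scaling_decision(cpu_samples, up_pct=60.0, up_minutes=5,
                     down_pct=30.0, down_minutes=20):
    """Decide on scaling from per-minute CPU percentages, oldest sample first."""
    recent_up = cpu_samples[-up_minutes:]
    recent_down = cpu_samples[-down_minutes:]
    # Scale up only when every sample in the short window is above the limit.
    if len(recent_up) == up_minutes and all(s > up_pct for s in recent_up):
        return "scale_up"
    # Scale down only when every sample in the longer window is below the limit.
    if len(recent_down) == down_minutes and all(s < down_pct for s in recent_down):
        return "scale_down"
    return "hold"

print(scaling_decision([20.0] * 15 + [75.0] * 5))
print(scaling_decision([20.0] * 20))
print(scaling_decision([20.0] * 19 + [50.0]))
```

The longer window for scaling down is deliberate: it avoids removing capacity during a brief lull and then having to add it straight back.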

Aggregate ELBs

[ ] ELBs are expensive, especially if you use one ELB per microservice. With the newer AWS ALB service, you can share a single ALB across multiple services by using listener rules that route to different target groups. It works with TLS too via multiple certificates. HowTo.
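To illustrate how one load balancer can front several services, here is a toy model of rule-based routing: rules are checked in priority order and the first match picks the target group. It is not the AWS API, and the host names, path prefixes and target group names are made up for the example.

```python
def route(rules, host, path, default="default-tg"):
    """Return the target group of the first rule matching host and path prefix."""
    for rule in rules:
        host_ok = rule.get("host") in (None, host)
        path_ok = path.startswith(rule.get("path_prefix", "/"))
        if host_ok and path_ok:
            return rule["target_group"]
    return default

# Hypothetical rules, listed in priority order.
RULES = [
    {"host": "api.example.com", "path_prefix": "/orders", "target_group": "orders-tg"},
    {"host": "api.example.com", "target_group": "api-tg"},
    {"path_prefix": "/static", "target_group": "static-tg"},
]

print(route(RULES, "api.example.com", "/orders/42"))
print(route(RULES, "api.example.com", "/users"))
print(route(RULES, "www.example.com", "/static/app.css"))
print(route(RULES, "www.example.com", "/"))
```

Three services share one load balancer here, so you pay for one instead of three.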

Reduce network traffic

[ ] Reduce the network traffic from your instances by caching static content at the edge. Consider CloudFront, CloudFlare and other edge cache services.

Prune storage

[ ] S3 storage can grow over time to be a significant cost. Have policies to regularly examine unwanted S3 storage. Do similarly for orphaned EBS snapshots and detached EBS volumes.

[ ] Migrate rarely accessed S3 data to AWS Glacier. Retrieval is slower, but storage is much less expensive. HowTo.

[ ] Set an expiry limit for all CloudWatch logs. The default is to never expire. HowTo.
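A Glacier migration policy is usually expressed as an S3 lifecycle rule. The sketch below only builds the rule as a plain dictionary, in the shape I understand the S3 lifecycle configuration to take; it makes no AWS calls, and the prefix, rule ID scheme and day counts are hypothetical. Check the result against the current S3 documentation before applying it to a bucket.

```python
import json

def glacier_lifecycle(prefix, days_to_glacier, days_to_expire=None):
    """Build a lifecycle configuration that archives, and optionally expires, a prefix."""
    if days_to_glacier < 0:
        raise ValueError("days_to_glacier must not be negative")
    rule = {
        "ID": f"archive-{prefix.strip('/') or 'all'}",  # hypothetical naming scheme
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        "Transitions": [{"Days": days_to_glacier, "StorageClass": "GLACIER"}],
    }
    if days_to_expire is not None:
        if days_to_expire <= days_to_glacier:
            raise ValueError("expiry must come after the Glacier transition")
        rule["Expiration"] = {"Days": days_to_expire}
    return {"Rules": [rule]}

# Hypothetical policy: archive logs/ after 30 days, delete after a year.
config = glacier_lifecycle("logs/", 30, 365)
print(json.dumps(config, indent=2))
```

Keeping the policy in code or in Terraform means new buckets get it from day one, rather than after the bill arrives.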

Know when to leave

[ ] Know when to leave the cloud. At large consistent scale, on-premises hosting may be preferable to cloud hosting. Dropbox and Stack Overflow migrated from the cloud for this reason.

About EmbedThis

We've used these techniques to radically lower our cloud costs when creating our EmbedThis Ioto IoT middleware.

Top comments (6)

Thomas H Jones II • May 9 '18

With respect to

Migrate rarely accessed S3 data to AWS Glacier. Retrieval is slower, but much less expensive

One of my gripes with AWS is that they have a very nice array of storage-tiers, but they're mostly inflexible when it comes to creating progressive lifecycle policies. Unlike my previous life working in the storage and backups world, I usually end up with single-level lifecycles: policies that go straight from Standard to Glacier. While intervening IA or RR/SZ would be attractive for a more-complete lifecycle process, they're not usable until 30 days have elapsed. Most of the data I want/need to lifecycle has greatly reduced value to my customers (beyond compliance with local policies or legislative prescriptions) after the 7-14 day threshold. However, I can either go straight to Glacier at that point and hope no one has an urgent restore-need, or I can keep the data at the full-cost Standard tier until the intermediate tiers become available. Sub-optimal.

...And after-the-fact tiering of non-current data is somewhere in the neighborhood of "hatefully slow". While the s3api tool makes it doable, it's kludgey on top of that slowness. Always a joy when a customer you've gone hands-off with comes panicking to you with not-designed-for S3 sticker-shock. "You've got tens of millions of unanticipated objects in a bucket that's accruing TiBs worth of unexpected storage costs? Alright, lemme add new bucket policies for you while this script runs to force-migrate your stuff to a less-dear tier". Said customers tend to virtual foot-tap while the job runs.

One of the groups I work with does cloud-enablement for our customers. Part of that is cost-control measures. So, quite familiar with the rest of the points you make. When the group first took on this role, one of the earliest tools we wrote was a service to read instance-tags for scheduling of power off/on (and execute and notify of same). Amazing the difference it makes - especially for dev environments.

We'd probably do a lot more in the way of automating some of the cost-control tools/methods ...except AWS hasn't seen fit to make Lambda (and other tools that can be leveraged for automated cost-control) available in all the regions our customers occupy. :(

Michael O'Brien • May 13 '18

Really good extra background. Thank you for your insights.

Gil Blinov • Apr 24 '19

This is mostly for companies and start ups: work with a reseller partner.
Preferably one which does not sell professional services.

This was a life changer in my previous company and position (DevOps Tech Lead).
The amount of effort and pro-activeness on the part of DoiT International was astounding.

Disclosure: DoiT changed my mind about reseller partners so much I joined them three months ago. I really do feel like I can do more good here.

Valentin Berlier • Apr 19 '18

A very useful checklist indeed, thank you!

Michael O'Brien • Apr 19 '18

Thank you -- still learning. I'm sure with serverless there will be an additional set of items.

Kaspar Lavik • Mar 24 '20

Great post. Well-researched with lots of great points. I really liked your thoughts too! Thank you. Read a similar article about cloud computing here: heliossolutions.co/cloud-computing/