DEV Community

Totalcloud.io

The Proven Practices for Successful AWS Cost Optimization

Cost optimization strategies for AWS services are abundant, so you need to prioritize among them rather than drown in the wealth of information. Knowing which practices the industry currently relies on, and which have become obsolete, helps you bring stability to your finances.

Cost Visibility and Analysis

Before diving into the details of each strategy, you need to monitor the current state of your infrastructure to understand what you actually need. Billing information and purchase reports let you analyze your expenses and put direct limits on additional costs. Monitoring your cloud resources for utilization, correct configuration and security can also prevent substantial waste. AWS has a few native tools that show where your money is going and, from that, what you should be doing.

These tools analyze your bill to forecast future expenses or to give you a detailed report of past ones. Using this information, you can set budgets and plan your Reserved Instance purchases. They also help you identify Elastic IPs that are not associated with any resource and objects that are no longer used, so you can save costs by releasing or deleting them.
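As a rough illustration of that last point, the sketch below lists Elastic IPs that are not associated with anything. It assumes a boto3 EC2 client is passed in; the function name is hypothetical, and treating a missing `AssociationId` as "unused" is an assumption that fits VPC addresses.

```python
def find_unassociated_elastic_ips(ec2_client):
    """Return identifiers of Elastic IPs that are allocated but not associated."""
    response = ec2_client.describe_addresses()
    return [
        # Prefer the allocation ID; fall back to the public IP if it is absent.
        address.get("AllocationId", address.get("PublicIp"))
        for address in response.get("Addresses", [])
        if "AssociationId" not in address
    ]
```

With real credentials this would be called as `find_unassociated_elastic_ips(boto3.client("ec2"))`. Review the result before releasing anything, since an address may be held on purpose.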

The three valuable native AWS cost optimization tools are:

  • AWS Cost Explorer
  • AWS Trusted Advisor
  • Cost and Usage Report
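For readers who prefer to query this data programmatically, here is a minimal sketch that ranks services by spend using the Cost Explorer API. It assumes a boto3 `ce` client is passed in; the function name is hypothetical and pagination is omitted for brevity.

```python
def cost_by_service(ce_client, start, end):
    """Return (service, cost) pairs sorted from most to least expensive.

    start and end are 'YYYY-MM-DD' strings; the end date is exclusive.
    """
    response = ce_client.get_cost_and_usage(
        TimePeriod={"Start": start, "End": end},
        Granularity="MONTHLY",
        Metrics=["UnblendedCost"],
        GroupBy=[{"Type": "DIMENSION", "Key": "SERVICE"}],
    )
    totals = {}
    for period in response["ResultsByTime"]:
        for group in period["Groups"]:
            service = group["Keys"][0]
            # Amounts are returned as strings, so convert before summing.
            amount = float(group["Metrics"]["UnblendedCost"]["Amount"])
            totals[service] = totals.get(service, 0.0) + amount
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)
```

A call such as `cost_by_service(boto3.client("ce"), "2024-01-01", "2024-02-01")` would cover one month; the dates here are placeholders.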

Detailed Billing used to provide this service, but it has been retired in favor of the Cost and Usage Report. Detailed Billing let you group your expenditure by the features, tools or parts of your infrastructure to see where expenses were high and low.

Understanding your expenses is of little use unless you act on it. Here are the three major changes you can undertake, based on the information you have gathered.

These are the Pillars of AWS Cost Optimization

  • Scheduling
  • Rightsizing
  • Storage optimization

Scheduling

Scheduling means controlling the runtime of your resources so that you don’t pay for them while they are not in use. Poor (or non-existent) scheduling is often the main reason companies run up massive avoidable bills. How often have you read about a server being turned on and left running for months or even years? Since many AWS services charge for the time a resource stays up, every moment counts. This makes scheduling an integral element of cost optimization.

The most widely scheduled resource is EC2, but there’s no need to stop there. We’ve observed that cost savings can come from scheduling other resources as well. Some of them natively support start/stop operations (like RDS instances and Redshift clusters), and some don’t (ECS, EKS, Fargate). But even these can be optimized to save costs.

The go-to metric for creating a schedule is time. You keep your resources up during business hours, or whatever period you deem necessary. In a 168-hour week, you’re most likely using your resources for only 40 hours (a typical 8-hour workday x 5). Shutting your resources down for the remaining 128 hours, or 76% of the week, can be the difference between an optimized, cost-effective workload and one that gives you dreadful bill shocks.
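The arithmetic behind that 76% figure can be checked with a few lines. This is only a sketch: the function names are made up for illustration, and the hourly rate in the usage note is a placeholder, not a real AWS price.

```python
HOURS_PER_WEEK = 7 * 24  # 168


def idle_fraction(hours_per_day, days_per_week):
    """Fraction of the week a resource is not needed."""
    used_hours = hours_per_day * days_per_week
    return (HOURS_PER_WEEK - used_hours) / HOURS_PER_WEEK


def weekly_saving(hourly_rate, hours_per_day, days_per_week):
    """Money saved per week by stopping the resource outside working hours."""
    return hourly_rate * HOURS_PER_WEEK * idle_fraction(hours_per_day, days_per_week)
```

For an 8-hour day and a 5-day week, `idle_fraction(8, 5)` is 128/168, roughly 0.76. At a hypothetical rate of 0.10 per hour, `weekly_saving(0.10, 8, 5)` comes to about 12.80 out of a 16.80 always-on bill.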

A quicker, more automated method is usage-based scheduling, where resources are shut down when they are idle. You can set up your system to detect idle resources and park them in real time. Because it reacts to actual usage rather than a fixed timetable, this approach can save considerably more than time-based scheduling alone, potentially up to twice as much.
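What counts as "idle" is a policy decision. The sketch below shows one possible rule, assuming you already collect CPU utilization samples at a fixed interval; the 5% threshold and the six-sample window are illustrative assumptions, not recommendations from AWS.

```python
def is_idle(cpu_samples, threshold_percent=5.0, min_samples=6):
    """Treat a resource as idle when its most recent samples all stay below the threshold.

    cpu_samples: CPU utilization percentages, oldest first.
    """
    if len(cpu_samples) < min_samples:
        return False  # not enough history to decide safely
    recent = cpu_samples[-min_samples:]
    return all(sample < threshold_percent for sample in recent)
```

Requiring several consecutive low samples avoids parking a resource during a brief lull between bursts of work.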

You can automate the scheduling of resources with CloudWatch Events (now Amazon EventBridge) or a third-party AWS scheduler. While CloudWatch Events is an AWS-native option that offers flexibility, it requires customers to write their own code to carry out the automated tasks. A third-party AWS scheduler can offer similar functionality with a more user-friendly approach.
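To give a sense of the code the native route involves, here is a minimal sketch of a Lambda function that stops tagged EC2 instances. It assumes the function is triggered by a scheduled rule such as `cron(0 19 ? * MON-FRI *)` (19:00 UTC on weekdays); the `Schedule=office-hours` tag is a hypothetical convention, and pagination is omitted for brevity.

```python
def stop_scheduled_instances(ec2_client, tag_key="Schedule", tag_value="office-hours"):
    """Stop running instances that carry the (hypothetical) schedule tag."""
    response = ec2_client.describe_instances(
        Filters=[
            {"Name": f"tag:{tag_key}", "Values": [tag_value]},
            {"Name": "instance-state-name", "Values": ["running"]},
        ]
    )
    instance_ids = [
        instance["InstanceId"]
        for reservation in response["Reservations"]
        for instance in reservation["Instances"]
    ]
    if instance_ids:  # skip the API call when there is nothing to stop
        ec2_client.stop_instances(InstanceIds=instance_ids)
    return instance_ids


def lambda_handler(event, context):
    """Entry point invoked by the scheduled rule."""
    import boto3  # imported here so the helper above can be tested without it

    return {"stopped": stop_scheduled_instances(boto3.client("ec2"))}
```

A second rule and a matching function calling `start_instances` would bring the instances back up in the morning.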

Rightsizing

Rightsizing means choosing the instance types and sizes that match your performance and hardware requirements at the lowest possible cost. It is the strategy to adopt to avoid overspending on hardware you don’t need.

Purchasing the instances you need doesn’t end the rightsizing process. Periodic monitoring (monthly is recommended) of the resources you have purchased is necessary to plan future purchases and to work out what is and isn’t serving you well.

The four factors to monitor when deciding on the right type and number of instances are:

  • vCPU utilization
  • Memory utilization
  • Network I/O utilization
  • Disk utilization
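One way to turn these four metrics into a decision is sketched below. The 40% and 80% thresholds are illustrative assumptions only, and the class and function names are hypothetical; pick values that suit your own workloads.

```python
from dataclasses import dataclass


@dataclass
class PeakUtilization:
    """Peak utilization over the review period, each as a percentage."""
    cpu: float
    memory: float
    network: float
    disk: float


def rightsizing_hint(peak, low=40.0, high=80.0):
    """Suggest a direction based on the busiest of the four metrics."""
    busiest = max(peak.cpu, peak.memory, peak.network, peak.disk)
    if busiest >= high:
        return "consider upsizing"
    if busiest < low:
        return "consider downsizing"
    return "keep current size"
```

Using the busiest metric rather than an average means an instance is only flagged for downsizing when all four dimensions have headroom.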

Purchase Behaviour

On-Demand Instances
On-Demand Instances are the most commonly purchased instance type, mainly because of how much control they give the customer. Long-time users can benefit from reviewing the current pricing list, since they are likely running instances that are now more expensive than their alternatives. Older instance families such as m1, c1, and t1 can be replaced by m3, c3, and t2 respectively. This migration comes with superior CPU performance, more memory, and a lower price. Failing to update your instances could cost you 10-20% more.

Reserved Instances

Purchasing the right AWS Reserved Instances can cut your current expenditure by up to 60%. Businesses that use the AWS cloud environment continuously should evaluate their Reserved Instances every month. Committing to a one-year or three-year term with defined usage parameters can get you an hourly rate up to 75% lower than On-Demand pricing.

Reserved capacity for DynamoDB is charged on throughput rather than running hours. That is, the reservation continues to charge whether or not the capacity is being used.
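Because a reservation is billed whether or not it is used, it only pays off above a certain level of utilization. The sketch below works out that break-even point; the function names and the prices in the usage note are hypothetical, not actual AWS rates.

```python
HOURS_PER_YEAR = 24 * 365  # 8760


def break_even_utilization(on_demand_hourly, reserved_effective_hourly):
    """Fraction of the term an instance must run for the reservation to pay off."""
    return reserved_effective_hourly / on_demand_hourly


def yearly_saving(on_demand_hourly, reserved_effective_hourly, utilization):
    """Saving from reserving, given the fraction of the year the instance runs."""
    on_demand_cost = on_demand_hourly * HOURS_PER_YEAR * utilization
    reserved_cost = reserved_effective_hourly * HOURS_PER_YEAR  # charged regardless of use
    return on_demand_cost - reserved_cost
```

With an On-Demand rate of 0.10 and an effective reserved rate of 0.06 per hour, the break-even point is 60% utilization. An instance running only 40 hours a week (about 24%) would lose money on that reservation, which is why scheduling and reservation decisions should be made together.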

Spot Instances

The last purchase option to pay attention to for cost optimization is the Spot Instance. Spot Instances are spare Amazon EC2 capacity that can be purchased at discounts of up to 90% off the On-Demand price. A Spot Instance is terminated if the Spot price exceeds the customer’s stated maximum price or if the capacity is no longer available.

The pricing history in the AWS console makes finding the right Spot Instance easier. Spot Instances are best suited to workloads that tolerate interruption and that can fall back to On-Demand instances without needing backups or data restoration.

A tool like Spot Instance Advisor helps you pick a Spot Instance type with a low interruption rate and fair pricing. Spot Fleet helps further by keeping multiple Spot Instance options ready to be launched, according to their value, in case of an interruption.
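The same pricing history is available through the API. As a rough sketch, the function below picks the Availability Zone with the lowest current Spot price from records shaped like the `SpotPriceHistory` entries returned by boto3's `describe_spot_price_history`. The function name is hypothetical, and choosing on price alone ignores interruption rates.

```python
def cheapest_zone(price_history):
    """Return (zone, price) for the lowest latest Spot price, or None if empty."""
    latest = {}
    for record in price_history:
        zone = record["AvailabilityZone"]
        # Keep only the most recent record per zone.
        if zone not in latest or record["Timestamp"] > latest[zone]["Timestamp"]:
            latest[zone] = record
    if not latest:
        return None
    best = min(latest.values(), key=lambda record: float(record["SpotPrice"]))
    return best["AvailabilityZone"], float(best["SpotPrice"])
```

Prices arrive as strings, hence the `float` conversion before comparing.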

Storage Optimization

You are likely spending a lot of money on storage space that you are not using at all or are using inefficiently. The goal is to cut storage costs while ensuring your services continue to function under optimal conditions. To achieve this, you have to balance your use of Amazon S3 storage tiers and other AWS storage services properly.
When evaluating storage requirements, customers should segment data by how available and durable it needs to be, the size of the data sets, throughput and IOPS thresholds, and regulatory requirements.

Amazon S3 Storage Lifecycle Optimization
The ideal way to optimize your storage is to set up an S3 lifecycle policy, which makes the optimization automatic. Amazon S3 has several storage classes, each designed for a particular access pattern, and a lifecycle policy moves your objects between them as they age. The different pricing of each class can effectively cut down your costs as long as you distribute your data wisely.
Amazon S3 storage tiers include:

[Figure: S3 Tiers]
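A lifecycle policy of this kind could look like the sketch below, which builds the configuration that boto3's `put_bucket_lifecycle_configuration` expects. The rule ID, the `logs/` prefix and the day counts are illustrative assumptions; choose classes and timings based on how your data is actually accessed.

```python
def build_lifecycle_configuration(prefix="logs/", ia_after_days=30,
                                  glacier_after_days=90, expire_after_days=365):
    """Build a lifecycle configuration that tiers objects down, then deletes them."""
    return {
        "Rules": [
            {
                "ID": "tier-down-then-expire",  # hypothetical rule name
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
                    {"Days": glacier_after_days, "StorageClass": "GLACIER"},
                ],
                "Expiration": {"Days": expire_after_days},
            }
        ]
    }
```

It would be applied with `boto3.client("s3").put_bucket_lifecycle_configuration(Bucket="my-example-bucket", LifecycleConfiguration=build_lifecycle_configuration())`, where the bucket name is a placeholder. Note that this call replaces any lifecycle configuration already on the bucket.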

Deleting unused disk volumes frees up space for backups and lets you move your data more freely. Keep tabs on your storage allocation and you can save a great deal.
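Finding those unused volumes can be scripted. The sketch below lists EBS volumes in the `available` state, meaning they are not attached to any instance. It assumes a boto3 EC2 client is passed in, the function name is hypothetical, and pagination is omitted for brevity.

```python
def find_unattached_volumes(ec2_client):
    """Return (volume_id, size_in_gib) for EBS volumes not attached to an instance."""
    response = ec2_client.describe_volumes(
        Filters=[{"Name": "status", "Values": ["available"]}]
    )
    return [
        (volume["VolumeId"], volume["Size"])
        for volume in response.get("Volumes", [])
    ]
```

Treat the output as a list of candidates for review rather than an automatic delete list, and consider taking a snapshot before removing a volume.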

Structural Cost Optimization

You can also optimize your costs by examining the spending of individual members or teams in your organization. Miscommunication can lead to poor resource handling and, in turn, to purchases you don’t need.

Complex Infrastructure

When your business is large and has a complex infrastructure that spans the globe, effective communication between teams in different countries becomes difficult to achieve. The separate departments of an organization and their individual technical specialties contribute heavily to this miscommunication. There are usually multiple teams: one that sets up the infrastructure and another that actually operates it, with the former controlling how long resources run. Any miscommunication between the deploying team and the operating team can lead to unforeseen charges.
Setting up policies and guidelines for your teams aligns all employees on common practices.
It can also reduce communication errors by encouraging teams to have informed conversations with peers in other specialties.

The Human Element

The human element of running a business makes it prone to occasional errors. Forgetting to shut down a resource, forgetting to delete unused volumes, or using more expensive resources than necessary are all common careless mistakes. The best solution is to balance manual work with automation, since many of the strategies above can be implemented with automation tools. Taking some of the burden off your employees lowers the chance of errors.

How Cost Optimization can help other aspects of your business

Detecting Rogue Infrastructure

If part of your infrastructure is serving purposes that the organization never assigned to it, then it has gone rogue. Depending on how extensive this unauthorized activity is, that infrastructure may be racking up excess expenses. Analyzing your bills with cost management tools can help you pinpoint which infrastructure is acting up.

Security holes

Analyzing your bills can also help you spot potential vulnerabilities. An unexpected spike in costs may be the sign of a security breach.
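A simple way to catch such spikes is to compare each day's cost with the days before it. The sketch below flags any day that exceeds the trailing average by a few standard deviations. The 7-day window, the 3-sigma rule and the 1% floor on the spread are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def find_cost_spikes(daily_costs, window=7, sigmas=3.0):
    """Return indexes of days whose cost is far above the trailing window."""
    spikes = []
    for day in range(window, len(daily_costs)):
        baseline = daily_costs[day - window:day]
        average = mean(baseline)
        # Floor the spread at 1% of the average so a perfectly flat
        # baseline does not flag tiny fluctuations.
        spread = max(pstdev(baseline), 0.01 * average)
        if daily_costs[day] > average + sigmas * spread:
            spikes.append(day)
    return spikes
```

A flagged day is only a prompt to investigate: a spike can just as easily come from a legitimate launch or a batch job as from a breach.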

Neglected projects

You can identify half-finished projects that continue to charge you but aren’t being worked on. Idle projects can either be discontinued or have their resources temporarily removed.

With the right strategies for your resources and the right tools to help you put these practices into action, you are on your way to stabilizing the finances of your cloud operations.
