Doug Sillars for unSkript

Originally published at unskript.com

Keeping your Cloud Costs in Check: Automated AWS Cost Charts and Alerting

Building and deploying infrastructure in the cloud is (by design) a very simple process. If a team is not careful with its deployments, it can also become an expensive one. The interplay between finance teams and cloud teams has led to a new job function: FinOps. What is FinOps? This team (or team member) works with developer teams to understand cloud needs, negotiates better prices with cloud providers, and helps translate cloud expenses and needs to the finance team.

However, many companies working in the cloud don’t have the luxury of allocating a team member to understanding cloud costs, and it is left to the dev teams to do their best to mitigate surprise bills. Without a full-time FinOps professional, teams need tooling to help them better understand and control their cloud bills.

The FinOps Institute has defined six domains for a FinOps team:

  • Understanding Cloud Usage and Cost
  • Performance Tracking and Benchmarking
  • Real Time Decision Making
  • Cloud Rate Optimization
  • Cloud Usage Optimization
  • Organizational Alignment

In this post, we’ll describe a series of automated RunBooks that check the first three boxes (and help inform several others). We’ll do this by building automated reporting around the AWS Cost and Usage Report (CUR).

The AWS CUR

AWS’ Cost and Usage Report breaks down your AWS spend in many different ways, at a selected interval (hourly, daily, or monthly). Once a report is configured, AWS places report files (in our setup, GZIP-compressed CSV with a Redshift manifest) into an S3 bucket at regular intervals. To set up a CUR report for your AWS account, the AWS Documentation has a very nice tutorial. In this post, our CUR is updated daily, but for larger projects you may want hourly granularity.
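The tutorial walks through the console setup; the same report definition can also be created programmatically. Here is a sketch using the boto3 cur API, where the report name, bucket, and prefix are placeholder assumptions:

import boto3

# define a daily CUR report, delivered to S3 with Redshift artifacts
cur = boto3.client("cur", region_name="us-east-1")  # the CUR API lives in us-east-1
cur.put_report_definition(ReportDefinition={
    "ReportName": "unskript-billing",          # hypothetical report name
    "TimeUnit": "DAILY",
    "Format": "textORcsv",
    "Compression": "GZIP",
    "AdditionalSchemaElements": ["RESOURCES"],
    "S3Bucket": "my-billing-bucket",           # hypothetical bucket
    "S3Prefix": "all",
    "S3Region": "us-west-2",
    "AdditionalArtifacts": ["REDSHIFT"],       # produce the Redshift manifest
})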

We set our CUR report up for ingestion into AWS Redshift. However, the daily updates only land in the files in S3. To keep our Redshift table current, we must regularly copy that data from S3 into the database in Redshift.

We accomplish this by building a RunBook using a few new Actions:

RunBook to update table in Redshift

When making a Redshift query, we need to know the Secrets Manager ARN, the AWS Region, and the Redshift cluster/database.

To get the secret ARN, we use the AWS Get Secrets Manager ARN Action. This takes a secret name and returns the ARN. (This does require Secrets Manager permission in your IAM credential.)
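For reference, the same lookup can be done directly with boto3 (a minimal sketch; the secret name here is a placeholder):

import boto3

# look up the ARN for a named secret (requires secretsmanager:DescribeSecret)
sm = boto3.client("secretsmanager", region_name="us-west-2")
secret_arn = sm.describe_secret(SecretId="redshift-credentials")["ARN"]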

We then create two SQL queries programmatically. The table in Redshift is named awsbilling202303 (since it is currently March 2023). To ensure that the table name is always correct, we generate each query programmatically, so that the table name always has the format awsbilling<year><month>.
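A minimal sketch of that string generation in Python (assuming the current month’s table already exists in Redshift):

from datetime import datetime

# table name follows awsbilling<year><month>, e.g. awsbilling202303
table_name = datetime.now().strftime("awsbilling%Y%m")
truncate_query = f"truncate table {table_name}"
# the COPY statement is assembled the same way, substituting table_name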

Next, we perform two SQL commands:

First, we TRUNCATE the table. This removes all of the rows, but keeps the columns:

truncate table awsbilling202303
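unSkript’s Redshift Query Action handles the execution for us; as a rough standalone equivalent, the statement could be run through the boto3 Redshift Data API (the cluster and database names below are placeholder assumptions):

import boto3

# run the statement via the Redshift Data API
rsd = boto3.client("redshift-data", region_name="us-west-2")
rsd.execute_statement(
    ClusterIdentifier="my-redshift-cluster",  # hypothetical cluster name
    Database="dev",                           # hypothetical database name
    SecretArn=secret_arn,                     # ARN from the lookup above
    Sql="truncate table awsbilling202303",
)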

Next, we COPY the rows from the report files in S3 into the table:

copy awsbilling202303 from 's3://unskript-billing-doug/all/unskript-billing-doug/20230301-20230401/unskript-billing-doug-RedshiftManifest.json' credentials 'aws_iam_role=arn:aws:iam::<arn name>' region 'us-west-2' GZIP CSV IGNOREHEADER 1 TIMEFORMAT 'auto' manifest;

The exact COPY command (including the manifest path) is provided in your S3 bucket alongside the report files.

With the RunBook created, we can schedule it to run daily, ensuring that the Redshift table is always up to date.

NOTE: I am not a database expert. This is the “I have a hammer, so everything must be a nail” approach to updating the database. There is probably a more nuanced query that could be run.

Building Charts and Alerts

Now that the data is being populated into Redshift daily, we can begin exploring it. In this second RunBook, we are going to extract the daily spend for each AWS service, plot the data, and create an alert for large changes in costs.

Our new RunBook begins the same way as our first one – generating a SQL query and executing it against Redshift. This time we are querying Redshift for usage costs for every AWS product:

select lineitem_productcode, 
        date_part(day, cast(lineitem_usagestartdate as date)) as day, 
        SUM((lineitem_unblendedcost)::numeric(37,4)) as cost 
from awsbilling202303 
group by lineitem_productcode, day 
order by cost desc;

The query results are then placed into a dataframe. Using this data, we can create a chart of our daily spend by AWS product:
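As a sketch of this plotting step (assuming the results are loaded into a pandas DataFrame named df with the three columns from the SELECT above):

import matplotlib.pyplot as plt

# pivot: one row per day, one column per AWS product
dfpivot = df.pivot_table(index="day", columns="lineitem_productcode",
                         values="cost", aggfunc="sum").fillna(0)
# stacked bars of daily spend by product
dfpivot.plot(kind="bar", stacked=True, figsize=(12, 6))
plt.ylabel("Cost (USD)")
plt.title("AWS Product Cost by day")
plt.savefig("aws_daily_cost.png")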

AWS Product Cost by day

Here is the same chart – just looking at the last seven days:

AWS costs over the last 7 days
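With the pivoted dataframe from the sketch above, the seven-day view is simply the last seven rows:

# same chart, limited to the most recent seven days
dfpivot.tail(7).plot(kind="bar", stacked=True, figsize=(12, 6))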

If we look back at the FinOps Institute bullet list, we are beginning to cover the first two bullets: Understanding Cloud Usage and Cost, as well as Performance Tracking and Benchmarking.

If we automate this RunBook to run daily (after the table is updated), we can generate this plot of our AWS cloud spend on a regular schedule. In general, a daily chart with no change is not of great interest. But if we can build an alert around our cloud usage that flags significant increases in day-to-day cost, we can attach this chart to make the cost jump easy to identify.

Building an Alert

Every organization will have different thresholds for alerting. In the following code, we examine the two previous days (March 18 and 19) and look for increases of over 5%. Since many of the services in this chart have very low spend, we add the additional filter that the change must be over $1. We loop over each service with the following:

from datetime import date
alert = False
bigchange = {}
for instance in dfpivot.columns:
    # 'today'/'yesterday' hold the two day indexes (e.g. 19 and 18)
    todayCost = dfpivot.at[today, instance]
    yesterdayCost = dfpivot.at[yesterday, instance]
    # day-over-day change as a fraction of yesterday's cost
    delta = abs(todayCost - yesterdayCost) / yesterdayCost
    if abs(todayCost - yesterdayCost) > 1:  # ignore swings under $1
        if delta > .05:
            bigchange[instance] = {"delta": delta, "todayCost": todayCost, "yesterdayCost": yesterdayCost}
            alertText = '@here There has been a large change in AWS Costs'
            alert = True
# on Mondays, always send the chart as a weekly record
if date.today().weekday() == 0:
    alertText = 'Today is Monday, Here is the last week of AWS Costs'
    alert = True

If any of the changes in cost trigger this alert, we send an image on Slack with an “@here There has been a large change in AWS Costs” message. We also send the chart every Monday, so that there is a visual history of AWS spending.
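unSkript has a built-in Slack Action for this step; as a standalone illustration, here is a sketch using the slack_sdk library (the bot token, channel ID, and file name are placeholder assumptions):

import os
from slack_sdk import WebClient

# post the saved chart with the alert text
client = WebClient(token=os.environ["SLACK_BOT_TOKEN"])  # hypothetical token variable
if alert:
    client.files_upload_v2(
        channel="C0123456789",        # hypothetical channel ID
        file="aws_daily_cost.png",    # chart saved earlier
        initial_comment=alertText,
    )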

Slack message with chart of AWS spending

When beginning to study and alert on your cloud spending, it is important to start with simple reports and alerting. This use case of daily spend by product is a great start on our journey into the FinOps Institute’s next bullet points: Real Time Decision Making and Cloud Usage Optimization.

Conclusion

Many companies feel the monthly dread of “how big will my cloud bill be this month?” By charting and alerting on your daily spend across all cloud products, your team is less likely to be surprised by the bill at the end of the month. This data can also be used to mitigate the large changes that often result in bill “surprises.”

Using unSkript and the AWS Cost and Usage report, you can begin (or continue) your FinOps journey by better understanding where your costs are coming from and how they are changing day to day. Watch our blog for more posts on Cloud CostOps and how you can monitor your AWS bill.
