DEV Community

CAST AI

Originally published at cast.ai

Investigating Cloud Cost Anomalies for Kubernetes The Easy Way

You lead an engineering team, and the FinOps manager has just sent you last month’s bill, asking why the cloud services your team uses cost so much. What ended up costing you more than expected? If you use Kubernetes, here’s how you can track down cloud cost anomalies in three simple steps.

Step 1: Analyze your cloud bill for the last month

Take a look at your cloud bill from the last month, and you’ll instantly see why it’s so hard to understand. Cloud providers charge on the basis of various service metrics. For example, Amazon Simple Storage Service (S3) bills some items by the number of requests and others by the gigabyte.

To make sense of your usage and costs, you must look into various areas in your provider console. You can then group and report on costs by certain attributes - for example, group resources by region or service. 

However, analyzing a cloud bill by hand is slow and tedious. Now imagine that you’re managing more than one team using the same cloud service - you’ll have to repeat this analysis for every team!

Or you can use a third-party cost monitoring solution that gives you all the insights you need in one place.

Step 2: Check your daily cloud expenses to identify any spikes 

Take a look at a daily cost report that outlines how much your team spent each day.

A single glance may be enough to identify outliers or cost spikes in your usage or expenses.
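If you export the daily figures, the same check can be automated. The snippet below is a minimal sketch, not a feature of any particular tool: the `find_cost_spikes` helper, the 1.5x threshold, and the sample numbers are all made up for illustration. It flags any day whose cost is well above the median daily cost.

```python
from statistics import median


def find_cost_spikes(daily_costs, threshold=1.5):
    """Return (day, cost) pairs whose cost exceeds `threshold` times the median.

    `daily_costs` maps a day label to that day's spend in dollars.
    The threshold is an assumption; tune it to how noisy your spend is.
    """
    baseline = median(daily_costs.values())
    return [
        (day, cost)
        for day, cost in daily_costs.items()
        if cost > threshold * baseline
    ]


# Hypothetical daily spend in USD, for illustration only.
sample_costs = {
    "03-01": 400.0,
    "03-02": 420.0,
    "03-03": 410.0,
    "03-04": 900.0,
    "03-05": 405.0,
}

spikes = find_cost_spikes(sample_costs)
print(spikes)
```

The median is used as the baseline instead of the mean because a single expensive day pulls the mean upward and can hide the very spike you are looking for.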

Having this report handy every day of the month also helps you measure your burn rate. By extrapolating your daily expenses into a monthly bill, you can check whether your current spend fits within your monthly budget.
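The extrapolation itself is simple arithmetic: divide the spend so far by the number of days elapsed, then multiply by the number of days in the month. Here is a rough sketch, assuming spend is roughly even from day to day and that the running total already includes today's costs. The function name, budget, and figures are hypothetical.

```python
import calendar
from datetime import date


def project_monthly_spend(spend_to_date, today):
    """Linearly extrapolate month-to-date spend to a full-month figure.

    Assumes `spend_to_date` covers day 1 through `today`, inclusive.
    """
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    average_daily_spend = spend_to_date / today.day
    return average_daily_spend * days_in_month


# Hypothetical numbers: $4,500 spent by April 10 against a $12,000 budget.
monthly_budget = 12_000.0
projected = project_monthly_spend(4_500.0, date(2023, 4, 10))
overrun = projected - monthly_budget

print(f"Projected spend: ${projected:,.2f}")
print(f"Over budget by:  ${max(overrun, 0.0):,.2f}")
```

A straight-line projection like this one overreacts to a spike early in the month, so treat it as an early warning and not as a forecast.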

Step 3: Check historical allocation data for cloud cost anomalies

At this point, you might have noticed that your costs were running unusually high for a few days. It’s time to investigate the culprit. This is where historical cost allocation comes in.

This report is your point of departure for asking the following questions and checking specific cloud cost metrics:

1. Total cluster cost report - What is your projected monthly spend compared to last month's spend? What is the difference between this month and the previous one?

2. Allocation by workload - Are there any idle workloads that aren’t doing anything apart from burning your money?

3. Allocation by namespace - What was the distribution between the namespaces in terms of dollar spend?
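If your monitoring tool lets you export allocation data, the last two questions come down to a group-by and a filter. The sketch below assumes a made-up record layout (`namespace`, `workload`, `cost_usd`, `avg_cpu_utilization`) and an arbitrary 5% CPU cutoff for calling a workload idle. None of this reflects a specific product's export format.

```python
from collections import defaultdict


def cost_by_namespace(records):
    """Sum cost per namespace, most expensive namespace first."""
    totals = defaultdict(float)
    for record in records:
        totals[record["namespace"]] += record["cost_usd"]
    return dict(sorted(totals.items(), key=lambda item: item[1], reverse=True))


def idle_workloads(records, max_cpu_utilization=0.05):
    """Return names of workloads whose average CPU use is at or below the cutoff."""
    return [
        record["workload"]
        for record in records
        if record["avg_cpu_utilization"] <= max_cpu_utilization
    ]


# Hypothetical allocation records for one month, for illustration only.
allocation = [
    {"namespace": "payments", "workload": "api", "cost_usd": 1200.0, "avg_cpu_utilization": 0.55},
    {"namespace": "payments", "workload": "worker", "cost_usd": 300.0, "avg_cpu_utilization": 0.40},
    {"namespace": "staging", "workload": "load-test", "cost_usd": 800.0, "avg_cpu_utilization": 0.01},
    {"namespace": "analytics", "workload": "etl", "cost_usd": 500.0, "avg_cpu_utilization": 0.30},
]

print(cost_by_namespace(allocation))
print(idle_workloads(allocation))
```

In this sample data, the `load-test` workload in `staging` stands out: it accounts for a large share of the spend while using almost no CPU, which is the pattern an idle workload leaves behind.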

By the end of this process, you’ll arrive at the answer. You’ll know what happened last month that drove your costs up, whether it’s a service left running over the weekend or a specific team that picked a particularly pricy virtual machine.

Investigating cloud cost anomalies doesn’t have to take hours or days

In a recent survey, engineers said that cloud cost issues disrupt their work for anywhere from a few hours per week (41%) to an entire sprint or more (11%).1

But investigating a cloud cost issue doesn’t have to take hours or days. If you have access to all the reports I mentioned, you can keep your engineers productive and happy instead of constantly distracted by cost problems.

The CAST AI cost monitoring module makes your cloud bill understandable, serving you all the most important cloud cost metrics to make cost analysis quick and easy.

CAST AI clients save an average of 63% on their Kubernetes bills
