I wanted to draw attention to garbage in the cloud.
I believe there is a significant amount of resources built in the cloud that are:
- unpatched / out of date (OS, Framework, TLS/SSL)
- idle / over-provisioned resources
- over-engineered / complex
This contributes to global environmental harm: idle and inefficient workloads still draw power, and that energy carries a real carbon cost that natural processes can't simply absorb.
Coming out of AWS re:Invent 2023, I'm convinced the most critical part of finding waste is knowing cost! After all, extra crap in the cloud costs extra money. Follow the money, and you'll likely find the most misconfigured things.
This is a great time to plug Dr. Werner Vogels' small but powerful "book," The Frugal Architect (thefrugalarchitect.com).
I highly encourage you to read it; it's only seven short but extremely vital points!
Most systems are composed of smaller components, so the following tricks can have a great impact:
- Tag everything (come up with a tagging standard that works across all resource types and provides the greatest value to the business)
- Use Resource Groups (resource groups are a fantastic way to group related resources for cost analysis, as well as application monitoring with AWS DevOps Guru)
- Standardize your compute workload strategy (give your developers a few solid pathways for running code in the cloud efficiently, e.g. an EC2 pathway, a container pathway, a serverless pathway)
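One lightweight way to make a tagging standard stick is to lint resources against it before (or after) deploy. A minimal sketch in Python, assuming a made-up required-tag set ("team", "env", "cost-center") rather than any AWS-mandated one:

```python
# Sketch of linting resources against a hypothetical tagging standard.
# The required keys below are illustrative examples only.
REQUIRED_TAGS = {"team", "env", "cost-center"}

def missing_tags(resource_tags: dict) -> set:
    """Return the required tag keys a resource is missing."""
    return REQUIRED_TAGS - resource_tags.keys()

# A resource tagged only with "env" gets flagged for the other two.
print(sorted(missing_tags({"env": "prod"})))  # ['cost-center', 'team']
```

The same check could run in CI against the output of a resource inventory, so untagged resources never reach cost reports unattributed.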
The AWS Cost Explorer lets you visualize, understand, and manage AWS costs and usage over time. It provides detailed reports that show cost trends for various dimensions such as service, linked account, tag, etc.
Key features of the AWS Cost Explorer are:
- Forecasting (it can predict your likely spending over the next three months based on your historical cost data)
- Customizable filters and views (you can filter and group your cost data by various dimensions to focus on the areas that matter most to you)
- Reserved Instance (RI) reports (these show your RI usage, modifications, and exchanges, which helps when planning adjustments to your prepaid EC2 compute resources)
By using AWS Cost Explorer, you can identify trends, pinpoint cost drivers, and detect cost anomalies. This understanding lets you control your costs more effectively.
For example, you might find that you're spending more on a particular service than you expected, and you can then investigate whether this cost is necessary or if there are ways to reduce it.
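As a toy illustration of the kind of trend check Cost Explorer automates, here's a sketch that flags any service whose latest monthly cost jumps more than 25% above its trailing average. The service names and dollar figures are made up:

```python
# Toy cost-anomaly check: flag services whose latest month exceeds
# the average of prior months by more than a threshold.
# All numbers below are invented for illustration.
monthly_costs = {
    "KMS": [95, 100, 105, 180],   # last month spiked
    "S3":  [400, 410, 405, 415],  # steady
}

def spiked(history: list, threshold: float = 1.25) -> bool:
    baseline = sum(history[:-1]) / len(history[:-1])
    return history[-1] > baseline * threshold

flagged = [svc for svc, hist in monthly_costs.items() if spiked(hist)]
print(flagged)  # ['KMS']
```

Cost Explorer (and Cost Anomaly Detection) do far more sophisticated versions of this for you; the point is simply that a spike against a baseline is the signal to chase.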
Some of the key services we were overspending on were:
For KMS 🔑,
Each customer-managed key runs ~$1/mo, and with CDK, keys get generated at a massive scale if they aren't centralized.
We decided to optimize and centralize some of our keys, for example a single "data key" for all data in an env/region, which CDK projects import to reduce the per-project key cost.
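The savings math is simple back-of-the-envelope arithmetic. A hedged sketch, using the ~$1/month figure from above and hypothetical project and region counts:

```python
# Back-of-the-envelope savings from centralizing KMS keys.
# The $1/mo key price matches the post; project/region counts are made up.
KEY_COST_PER_MONTH = 1.00
projects = 120  # hypothetical: CDK projects each creating their own key
regions = 2

before = projects * regions * KEY_COST_PER_MONTH  # one key per project per region
after = regions * KEY_COST_PER_MONTH              # one shared "data key" per region
print(f"${before - after:.2f}/mo saved")          # $238.00/mo saved
```

Not life-changing money on its own, but this pattern of small, structural costs multiplied by every project is exactly where CDK-generated waste hides.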
For S3 🪣,
We noticed larger expenses and decided to move data we knew we weren't fetching often to a cheaper storage class.
A good example was our Datadog Log Archive, which we use to re-hydrate older logs from S3 back into Datadog when troubleshooting issues that occurred in the past.
We lowered the storage class on it and immediately started saving on cost, with no noticeable impact on users performing log re-hydration.
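This kind of change can also be automated with an S3 lifecycle rule instead of done by hand. A hedged sketch of such a rule (the prefix and 30-day window are assumptions, and GLACIER_IR is just one cheaper class that still allows fast retrieval):

```json
{
  "Rules": [
    {
      "ID": "archive-log-archive",
      "Filter": { "Prefix": "datadog-archive/" },
      "Status": "Enabled",
      "Transitions": [
        { "Days": 30, "StorageClass": "GLACIER_IR" }
      ]
    }
  ]
}
```

The right transition window and class depend on how far back your re-hydrations typically reach.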
For CloudWatch 📄,
This one has always seemed insanely priced to me. CloudWatch Logs are very expensive...
There's only one answer to this: don't log everything! Set a proper log level, and keep it at ERROR unless you're performing true DEBUG work.
We found hundreds of services logging at the INFO/DEBUG level, an insane cost increase that wasn't needed. We're working to move them all to ERROR, and only increasing verbosity on a case-by-case basis.
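In stdlib Python, the "default to ERROR" policy is one line of configuration; the service name below is hypothetical:

```python
# Sketch of the "default to ERROR" policy: INFO/DEBUG records are
# dropped at the source, so they never reach CloudWatch at all.
import logging

logging.basicConfig(level=logging.ERROR)
logger = logging.getLogger("checkout-service")  # hypothetical service name

logger.debug("cart contents: %s", ["sku-123"])  # suppressed
logger.info("request handled in 42ms")          # suppressed
logger.error("payment gateway timeout")         # emitted; pay only for real problems
```

When a debug session starts, the level can be lowered temporarily (ideally via config, not a redeploy) and raised again when you're done.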
In conclusion, managing cloud resources efficiently is not just a matter of cost, but also a matter of environmental responsibility! 🌲🕊️
By identifying and eliminating waste, we as a society can reduce our expenses and, in turn, our carbon footprint.
Tools like AWS Cost Explorer, along with good practices such as tagging resources, using resource groups, and standardizing compute workload strategies, can help us collectively achieve this goal.
Remember, every bit of optimization counts, not just for your budget, but for the planet as well. Let's strive to be frugal architects of our cloud infrastructure, making the most of every resource we use!
- Logan S. @d4rkd0s