Earlier this year, one of the largest web applications I've ever developed went live. As part of supporting the organisations using the application, I'd built an operations dashboard that lets me see all the stats and logs from the application in one place.
The application uses a private S3 bucket to store cached files. The S3 bucket has a lifecycle policy to delete files twenty hours after they are created. I wanted the dashboard to show the count of items in this bucket and the total size of the bucket to keep track of consumption.
This article explains how I used the AWS SDK for Go to extract these metrics from the CloudWatch API (Application Programming Interface).
Amazon CloudWatch metrics for Amazon S3 can help you understand and improve the performance of applications that use Amazon S3. There are several ways that you can use CloudWatch with Amazon S3.
Amazon CloudWatch Daily Storage Metrics for Buckets
Monitor bucket storage using CloudWatch, which collects and processes storage data from Amazon S3 into readable, daily metrics. These storage metrics for Amazon S3 are reported once per day and are provided to all customers at no additional cost.
The CloudWatch AWS/S3 namespace includes the following daily storage metrics for buckets.
BucketSizeBytes
The amount of data in bytes stored in a bucket in the STANDARD storage class, INTELLIGENT_TIERING storage class, Standard - Infrequent Access (STANDARD_IA) storage class, OneZone - Infrequent Access (ONEZONE_IA) storage class, Reduced Redundancy Storage (RRS) class, Deep Archive Storage (S3 Glacier Deep Archive) class, or Glacier (GLACIER) storage class.
This value is calculated by summing the size of all objects in the bucket (both current and non-current objects), including the size of all parts for all incomplete multipart uploads to the bucket.
Valid storage type filters: StandardStorage, IntelligentTieringFAStorage, IntelligentTieringIAStorage, IntelligentTieringAAStorage, IntelligentTieringDAAStorage, StandardIAStorage, StandardIASizeOverhead, StandardIAObjectOverhead, OneZoneIAStorage, OneZoneIASizeOverhead, ReducedRedundancyStorage, GlacierStorage, GlacierStagingStorage, GlacierObjectOverhead, GlacierS3ObjectOverhead, DeepArchiveStorage, DeepArchiveObjectOverhead, DeepArchiveS3ObjectOverhead and DeepArchiveStagingStorage (see the StorageType dimension). Units: Bytes. Valid statistics: Average.
NumberOfObjects
The total number of objects stored in a bucket for all storage classes except for the GLACIER storage class. This value is calculated by counting all objects in the bucket (both current and non-current objects) and the total number of parts for all incomplete multipart uploads to the bucket. Valid storage type filters: AllStorageTypes (see the StorageType dimension). Units: Count. Valid statistics: Average.
Using Go to Obtain S3 Bucket Metrics
The first step is to install the AWS Software Development Kit (SDK) for Go. This is done by running the following go get command at a terminal or command prompt.
go get github.com/aws/aws-sdk-go/...
Once the AWS SDK has been installed, you'll need to import the relevant packages into your program to interact with CloudWatch.
package main

import (
    "fmt"
    "log"
    "os"
    "strconv"
    "strings"
    "time"

    "github.com/aws/aws-sdk-go/aws"
    "github.com/aws/aws-sdk-go/aws/credentials"
    "github.com/aws/aws-sdk-go/aws/session"
    "github.com/aws/aws-sdk-go/service/cloudwatch"
)
Within the main part of your program, you need to create an AWS session using the NewSession function. In the example below, the newly created session is assigned to the awsSession variable. The session object is created by supplying the AWS region identifier and your AWS credentials for the Access Key and Secret Key. You can obtain these keys from your AWS account; in this example they are read from environment variables using os.Getenv. The NewSession function returns a pointer to the session object, and if there was an error, for example you specified an invalid region, an error value is returned.
You should, therefore, check the error variable and handle it appropriately for your use case. In this example, we log the error to the console as a fatal message. log.Fatal is equivalent to Print() followed by a call to os.Exit(1), so the program will terminate.
var err error
var awsSession *session.Session

func main() {
    awsSession, err = session.NewSession(&aws.Config{
        Region: aws.String("eu-west-2"),
        Credentials: credentials.NewStaticCredentials(
            os.Getenv("AccessKeyID"),
            os.Getenv("SecretAccessKey"),
            ""),
    })
    if err != nil {
        log.Fatal(err)
    }
}
Once you've got a pointer to a valid AWS session you can reuse that session to make multiple calls against the CloudWatch API.
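Each of the helper functions below does exactly that: it constructs a CloudWatch service client from the shared session with a single call.

svc := cloudwatch.New(awsSession)

Creating a client from an existing session is inexpensive, so building one inside each helper keeps the functions self-contained; you could just as easily create the client once in main and pass it into the helpers instead.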
Obtaining the S3 Bucket Size Metric
The getS3BucketSize function shown below takes one parameter: the name of the bucket we want the CloudWatch metrics for. The function returns three values. The first is the average data point, which contains the bucket size in bytes. The second is the date/time that CloudWatch logged the value. The third is any error encountered within the function.
In the example below, we're requesting the metric for S3 objects stored under the StandardStorage tier. If you want the same function to query different storage types, you'd need to make the storage type an input parameter to the function.
After getting the result from the GetMetricStatistics call, we iterate over result.Datapoints and capture the values to return.
func getS3BucketSize(bucketName string) (float64, time.Time, error) {
    var bucketSize float64
    var metricDateTime time.Time

    svc := cloudwatch.New(awsSession)

    result, err := svc.GetMetricStatistics(&cloudwatch.GetMetricStatisticsInput{
        MetricName: aws.String("BucketSizeBytes"),
        Namespace:  aws.String("AWS/S3"),
        StartTime:  aws.Time(time.Now().Add(-48 * time.Hour)),
        EndTime:    aws.Time(time.Now()),
        Period:     aws.Int64(3600),
        Statistics: []*string{aws.String("Average")},
        Dimensions: []*cloudwatch.Dimension{
            &cloudwatch.Dimension{
                Name:  aws.String("BucketName"),
                Value: aws.String(bucketName),
            },
            &cloudwatch.Dimension{
                Name:  aws.String("StorageType"),
                Value: aws.String("StandardStorage"),
            },
        },
    })
    if err != nil {
        return 0, time.Now(), err
    }

    // Capture the values from the last data point returned by CloudWatch.
    for _, dataPoint := range result.Datapoints {
        bucketSize = *dataPoint.Average
        metricDateTime = *dataPoint.Timestamp
    }

    return bucketSize, metricDateTime, err
}
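As a quick usage example, once main has created the session it can call the function and print the result. The bucket name my-cache-bucket below is a hypothetical placeholder rather than one from the original application:

size, loggedAt, err := getS3BucketSize("my-cache-bucket") // hypothetical bucket name
if err != nil {
    log.Fatal(err)
}
fmt.Printf("Bucket size: %.0f bytes (reported %s)\n", size, loggedAt.Format(time.RFC3339))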
Obtaining the S3 Bucket Count Metric
To get the count of files/objects stored in the S3 bucket, you'd need to change the MetricName from BucketSizeBytes to NumberOfObjects and change the storage type to AllStorageTypes, as shown in the snippet below.
MetricName: aws.String("NumberOfObjects"),
Namespace:  aws.String("AWS/S3"),
StartTime:  aws.Time(time.Now().Add(-48 * time.Hour)),
EndTime:    aws.Time(time.Now()),
Period:     aws.Int64(3600),
Statistics: []*string{aws.String("Average")},
Dimensions: []*cloudwatch.Dimension{
    &cloudwatch.Dimension{
        Name:  aws.String("BucketName"),
        Value: aws.String(bucketName),
    },
    &cloudwatch.Dimension{
        Name:  aws.String("StorageType"),
        Value: aws.String("AllStorageTypes"),
    },
},
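That snippet only shows the fields that change, so here's a sketch of how the complete function might look if you mirror getS3BucketSize; the getS3BucketObjectCount name is my own choice and isn't part of the original code.

func getS3BucketObjectCount(bucketName string) (float64, time.Time, error) {
    var objectCount float64
    var metricDateTime time.Time

    svc := cloudwatch.New(awsSession)

    result, err := svc.GetMetricStatistics(&cloudwatch.GetMetricStatisticsInput{
        MetricName: aws.String("NumberOfObjects"),
        Namespace:  aws.String("AWS/S3"),
        StartTime:  aws.Time(time.Now().Add(-48 * time.Hour)),
        EndTime:    aws.Time(time.Now()),
        Period:     aws.Int64(3600),
        Statistics: []*string{aws.String("Average")},
        Dimensions: []*cloudwatch.Dimension{
            &cloudwatch.Dimension{
                Name:  aws.String("BucketName"),
                Value: aws.String(bucketName),
            },
            &cloudwatch.Dimension{
                Name:  aws.String("StorageType"),
                Value: aws.String("AllStorageTypes"),
            },
        },
    })
    if err != nil {
        return 0, time.Now(), err
    }

    // As in getS3BucketSize, keep the values from the last data point returned.
    for _, dataPoint := range result.Datapoints {
        objectCount = *dataPoint.Average
        metricDateTime = *dataPoint.Timestamp
    }

    return objectCount, metricDateTime, nil
}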
Conclusion
I'm using these functions within a Linux command-line utility that is invoked daily from a cron job. The results from the last thirty days are stored in a database so that I can present the values on my dashboard as a sparkline chart. A sparkline is a small line chart, typically drawn without axes or coordinates, that shows the general shape of the variation and lets me see growth trends at a glance.