Managing your Cloud Costs with CloudOps Automation Part 1: Identifying Your Resources with Tags

#cloudcosts #intelligentautomation #tags

Moving systems to the cloud makes a lot of sense operationally: the experts take care of the infrastructure, and we get to build what we need to make our company successful.

But this comes with a substantial downside: your monthly cloud bill. Cloud providers have made it insanely easy to spin up new servers and features, but without careful auditing it is *very* easy to waste money on unused or improperly sized resources that are left running. In this series of posts, we’ll use unSkript to uncover unused assets in your cloud account, and then either alert the team of their existence or remove them automatically.

This of course leads to a catch-22: how do you know what is safe to remove, and what will bring down production? To better identify our cloud resources, we need a good tagging strategy.

So, to kick off our cost savings series of posts, we’ll begin with a discussion on tagging your cloud resources.

Why Tag your resources?

Tags are key:value pairs that describe your cloud resources. With AWS, you can choose any name for a key, giving you the ultimate in customization. AWS has a number of best practices for tagging your resources.

Tagging allows us to easily identify our cloud components and quickly determine what the components do. AWS recommends erring on the side of “overtagging” rather than “undertagging.” In many ways, this is the CloudOps analogy to commenting code. With a “well-tagged” set of resources, auditing each instance becomes easier.

Perhaps most importantly, it becomes easy to find resources that are no longer needed, so they can be turned off to keep your cloud bill in check.
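To make the key:value idea concrete, here is a minimal sketch of how tags appear in EC2 API responses (a list of `Key`/`Value` dictionaries) and how they can be flattened into a plain Python dict. The helper name `tags_to_dict` and the sample tag values are illustrative assumptions, not part of unSkript or boto3.

```python
from typing import Dict, List

def tags_to_dict(tags: List[Dict[str, str]]) -> Dict[str, str]:
    """Flatten EC2-style tags ([{'Key': ..., 'Value': ...}]) into {key: value}."""
    return {tag["Key"]: tag["Value"] for tag in tags}

# Sample data in the shape EC2 uses for an instance's 'Tags' field (values are made up).
sample_tags = [
    {"Key": "Environment", "Value": "staging"},
    {"Key": "Department", "Value": "platform"},
]

print(tags_to_dict(sample_tags))
```

A flat dict like this is often easier to search or compare than the raw list.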

Tag Strategy

There is no definitive set of tags that should be used, but here are some that are often discussed:

*Environments:* If your different environments are all in the same cloud, labeling each object with an Environment key and values like development, staging, and production helps you understand where in the deployment process your instances lie.
*Department:* Tagging the department that owns or controls the resource has a number of important benefits:
1. The team can track their cloud installs.
2. If there is a problem with the instance, the correct team can be easily identified and notified.
*Cost center:* Identify teams with higher spends, and more easily break down budgeting for cloud billing.
*Expiration:* When building out a new system, you may deploy a number of instances for testing. By setting a sunset date, you remove the worry of accidentally leaving a cloud instance live; the instances will all shut down within a few days or weeks.
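As a rough illustration of the Expiration tag, the sketch below checks whether an instance's sunset date has passed. The tag key `Expiration` and the ISO `YYYY-MM-DD` date format are assumptions chosen for this example; use whatever key and format your team agrees on.

```python
from datetime import date
from typing import Dict, List, Optional

def is_expired(tags: List[Dict[str, str]], today: Optional[date] = None,
               key: str = "Expiration") -> bool:
    """Return True if the (hypothetical) expiration tag holds a date before today."""
    today = today or date.today()
    for tag in tags:
        if tag["Key"] == key:
            # date.fromisoformat raises ValueError on a malformed date,
            # which surfaces bad tag values instead of silently ignoring them.
            return date.fromisoformat(tag["Value"]) < today
    # No expiration tag: treat the instance as not expired.
    return False
```

Passing `today` explicitly keeps the check easy to test; in normal use it defaults to the current date.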

Building your tagging strategy.

If you have not yet built out your strategy, you probably have dozens (or hundreds) of untagged cloud objects. Since there is no way for one person or team to know what each instance is, or even whether it is still in use, we need to add tags.

So the first step is to identify each object, and contact the owners to add tags to the instances.

Of course, it would be best to give your team an automated way to add tags to their instances, so that they do not lose a lot of bandwidth complying with the new requirements (with the added benefit that easy onboarding will help your new tagging policy roll out faster and with less friction). Here’s how we have done this at unSkript (using unSkript xRunBooks and Actions, of course).

Step 1: Find resources with zero tags

unSkript has a built-in Action called “AWS Get Untagged Resources.” This Action returns all EC2 instances that have no tags attached to them. We can search for this Action and drag it into our workflow. (In our Free Sandbox, create a new xRunBook, and then search for the Action.)

Connect your AWS Credentials (learn how to create a connection to AWS), and add your Region (you can either change the value in the configurations to the right, OR change the parameters in the top menu, which refers to the variable Region). When run, this Action gives a list of instanceIds that have no tags. We’d like a bit more information, so we’ll edit the Action's code to look like this:

from typing import List
# aws_get_paginator is an unSkript helper that pages through the full result set
from unskript.connectors.aws import aws_get_paginator

def aws_get_untagged_resources(handle, region: str) -> List:
    print("region", region)
    ec2Client = handle.client('ec2', region_name=region)
    res = aws_get_paginator(ec2Client, "describe_instances", "Reservations")
    result = []
    for reservation in res:
        for instance in reservation['Instances']:
            # the 'Tags' key is absent (or empty) when an instance has no tags
            if not instance.get('Tags'):
                result.append({
                    "instance": instance['InstanceId'],
                    "type": instance['InstanceType'],
                    "imageId": instance['ImageId'],
                    "launched": instance['LaunchTime'],
                })
    return result

We make these changes to give a bit more information about each instance that is untagged. For example:

[{'imageId': 'ami-094125af156557ca2',
  'instance': 'i-049b54f373769f51b',
  'launched': datetime.datetime(2022, 12, 14, 17, 48, 49, tzinfo=tzlocal()),
  'type': 'm1.small'},
 ...]

Now we can reach out to the rest of the team to see if anyone knows about this m1.small instance launched on 12/14/22 from a specific AMI.
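When reaching out to the team, it can help to turn that list into something readable. The following is a minimal sketch, assuming the list-of-dicts shape returned by the Action above; the function name `format_untagged_report` is hypothetical and the sample record reuses the values from the example output.

```python
from datetime import datetime
from typing import Dict, List

def format_untagged_report(instances: List[Dict]) -> str:
    """Build a plain-text summary of untagged instances, oldest launch first."""
    lines = []
    for item in sorted(instances, key=lambda i: i["launched"]):
        launched = item["launched"].strftime("%Y-%m-%d")
        lines.append(
            f"{item['instance']} ({item['type']}, AMI {item['imageId']}) launched {launched}"
        )
    return "\n".join(lines)

sample = [{
    "instance": "i-049b54f373769f51b",
    "type": "m1.small",
    "imageId": "ami-094125af156557ca2",
    "launched": datetime(2022, 12, 14, 17, 48, 49),
}]
print(format_untagged_report(sample))
```

Sorting by launch time puts the oldest (and often most forgotten) instances at the top of the message.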

Step 2: Add tags to found instances

We now have a list of all of the instanceIds that have no tags. Next, we can use a new Action that attaches a tag to an EC2 instance to begin the process of bringing the instance into tagging compliance.

This Action has three inputs in addition to the Region: _instanceId_, _Tag\_Key_, and _Tag\_Value_.

from typing import Dict

def aws_tag_resources(handle, instanceId: str, tag_key: str, tag_value: str, region: str) -> Dict:
    ec2Client = handle.client('ec2', region_name=region)
    result = {}
    try:
        # create_tags adds the tag, or overwrites the value if the key already exists
        response = ec2Client.create_tags(
            Resources=[instanceId],
            Tags=[{'Key': tag_key, 'Value': tag_value}]
        )
        result = response
    except Exception as error:
        # store the error as a string so the result stays printable
        result["error"] = str(error)
    return result

Running this Action adds the key:value tag to the EC2 instance.
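Tagging one instance with one key at a time gets tedious when a team owns many instances. boto3's `create_tags` accepts several resource IDs and several tags in one call, so a possible extension looks like the sketch below. The function name `tag_instances`, the `batch_size` default, and passing the EC2 client in directly are assumptions for illustration, not unSkript's implementation.

```python
from typing import Dict, List

def tag_instances(ec2_client, instance_ids: List[str], tags: Dict[str, str],
                  batch_size: int = 50) -> int:
    """Apply every key:value pair in `tags` to each instance; return the number of API calls made."""
    tag_list = [{"Key": k, "Value": v} for k, v in tags.items()]
    if not tag_list:
        # Nothing to apply, so skip the API entirely.
        return 0
    calls = 0
    # Send the IDs in small batches; batch_size is an arbitrary choice for this sketch.
    for start in range(0, len(instance_ids), batch_size):
        batch = instance_ids[start:start + batch_size]
        ec2_client.create_tags(Resources=batch, Tags=tag_list)
        calls += 1
    return calls
```

Inside an unSkript Action, the client would come from `handle.client('ec2', region_name=region)`, as in the Action above.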

Step 3: Compliance check

Finally, we’ll build one last Action that checks all tag keys against the required list of keys, and returns those instances that are missing a required tag:

from typing import List
# aws_get_paginator is an unSkript helper that pages through the full result set
from unskript.connectors.aws import aws_get_paginator

def aws_get_resources_out_of_compliance(handle, region: str, requiredTags: list) -> List:
    ec2Client = handle.client('ec2', region_name=region)
    res = aws_get_paginator(ec2Client, "describe_instances", "Reservations")
    result = []
    for reservation in res:
        for instance in reservation['Instances']:
            # 'Tags' is absent for untagged instances, so default to an empty list
            tagged_instance = instance.get('Tags', [])
            # get all the keys for the instance
            keyList = [kv["Key"] for kv in tagged_instance]
            # see which required tags are not represented in the key list
            missing = [required for required in requiredTags if required not in keyList]
            # untagged instances are always reported, as are those missing a required key
            if missing or not tagged_instance:
                result.append({
                    "instance": instance['InstanceId'],
                    "type": instance['InstanceType'],
                    "imageId": instance['ImageId'],
                    "launched": instance['LaunchTime'],
                    "tags": tagged_instance,
                })
    return result

This Action reads in a list of required keys, and if an instance does not have all of them, it is returned in an out-of-compliance list.
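It can also be useful to report which keys are missing, not only which instances fail. The sketch below is one way to do that on a single record from the out-of-compliance list; `missing_required_tags` is a hypothetical helper name, and the sample tags and required keys are made up.

```python
from typing import Dict, List

def missing_required_tags(instance: Dict, required_tags: List[str]) -> List[str]:
    """Return the required tag keys that are absent from one instance record."""
    present = {tag["Key"] for tag in instance.get("tags", [])}
    return [key for key in required_tags if key not in present]

sample_instance = {
    "instance": "i-049b54f373769f51b",
    "tags": [{"Key": "Environment", "Value": "staging"}],
}
print(missing_required_tags(sample_instance, ["Environment", "Department", "CostCenter"]))
```

The owning team then knows exactly which tags to add to reach compliance.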

Conclusion

Tagging cloud instances makes troubleshooting faster. It also helps you identify cloud objects that are no longer in use, which helps you reduce your cloud bill. For these reasons, it makes sense to create a tagging requirement for all instances.

In this post, we have created a series of Actions that will help you simplify the transition process of bringing all of your existing cloud objects into tagging compliance.

Feel free to try these Actions in our Free Sandbox, or using our Docker install. The Actions used in this post will soon be available on GitHub in the xRunBook Add Mandatory Tags to EC2. Please reach out if you’d like a copy earlier!