managing your aws resources with command line utilities or via configuration files is pretty commonplace. but if you want to manipulate your aws resources programmatically, there are significantly fewer options.
one of the best sdks for aws is python's boto3. in this post, we're going to get a quick overview of boto3 by writing a short script that creates an s3 website.
the flyover
in this walkthrough, we're going to:
- scaffold a basic script and install our dependencies, then write our script to:
- get our aws credentials from the aws cli config file
- instantiate our boto3 client
- create our s3 bucket, set all the permissions and properties to make it a website, and push a basic index page
at the end of it, we'll have a script that will not only create s3 websites, but also highlight the power and flexibility of boto3.
once we have this down pat, we will have built a foundation to start exploring the documentation, and we'll be well on our way to being able to command and control our aws resources from the comfort of our python scripts.
prerequisites
before we can do any of this, there are a few things we need to have:
- an aws account
- an access_key and secret_access_key located in our aws credentials file at ~/.aws/credentials
- python 3.8 or higher
a quick note about aws credentials
most documentation about using aws from the command line assumes that you have a default access and secret key in your credentials file. we're not going to do that.
instead, we will be using explicitly named profiles, without a default. this means our credentials file will look something like this:
[fruitbat]
aws_access_key_id = AKIAZJHIA3DT43AMLX9B
aws_secret_access_key = pGxpeFj2xKru2a/pOUgINp2Vpb3Cau0DvOhwWsNk
[cloverhitch]
aws_access_key_id = AYJASHHOA1BT29YKON3A
aws_secret_access_key = rHtffHn1fLir4b/qIMqjMw5Ctc1Dau0DrIkqWtSa
each one of those pairs of access and secret keys is a 'profile', with the name of the profile given in the square brackets. we will be referencing our aws credentials explicitly by that profile name.
doing this makes for a little more work, but the payoff is safety. it's very easy to rack up a lot of different aws credentials, especially if you do client work (i have 23!), and it is worthwhile to invest a bit of effort into making sure you're always running your scripts or aws-cli commands against the right account.
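as a quick aside, the aws cli selects a named profile with its --profile flag, so the same explicitness looks like this on the command line (using the fruitbat profile from our example credentials file):
aws s3 ls --profile fruitbat
our script will do the same kind of explicit selection, just in python.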
scaffolding our script
we're going to start by creating a basic script and adding some dependencies with pip.
first, let's create a directory for our script called, aptly enough, s3sitemaker, and build a virtual environment in it.
mkdir s3sitemaker
cd s3sitemaker
virtualenv -p python3.10 venv
source venv/bin/activate
obviously, you should adjust your virtualenv to the version of python you want, so long as it's 3.8 or higher.
next, in our virtual environment, we'll use pip to install some dependencies.
pip install botocore
pip install boto3
pip install configparser
technically, we do not need to install botocore here. botocore is the library that boto3 is built on. we can access botocore calls directly if we wish, and as we get more familiar with boto we may find that advantageous. however, for almost all of our needs, the boto3 sdk will be sufficient. botocore is shown here as a separate install only to highlight the fact that it exists as a separate, but related, package.
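for the curious, here's a minimal sketch of what creating an s3 client looks like if we go through botocore directly (the credentials shown are placeholders; we'll be reading real ones from our credentials file shortly):
import botocore.session

# botocore's lower-level path to the same s3 client boto3 gives us
session = botocore.session.get_session()
s3 = session.create_client(
    's3',
    aws_access_key_id='AKIA...',    # placeholder
    aws_secret_access_key='...',    # placeholder
    region_name='us-east-1',
)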
we also install configparser here. we will be using this to read and parse our aws credentials file. (strictly speaking, configparser is part of python 3's standard library, so this pip install is optional.)
once we have our dependencies in place, we will sketch out our script's basic functionality.
in the text editor of your choice, create a file called s3sitemaker.py and add this content:
# -*- coding: utf-8 -*-

import sys
import boto3


def get_aws_credentials():
    """
    Gets the aws credentials from ~/.aws/credentials
    for the aws profile name
    """
    pass


def make_boto3_client():
    """
    Creates boto3 clients for s3
    """
    pass


def create_bucket():
    """
    Creates the s3 bucket and sets it for web hosting
    """
    pass


def main():
    get_aws_credentials()
    make_boto3_client()
    create_bucket()


if __name__ == '__main__':
    """
    Entry point
    """
    main()
there's not a whole lot going on here, just some empty functions, but it's a good overview of the steps we'll be taking to create our s3 bucket website.
first, we're going to parse our aws credentials file and get our access and secret keys in get_aws_credentials. we will then use those keys to create a boto3 'client' to interact with s3 in make_boto3_client.
the boto3 sdk is designed around 'clients'. there is one client (or more) per aws service. so, for instance, there are clients for s3, route 53, cloudfront and so on. the official documentation publishes a full list. there are, not surprisingly, a lot.
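if you're curious which service names boto3 will accept, you can ask it directly. this snippet just prints the full list of client names available in your installed version:
import boto3

# every service name we can pass to boto3.client()
print(boto3.Session().get_available_services())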
once we have our client, we will call create_bucket to use the methods in our boto3 client to:
- create our bucket
- update its permissions
- set it as a website
- upload a sample index.html file
let's work through all those in order.
getting our aws credentials
in order to access our aws account, we need a key pair, an access key and a secret key, provided by aws. if you do not have these, you can refer to the aws documentation on the topic.
our credentials are stored in the configuration file at ~/.aws/credentials, with our key pairs in individual profiles. if you don't have a credentials file, again you can refer to the aws documentation page about credentials.
once we have our credentials set up, we can write our get_aws_credentials function, which reads our credentials config file and parses out the access and secret key for use later.
let's take a look at the full script with the function. note that if you have read and parsed config files in python before, none of this will be particularly new.
# -*- coding: utf-8 -*-

import sys
import boto3

# configuration

# --> NEW
# the name of the aws profile
aws_profile = "fruitbat"

# --> NEW
# aws credentials read in from ~/.aws/credentials
aws_access_key_id = None
aws_secret_access_key = None
aws_region = "us-east-1"


# --> NEW
def get_aws_credentials():
    """
    Gets the aws credentials from ~/.aws/credentials
    for the aws profile name
    """
    import configparser
    import os

    # the aws profile we configured
    global aws_profile

    # the global variables where we store the aws credentials
    global aws_access_key_id
    global aws_secret_access_key

    # parse the aws credentials file
    path = os.environ['HOME'] + '/.aws/credentials'
    config = configparser.ConfigParser()
    config.read(path)

    # read in the aws_access_key_id and the aws_secret_access_key
    # if the profile does not exist, error and exit
    if aws_profile in config.sections():
        aws_access_key_id = config[aws_profile]['aws_access_key_id']
        aws_secret_access_key = config[aws_profile]['aws_secret_access_key']
    else:
        print("Cannot find profile '{}' in {}".format(aws_profile, path))
        sys.exit()

    # if we don't have both the access and secret key, error and exit
    if aws_access_key_id is None or aws_secret_access_key is None:
        print("AWS config values not set in '{}' in {}".format(aws_profile, path))
        sys.exit()


def make_boto3_client():
    """
    Creates boto3 clients for s3
    """
    pass


def create_bucket():
    """
    Creates the s3 bucket and sets it for web hosting
    """
    pass


def main():
    get_aws_credentials()
    make_boto3_client()
    create_bucket()


if __name__ == '__main__':
    main()
the first thing to note here is some global variables:
- aws_profile: the name of the profile in our credentials file that we will be using
- aws_access_key_id: where we will assign our access key after it has been parsed from the credentials file
- aws_secret_access_key: where we will assign our secret key after it has been parsed from the credentials file
- aws_region: our aws region. here it is hardcoded to us-east-1, but we can, obviously, change that to whatever we want
the get_aws_credentials function itself is fairly straightforward: we open and parse the credentials file and then assign the access and secret keys to our variables.
we start by importing the os module, so we can build the path to our credentials file, and configparser, so we can parse it.
next is the heart of the function: parsing the config file. we find the home directory of the user using os.environ['HOME'] and use that to create the full path to credentials. once we have the path, we instantiate a configparser object and call its read method to load and parse our credentials file.
after that, all that's left is some error checking and assigning. we confirm that our configured profile name is actually in our credentials file by checking the sections of our config object. if it is, we assign our access and secret keys. if not, we error and exit.
finally, we confirm that both our access and secret keys are set. if we have values, we can proceed to the next step.
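a quick way to convince ourselves the parsing works is a throwaway smoke test like this (purely illustrative; it's not part of the final script):
# call the parser, then peek at the globals it populated
get_aws_credentials()
print("profile:", aws_profile)
print("key id starts with:", aws_access_key_id[:4])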
make the boto3 client
once we have the aws credentials, we can use them to make a boto3 client for s3.
here's the function:
# -*- coding: utf-8 -*-

import sys
import boto3

# configuration

# the name of the aws profile
aws_profile = "fruitbat"

# aws credentials read in from ~/.aws/credentials
aws_access_key_id = None
aws_secret_access_key = None
aws_region = "us-east-1"

# --> NEW
# boto3 clients
boto3_client_s3 = None


def get_aws_credentials():
    """
    Gets the aws credentials from ~/.aws/credentials
    for the aws profile name
    """
    pass


# --> NEW
def make_boto3_client():
    """
    Creates boto3 clients for s3
    """
    # the client object
    global boto3_client_s3

    # create the s3 client
    try:
        boto3_client_s3 = boto3.client(
            's3',
            aws_access_key_id=aws_access_key_id,
            aws_secret_access_key=aws_secret_access_key,
            region_name=aws_region,
        )
    except Exception as e:
        print(e)
        sys.exit()


def create_bucket():
    """
    Creates the s3 bucket and sets it for web hosting
    """
    pass


def main():
    get_aws_credentials()
    make_boto3_client()
    create_bucket()


if __name__ == '__main__':
    main()
the first thing to note is the global boto3_client_s3. this is where we store the client.
in make_boto3_client we're basically only doing one thing: calling the client method on boto3 to get an s3 client. most of the arguments the client method takes are the components of our aws credentials. the exception is the first argument, where we pass the string 's3'. boto3 returns a different client for each service aws offers. so, for instance, if we were working with route 53, we would set this to 'route53' instead and receive a client with the methods needed to make changes to route 53.
create our bucket website
now that we have our boto3 client for s3, we can move on to actually making our bucket website.
here's the code:
# -*- coding: utf-8 -*-

import sys
import boto3

# configuration

# the name of the aws profile
aws_profile = "fruitbat"

# --> NEW
# the name of the bucket we will create
bucket_name = "ourfancybucketsite"

# --> NEW
# the local file to upload as the site's index page
index_file = "/tmp/index.html"

# aws credentials read in from ~/.aws/credentials
aws_access_key_id = None
aws_secret_access_key = None
aws_region = "us-east-1"

# boto3 clients
boto3_client_s3 = None

# --> NEW
# the aws data on the s3 site
s3_site_endpoint = None
s3_bucket_name = None


def get_aws_credentials():
    """
    Gets the aws credentials from ~/.aws/credentials
    for the aws profile name
    """
    pass


def make_boto3_client():
    """
    Creates boto3 clients for s3
    """
    pass


# --> NEW
def create_bucket():
    """
    Creates the s3 bucket and sets it for web hosting
    """
    global s3_site_endpoint
    global s3_bucket_name

    # create the s3 bucket
    try:
        response = boto3_client_s3.create_bucket(
            ACL="public-read",
            Bucket=bucket_name,
        )
    except boto3_client_s3.exceptions.BucketAlreadyExists as err:
        print("Bucket {} already exists.".format(err.response['Error']['BucketName']))
        sys.exit()

    # set the policy
    policy = """{
        "Version": "2008-10-17",
        "Id": "PolicyForPublicWebsiteContent",
        "Statement": [
            {
                "Sid": "PublicReadGetObject",
                "Effect": "Allow",
                "Principal": {
                    "AWS": "*"
                },
                "Action": "s3:GetObject",
                "Resource": "arn:aws:s3:::%s/*"
            }
        ]
    }""" % (bucket_name)

    # apply the policy to the bucket
    try:
        boto3_client_s3.put_bucket_policy(
            Bucket=bucket_name,
            Policy=policy
        )
    except Exception as e:
        print(e)
        sys.exit()

    # make bucket a website
    try:
        boto3_client_s3.put_bucket_website(
            Bucket=bucket_name,
            WebsiteConfiguration={
                'ErrorDocument': {
                    'Key': 'index.html'
                },
                'IndexDocument': {
                    'Suffix': 'index.html'
                }
            }
        )
    except Exception as e:
        print(e)
        sys.exit()

    # get bucket website to confirm it's there
    try:
        boto3_client_s3.get_bucket_website(Bucket=bucket_name)
    except Exception as e:
        print(e)
        sys.exit()

    # build bucket name
    s3_bucket_name = bucket_name + ".s3.amazonaws.com"

    # upload the index file
    boto3_client_s3.upload_file(
        index_file,
        bucket_name,
        "index.html",
        ExtraArgs={'ContentType': "text/html", 'ACL': "public-read"}
    )

    # build the endpoint for the s3 site
    s3_site_endpoint = bucket_name + ".s3-website-" + aws_region + ".amazonaws.com"

    # print out the url of the site
    print(s3_site_endpoint)


def main():
    get_aws_credentials()
    make_boto3_client()
    create_bucket()


if __name__ == '__main__':
    main()
there's a lot going on here, so we'll take it step-by-step.
first, let's look at the new variables we've made:
bucket_name = "ourfancybucketsite"
index_file = "/tmp/index.html"
# the aws data on the s3 site
s3_site_endpoint = None
s3_bucket_name = None
the first is bucket_name. as the name implies, this is the name of the bucket we'll be creating. bucket names need to be globally unique and, as we'll see shortly, our create_bucket function will error and exit if we've not chosen a unique name.
next we have index_file, which is the path to the file on our system that we want to upload to our bucket as an index page.
the last two variables will be set by our create_bucket function: s3_site_endpoint will hold the url of the site when we're done, and s3_bucket_name will be the amazon name of the bucket. if we decide to expand the script later on, these are the values we will be using. for now, however, once these are set, we're done.
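one assumption the script makes is that /tmp/index.html already exists. if you don't have a page handy, a throwaway placeholder like this will do (any html file works):
# write a minimal placeholder page to upload later
with open("/tmp/index.html", "w") as f:
    f.write("<html><body><h1>hello from our s3 site!</h1></body></html>")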
creating the bucket
our create_bucket function does a lot of things, but the first task that needs to be accomplished is creating the bucket itself. this is the boto3 call that does that:
response = boto3_client_s3.create_bucket(
    ACL="public-read",
    Bucket=bucket_name,
)
here we call create_bucket on our boto3 s3 client, passing two arguments. the ACL argument is our 'access control list'. since we want our website to be readable by the world, we pass 'public-read' here.
ACLs for buckets are normally messy XML documents (which you can read about if you're interested). they're difficult to work with. fortunately, aws provides a suite of pre-written ACL documents that we can reference by name. these are called 'canned ACLs'. we're using one of those here: 'public-read'.
our second argument, Bucket, is the name of our bucket.
boto3's s3 create_bucket method has a lot of features that allow you to adjust things like access and ownership. they are all covered in the boto3 s3 create_bucket documentation. one point worth knowing: in regions other than us-east-1, create_bucket also requires a CreateBucketConfiguration argument with a LocationConstraint set to the region.
of course, this call can fail. s3 bucket names need to be globally unique, so we catch the BucketAlreadyExists exception and exit on error.
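as an aside, canned ACLs aren't limited to bucket creation. if we ever need to (re)apply one to an existing bucket, the s3 client has a put_bucket_acl method for exactly that:
# apply the 'public-read' canned acl to an existing bucket
boto3_client_s3.put_bucket_acl(ACL="public-read", Bucket=bucket_name)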
set the bucket policy
one of the first things we do when creating an s3 bucket in the web console is set the bucket policy in the permissions tab.
doing this in boto3 is fairly straightforward. first, we create our policy JSON:
policy = """{
    "Version": "2008-10-17",
    "Id": "PolicyForPublicWebsiteContent",
    "Statement": [
        {
            "Sid": "PublicReadGetObject",
            "Effect": "Allow",
            "Principal": {
                "AWS": "*"
            },
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::%s/*"
        }
    ]
}""" % (bucket_name)
and then we apply it to our bucket with the put_bucket_policy method.
boto3_client_s3.put_bucket_policy(
    Bucket=bucket_name,
    Policy=policy
)
note that this method takes two arguments: the name of the bucket we are applying the policy to, and the policy itself.
if we have malformed our JSON, put_bucket_policy will raise an exception. we catch and handle that.
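one way to avoid malformed json entirely is to build the policy as a python dict and let the json module serialize it, rather than interpolating into a string. a sketch of that approach:
import json

# build the policy as a dict, then serialize; json.dumps can't emit invalid json
policy = json.dumps({
    "Version": "2008-10-17",
    "Id": "PolicyForPublicWebsiteContent",
    "Statement": [
        {
            "Sid": "PublicReadGetObject",
            "Effect": "Allow",
            "Principal": {"AWS": "*"},
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::{}/*".format(bucket_name),
        }
    ],
})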
make our bucket a website
this is what we've been waiting for.
turning your regular s3 bucket into a web-serving bucket is very straightforward: just a call to put_bucket_website:
boto3_client_s3.put_bucket_website(
    Bucket=bucket_name,
    WebsiteConfiguration={
        'ErrorDocument': {
            'Key': 'index.html'
        },
        'IndexDocument': {
            'Suffix': 'index.html'
        }
    }
)
this example is the simplest possible implementation. we call put_bucket_website and, inside WebsiteConfiguration, set just two things: the name of our index page and the page to show in the event of an error. don't be confused by the naming convention here: the error document may ask for a Key and the index document for a Suffix, but really they both want a file name.
of course, put_bucket_website can take a lot of other arguments, depending on what you want it to do. there is a full list in the boto3 documentation for s3.put_bucket_website.
validate our website exists
shortly, we're going to be uploading an index page to our new s3 website, but first we're going to do a fast check to confirm the website exists. we do that with:
boto3_client_s3.get_bucket_website(Bucket=bucket_name)
the get_bucket_website method just returns the website configuration data, which is outlined in the boto3 documentation. we can choose to inspect that in detail if we wish, or simply test to see that no errors are raised.
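if we do want to look at what comes back, the response is just a dict. for example:
# fetch the website configuration and pull out the index document name
config = boto3_client_s3.get_bucket_website(Bucket=bucket_name)
print(config['IndexDocument']['Suffix'])   # 'index.html'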
upload an index file
lastly, we're going to put an index file in our bucket so there's actually something there to browse.
let's investigate how we do that:
# upload the index file
boto3_client_s3.upload_file(index_file, bucket_name, "index.html", ExtraArgs={'ContentType': "text/html", 'ACL': "public-read"})
this method takes four arguments. the first, index_file, is the path to our index file that we set in our script. the second, bucket_name, is of course the name of the bucket we've been using throughout this script.
the third argument is the string literal "index.html". this is the name we want the file to have in the bucket.
the third argument is the string literal "index.html". this is the name we want the file to have in the bucket. this allows us to change file names as we upload them.
lastly, we have ExtraArgs. this is a dict containing, well, 'extra' arguments. here, we're explicitly setting our file's ContentType to text/html. we definitely do not want amazon's http server to assume that our index file is some sort of binary, after all. we're also explicitly setting the access control on the file to public-read. if you remember back when we created our bucket, we used the 'canned' public-read ACL to set the permissions bucket-wide. we're doing the same thing here, except for one file only.
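incidentally, upload_file isn't the only way to get a file into a bucket. a roughly equivalent sketch using the lower-level put_object method looks like this:
# the same upload expressed with put_object instead of upload_file
with open(index_file, "rb") as f:
    boto3_client_s3.put_object(
        Bucket=bucket_name,
        Key="index.html",
        Body=f,
        ContentType="text/html",
        ACL="public-read",
    )
upload_file is generally the better choice for larger files, since it handles multipart uploads for us.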
finishing off
now that we have our s3 site created and some minimal content installed, we can browse it. let's build the url for the site and output it.
# build the endpoint for the s3 site
s3_site_endpoint = bucket_name + ".s3-website-" + aws_region + ".amazonaws.com"
# print out the url of the site
print(s3_site_endpoint)
if we navigate to the url in s3_site_endpoint, we should see our index file. for our example bucket, that's http://ourfancybucketsite.s3-website-us-east-1.amazonaws.com.
conclusion
creating an s3 website is the least of the things you can do with boto3; virtually every feature of every aws service can be created, destroyed and manipulated with it. i encourage you to look at the extensive list of clients, read the first-rate documentation, and experiment.