managing your aws resources with command line utilities or via configuration files is pretty commonplace. but if you want to manipulate your aws resources programmatically, there are significantly fewer options.
one of the best sdks for aws is python's boto3. in this post, we're going to get a quick overview of boto3 by writing a short script that creates an s3 website.
the flyover
in this walkthrough, we're going to:
- scaffold a basic script and install our dependencies, then write our script to:
- get our aws credentials from the aws cli config file
- instantiate our boto3 client
- create our s3 bucket, set all the permissions and properties to make it a website, and push a basic index page
at the end of it, we'll have a script that will not only create s3 websites, but also highlight the power and flexibility of boto3.
once we have this down pat, we will have built a foundation to start exploring the documentation, and we'll be well on our way to being able to command and control our aws resources from the comfort of our python scripts.
prerequisites
before we can do any of this, there are a few things we need to have:
- an aws account
- an access_key and secret_access_key located in our aws credentials file at ~/.aws/credentials
- python 3.8 or higher
a quick note about aws credentials
most documentation about using aws from the command line assumes that you have a default access and secret key in your credentials file. we're not going to do that.
instead, we will be using explicitly named profiles, without a default. this means our credentials file will look something like this:
[fruitbat]
aws_access_key_id = AKIAZJHIA3DT43AMLX9B
aws_secret_access_key = pGxpeFj2xKru2a/pOUgINp2Vpb3Cau0DvOhwWsNk
[cloverhitch]
aws_access_key_id = AYJASHHOA1BT29YKON3A
aws_secret_access_key = rHtffHn1fLir4b/qIMqjMw5Ctc1Dau0DrIkqWtSa
each one of those pairs of access and secret keys is a 'profile', with the name of the profile given in the square brackets. we will be referencing our aws credentials explicitly by that profile name.
doing this makes for a little more work, but the payoff is safety. it's very easy to rack up a lot of different aws credentials, especially if you do client work (i have 23!), and it is worthwhile to invest a bit of effort into making sure you're always running your scripts or aws-cli commands against the right account.
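as a quick aside, the aws cli selects a named profile with its --profile flag, so the same explicitness looks like this on the command line (using the fruitbat profile from our example credentials file):
aws s3 ls --profile fruitbat
our script will do the same kind of explicit selection, just in python.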
scaffolding our script
we're going to start by creating a basic script and adding some dependencies with pip.
first, let's create a directory for our script called, aptly enough, s3sitemaker, and build a virtual environment in it.
mkdir s3sitemaker
cd s3sitemaker
virtualenv -p python3.10 venv
source venv/bin/activate
obviously, you should adjust your virtualenv to the version of python you want, so long as it's 3.8 or higher.
next, in our virtual environment, we'll use pip to install some dependencies.
pip install botocore
pip install boto3
pip install configparser
technically, we do not need to install botocore here. botocore is the library that boto3 is built on. we can access botocore calls directly if we wish, and as we get more familiar with boto we may find that advantageous. however, for almost all of our needs, the boto3 sdk will be sufficient. botocore is shown here as a separate install only to highlight the fact that it exists as a separate, but related, package.
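for the curious, here's a minimal sketch of what creating an s3 client looks like if we go through botocore directly (the credentials shown are placeholders; we'll be reading real ones from our credentials file shortly):
import botocore.session

# botocore's lower-level path to the same s3 client boto3 gives us
session = botocore.session.get_session()
s3 = session.create_client(
    's3',
    aws_access_key_id='AKIA...',    # placeholder
    aws_secret_access_key='...',    # placeholder
    region_name='us-east-1',
)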
we also install configparser here. we will be using this to read and parse our aws credentials file. (strictly speaking, configparser is part of python 3's standard library, so this pip install is optional.)
once we have our dependencies in place, we will sketch out our script's basic functionality.
in the text editor of your choice, create a file called s3sitemaker.py and add this content:
# -*- coding: utf-8 -*-

import sys
import boto3


def get_aws_credentials():
    """
    Gets the aws credentials from ~/.aws/credentials
    for the aws profile name
    """
    pass


def make_boto3_client():
    """
    Creates boto3 clients for s3
    """
    pass


def create_bucket():
    """
    Creates the s3 bucket and sets it for web hosting
    """
    pass


def main():
    get_aws_credentials()
    make_boto3_client()
    create_bucket()


if __name__ == '__main__':
    """
    Entry point
    """
    main()
there's not a whole lot going on here, just some empty functions, but it's a good overview of the steps we'll be taking to create our s3 bucket website.
first, we're going to parse our aws credentials file and get our access and secret keys in get_aws_credentials. we will then use those keys to create a boto3 'client' to interact with s3 in make_boto3_client.
the boto3 sdk is designed around 'clients'. there is one client (or more) per aws service. so, for instance, there are clients for s3, route 53, cloudfront and so on. the official documentation publishes a full list. there are, not surprisingly, a lot.
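if you're curious which service names boto3 will accept, you can ask it directly. this snippet just prints the full list of client names available in your installed version:
import boto3

# every service name we can pass to boto3.client()
print(boto3.Session().get_available_services())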
once we have our client, we will call create_bucket to use the methods in our boto3 client to:
- create our bucket
- update its permissions
- set it as a website
- upload a sample index.html file
let's work through all those in order.
getting our aws credentials
in order to access our aws account, we need a key pair, an access key and a secret key, provided by aws. if you do not have these, you can refer to the aws documentation on the topic.
our credentials are stored in the configuration file at ~/.aws/credentials, with our key pairs in individual profiles. if you don't have a credentials file, again you can refer to the aws documentation page about credentials.
once we have our credentials set up, we can write our get_aws_credentials function, which reads our credentials config file and parses out the access and secret key for use later.
let's take a look at the full script with the function. note that if you have read and parsed config files in python before, none of this will be particularly new.
# -*- coding: utf-8 -*-

import sys
import boto3

# configuration

# --> NEW
# the name of the aws profile
aws_profile = "fruitbat"

# --> NEW
# aws credentials read in from ~/.aws/credentials
aws_access_key_id = None
aws_secret_access_key = None
aws_region = "us-east-1"


# --> NEW
def get_aws_credentials():
    """
    Gets the aws credentials from ~/.aws/credentials
    for the aws profile name
    """
    import configparser
    import os

    # the aws profile we configured
    global aws_profile

    # the global variables where we store the aws credentials
    global aws_access_key_id
    global aws_secret_access_key

    # parse the aws credentials file
    path = os.environ['HOME'] + '/.aws/credentials'
    config = configparser.ConfigParser()
    config.read(path)

    # read in the aws_access_key_id and the aws_secret_access_key
    # if the profile does not exist, error and exit
    if aws_profile in config.sections():
        aws_access_key_id = config[aws_profile]['aws_access_key_id']
        aws_secret_access_key = config[aws_profile]['aws_secret_access_key']
    else:
        print("Cannot find profile '{}' in {}".format(aws_profile, path))
        sys.exit()

    # if we don't have both the access and secret key, error and exit
    if aws_access_key_id is None or aws_secret_access_key is None:
        print("AWS config values not set in '{}' in {}".format(aws_profile, path))
        sys.exit()


def make_boto3_client():
    """
    Creates boto3 clients for s3
    """
    pass


def create_bucket():
    """
    Creates the s3 bucket and sets it for web hosting
    """
    pass


def main():
    get_aws_credentials()
    make_boto3_client()
    create_bucket()


if __name__ == '__main__':
    main()
the first thing to note here is some global variables:
- aws_profile: the name of the profile in our credentials file that we will be using
- aws_access_key_id: where we will assign our access key after it has been parsed from the credentials file
- aws_secret_access_key: where we will assign our secret key after it has been parsed from the credentials file
- aws_region: our aws region. here it is hardcoded to us-east-1, but we can, obviously, change that to whatever we want
the get_aws_credentials function itself is fairly straightforward: we open and parse the credentials file and then assign the access and secret keys to our variables.
we start by importing the os module, so we can build the path to our credentials file, and configparser, so we can parse it.
next is the heart of the function: parsing the config file. we find the home directory of the user using os.environ['HOME'] and use that to create the full path to credentials. once we have the path, we instantiate a configparser object and call its read method to load and parse our credentials file.
after that, all that's left is some error checking and assigning. we confirm that our configured profile name is actually in our credentials file by checking the sections of our config object. if it is, we assign our access and secret keys. if not, we error and exit.
finally, we confirm that both our access and secret keys are set. if we have values, we can proceed to the next step.
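a quick way to convince ourselves the parsing works is a throwaway smoke test like this (purely illustrative; it's not part of the final script):
# call the parser, then peek at the globals it populated
get_aws_credentials()
print("profile:", aws_profile)
print("key id starts with:", aws_access_key_id[:4])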
make the boto3 client
once we have the aws credentials, we can use them to make a boto3 client for s3.
here's the function:
# -*- coding: utf-8 -*-

import sys
import boto3

# configuration

# the name of the aws profile
aws_profile = "fruitbat"

# aws credentials read in from ~/.aws/credentials
aws_access_key_id = None
aws_secret_access_key = None
aws_region = "us-east-1"

# --> NEW
# boto3 clients
boto3_client_s3 = None


def get_aws_credentials():
    """
    Gets the aws credentials from ~/.aws/credentials
    for the aws profile name
    """
    pass


# --> NEW
def make_boto3_client():
    """
    Creates boto3 clients for s3
    """
    # the client object
    global boto3_client_s3

    # create the s3 client
    try:
        boto3_client_s3 = boto3.client(
            's3',
            aws_access_key_id=aws_access_key_id,
            aws_secret_access_key=aws_secret_access_key,
            region_name=aws_region,
        )
    except Exception as e:
        print(e)
        sys.exit()


def create_bucket():
    """
    Creates the s3 bucket and sets it for web hosting
    """
    pass


def main():
    get_aws_credentials()
    make_boto3_client()
    create_bucket()


if __name__ == '__main__':
    main()
the first thing to note is the global boto3_client_s3. this is where we store the client.
in make_boto3_client we're basically only doing one thing: calling the client method on boto3 to get an s3 client. most of the arguments the client method takes are the components of our aws credentials. the exception is the first argument, where we pass the string 's3'. boto3 returns a different client for each service aws offers. so, for instance, if we were working with route 53, we would set this to 'route53' instead and receive a client with the methods needed to make changes to route 53.
create our bucket website
now that we have our boto3 client for s3, we can move on to actually making our bucket website.
here's the code:
# -*- coding: utf-8 -*-

import sys
import boto3

# configuration

# the name of the aws profile
aws_profile = "fruitbat"

# --> NEW
# the name of the bucket we will create
bucket_name = "ourfancybucketsite"

# --> NEW
# the local file to upload as the site's index page
index_file = "/tmp/index.html"

# aws credentials read in from ~/.aws/credentials
aws_access_key_id = None
aws_secret_access_key = None
aws_region = "us-east-1"

# boto3 clients
boto3_client_s3 = None

# --> NEW
# the aws data on the s3 site
s3_site_endpoint = None
s3_bucket_name = None


def get_aws_credentials():
    """
    Gets the aws credentials from ~/.aws/credentials
    for the aws profile name
    """
    pass


def make_boto3_client():
    """
    Creates boto3 clients for s3
    """
    pass


# --> NEW
def create_bucket():
    """
    Creates the s3 bucket and sets it for web hosting
    """
    global s3_site_endpoint
    global s3_bucket_name

    # create the s3 bucket
    try:
        response = boto3_client_s3.create_bucket(
            ACL="public-read",
            Bucket=bucket_name,
        )
    except boto3_client_s3.exceptions.BucketAlreadyExists as err:
        print("Bucket {} already exists.".format(err.response['Error']['BucketName']))
        sys.exit()

    # set the policy
    policy = """{
        "Version": "2008-10-17",
        "Id": "PolicyForPublicWebsiteContent",
        "Statement": [
            {
                "Sid": "PublicReadGetObject",
                "Effect": "Allow",
                "Principal": {
                    "AWS": "*"
                },
                "Action": "s3:GetObject",
                "Resource": "arn:aws:s3:::%s/*"
            }
        ]
    }""" % (bucket_name)

    # apply the policy to the bucket
    try:
        boto3_client_s3.put_bucket_policy(
            Bucket=bucket_name,
            Policy=policy
        )
    except Exception as e:
        print(e)
        sys.exit()

    # make bucket a website
    try:
        boto3_client_s3.put_bucket_website(
            Bucket=bucket_name,
            WebsiteConfiguration={
                'ErrorDocument': {
                    'Key': 'index.html'
                },
                'IndexDocument': {
                    'Suffix': 'index.html'
                }
            }
        )
    except Exception as e:
        print(e)
        sys.exit()

    # get bucket website to confirm it's there
    try:
        boto3_client_s3.get_bucket_website(Bucket=bucket_name)
    except Exception as e:
        print(e)
        sys.exit()

    # build bucket name
    s3_bucket_name = bucket_name + ".s3.amazonaws.com"

    # upload the index file
    boto3_client_s3.upload_file(
        index_file,
        bucket_name,
        "index.html",
        ExtraArgs={'ContentType': "text/html", 'ACL': "public-read"}
    )

    # build the endpoint for the s3 site
    s3_site_endpoint = bucket_name + ".s3-website-" + aws_region + ".amazonaws.com"

    # print out the url of the site
    print(s3_site_endpoint)


def main():
    get_aws_credentials()
    make_boto3_client()
    create_bucket()


if __name__ == '__main__':
    main()
there's a lot going on here, so we'll take it step-by-step.
first, let's look at the new variables we've made:
bucket_name = "ourfancybucketsite"
index_file = "/tmp/index.html"
# the aws data on the s3 site
s3_site_endpoint = None
s3_bucket_name = None
the first is bucket_name. as the name implies, this is the name of the bucket we'll be creating. bucket names need to be globally unique and, as we'll see shortly, our create_bucket function will error and exit if we've not chosen a unique name.
next we have index_file, which is the path to the file on our system that we want to upload to our bucket as an index page.
the last two variables will be set by our create_bucket function: s3_site_endpoint will hold the url of the site when we're done, and s3_bucket_name will be the amazon name of the bucket. if we decide to expand the script later on, these are the values we will be using. for now, however, once these are set, we're done.
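one assumption the script makes is that /tmp/index.html already exists. if you don't have a page handy, a throwaway placeholder like this will do (any html file works):
# write a minimal placeholder page to upload later
with open("/tmp/index.html", "w") as f:
    f.write("<html><body><h1>hello from our s3 site!</h1></body></html>")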
creating the bucket
our create_bucket function does a lot of things, but the first task that needs to be accomplished is creating the bucket itself. this is the boto3 call that does that:
response = boto3_client_s3.create_bucket(
    ACL="public-read",
    Bucket=bucket_name,
)
here we call create_bucket on our boto3 s3 client, passing two arguments. the ACL argument is our 'access control list'. since we want our website to be readable by the world, we pass 'public-read' here.
ACLs for buckets are normally messy XML documents (which you can read about if you're interested). they're difficult to work with. fortunately, aws provides a suite of pre-written ACL documents that we can reference by name. these are called 'canned ACLs'. we're using one of those here: 'public-read'.
our second argument, Bucket, is the name of our bucket.
boto3's s3 create_bucket method has a lot of features that allow you to adjust things like access and ownership. they are all covered in the boto3 s3 create_bucket documentation. one point worth knowing: in regions other than us-east-1, create_bucket also requires a CreateBucketConfiguration argument with a LocationConstraint set to the region.
of course, this call can fail. s3 bucket names need to be globally unique, so we catch the BucketAlreadyExists exception and exit on error.
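as an aside, canned ACLs aren't limited to bucket creation. if we ever need to (re)apply one to an existing bucket, the s3 client has a put_bucket_acl method for exactly that:
# apply the 'public-read' canned acl to an existing bucket
boto3_client_s3.put_bucket_acl(ACL="public-read", Bucket=bucket_name)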
set the bucket policy
one of the first things we do when creating an s3 bucket in the web console is set the bucket policy in the permissions tab.
doing this in boto3 is fairly straightforward. first, we create our policy JSON:
policy = """{
    "Version": "2008-10-17",
    "Id": "PolicyForPublicWebsiteContent",
    "Statement": [
        {
            "Sid": "PublicReadGetObject",
            "Effect": "Allow",
            "Principal": {
                "AWS": "*"
            },
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::%s/*"
        }
    ]
}""" % (bucket_name)
and then we apply it to our bucket with the put_bucket_policy method.
boto3_client_s3.put_bucket_policy(
    Bucket=bucket_name,
    Policy=policy
)
note that this method takes two arguments: the name of the bucket we are applying the policy to, and the policy itself.
if we have malformed our JSON, put_bucket_policy will raise an exception. we catch and handle that.
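one way to avoid malformed json entirely is to build the policy as a python dict and let the json module serialize it, rather than interpolating into a string. a sketch of that approach:
import json

# build the policy as a dict, then serialize; json.dumps can't emit invalid json
policy = json.dumps({
    "Version": "2008-10-17",
    "Id": "PolicyForPublicWebsiteContent",
    "Statement": [
        {
            "Sid": "PublicReadGetObject",
            "Effect": "Allow",
            "Principal": {"AWS": "*"},
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::{}/*".format(bucket_name),
        }
    ],
})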
make our bucket a website
this is what we've been waiting for.
turning your regular s3 bucket into a web-serving bucket is very straightforward: just a call to put_bucket_website:
boto3_client_s3.put_bucket_website(
    Bucket=bucket_name,
    WebsiteConfiguration={
        'ErrorDocument': {
            'Key': 'index.html'
        },
        'IndexDocument': {
            'Suffix': 'index.html'
        }
    }
)
this example is the simplest possible implementation. we call put_bucket_website and, inside WebsiteConfiguration, set just two things: the name of our index page and the page to show in the event of an error. don't be confused by the naming convention here: the error document may ask for a Key and the index document for a Suffix, but really they both want a file name.
of course, put_bucket_website can take a lot of other arguments, depending on what you want it to do. there is a full list in the boto3 documentation for s3.put_bucket_website.
validate our website exists
shortly, we're going to be uploading an index page to our new s3 website, but first we're going to do a fast check to confirm the website exists. we do that with:
boto3_client_s3.get_bucket_website(Bucket=bucket_name)
the get_bucket_website method just returns the website configuration data, which is outlined in the boto3 documentation. we can choose to inspect that in detail if we wish, or simply test to see that no errors are raised.
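if we do want to look at what comes back, the response is just a dict. for example:
# fetch the website configuration and pull out the index document name
config = boto3_client_s3.get_bucket_website(Bucket=bucket_name)
print(config['IndexDocument']['Suffix'])   # 'index.html'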
upload an index file
lastly, we're going to put an index file in our bucket so there's actually something there to browse.
let's investigate how we do that:
# upload the index file
boto3_client_s3.upload_file(index_file, bucket_name, "index.html", ExtraArgs={'ContentType': "text/html", 'ACL': "public-read"})
this method takes four arguments. the first, index_file, is the path to our index file that we set in our script. the second, bucket_name, is of course the name of the bucket we've been using throughout this script.
the third argument is the string literal "index.html". this is the name we want the file to have in the bucket.
the third argument is the string literal "index.html". this is the name we want the file to have in the bucket. this allows us to change file names as we upload them.
lastly, we have ExtraArgs. this is a dict containing, well, 'extra' arguments. here, we're explicitly setting our file's ContentType to text/html. we definitely do not want amazon's http server to assume that our index file is some sort of binary, after all. we're also explicitly setting the access control on the file to public-read. if you remember back when we created our bucket, we used the 'canned' public-read ACL to set the permissions bucket-wide. we're doing the same thing here, except for one file only.
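incidentally, upload_file isn't the only way to get a file into a bucket. a roughly equivalent sketch using the lower-level put_object method looks like this:
# the same upload expressed with put_object instead of upload_file
with open(index_file, "rb") as f:
    boto3_client_s3.put_object(
        Bucket=bucket_name,
        Key="index.html",
        Body=f,
        ContentType="text/html",
        ACL="public-read",
    )
upload_file is generally the better choice for larger files, since it handles multipart uploads for us.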
finishing off
now that we have our s3 site created and some minimal content installed, we can browse it. let's build the url for the site and output it.
# build the endpoint for the s3 site
s3_site_endpoint = bucket_name + ".s3-website-" + aws_region + ".amazonaws.com"
# print out the url of the site
print(s3_site_endpoint)
if we navigate to the url in s3_site_endpoint, we should see our index file. for our example bucket, that's http://ourfancybucketsite.s3-website-us-east-1.amazonaws.com.
conclusion
creating an s3 website is the least of the things you can do with boto3; virtually every feature of every aws service can be created, destroyed and manipulated with it. i encourage you to look at the extensive list of clients, read the first-rate documentation, and experiment.