s3 is handy and useful, but it would be a lot more handy and useful if we had a shell script for pushing things to buckets; a script that didn’t require a special cli tool. something portable that we could add to existing scripts or throw in a cron job or distribute to our friends. a script that we could tell “put this file in this bucket” and it would just do that. that would be a good thing. let’s build that.
note: for the impatient, this is provided as two functions in a gist.
let’s start with curl
the actual uploading part of our script is going to use curl. curl is great stuff; i’m a big fan of curl. here’s a template for a curl call to put a file into an s3 bucket:
curl -s -X PUT -T "<path to your file>" \
-H "Host: <your bucket name>.s3.amazonaws.com" \
-H "Date: <the date>" \
-H "Content-Type: <the file’s mime type>" \
-H "Authorization: AWS <your aws access key id>:<the calculated signature>" \
https://<your bucket name>.s3-<your bucket’s aws location>.amazonaws.com/<the name of the file to create in the bucket>
as far as curl calls go, this is fairly straightforward: four headers, a url and a source file. we’re using the http PUT method here with the -T argument to ‘transfer’ a file.
let’s look at each of those configuration values:
- <path to your file>: this is the path to the file on your local disk; the one you want to upload to s3.
- <your bucket name>: the name of your bucket
- <your bucket’s aws location>: the location of your s3 bucket, ie us-west-2 or similar.
- <the date>: the current date in the format that date -R outputs. this is the RFC 5322 format. it looks something like “Mon, 22 Apr 2024 09:32:43 -0600”
- <the file’s mime type>: the mime type of the file to upload, ie text/html.
- <your aws access key id>: this is the public key part of your amazon credentials. look in your ~/.aws/credentials file for it. if you don’t have credentials yet, you’ll have to go get them.
- <the name of the file to create in the bucket>: this is the name of the file you want to be created in the bucket.
- <the calculated signature>: a signature we calculate to prove to amazon that we actually have permission to do this.
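to make that template concrete, here’s a sketch of what a filled-in call might look like. the bucket, region, file and access key are all made-up example values, and the signature stays a placeholder until we calculate it in the next section:

curl -s -X PUT -T "./testfile.txt" \
-H "Host: mybucket.s3.amazonaws.com" \
-H "Date: Mon, 22 Apr 2024 09:32:43 -0600" \
-H "Content-Type: text/html" \
-H "Authorization: AWS AKIAIOSFODNN7EXAMPLE:<the calculated signature>" \
https://mybucket.s3-us-west-2.amazonaws.com/testfile.txt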
we should already have almost all of this data. the exception, of course, is the ‘calculated signature’. that, as the name implies, we need to calculate.
calculating the signature
the official documentation on how to build the calculated signature is terrible and convoluted. which is a shame, since the process is actually moderately elegant.
the purpose of the signature is to prove two things: that we actually have access to the bucket, and that the signature is unique to this curl request and has not been reused. this is done by constructing a string that contains data about the file we’re uploading and the date, and then creating a hash of that string and signing it with our aws secret key.
let’s look at a sample signature calculation:
s3_resource=""
content_type=""
date_value=""
aws_secret_access_key=""
string_to_sign="PUT\n\n${content_type}\n${date_value}\n${s3_resource}"
signature=`echo -en ${string_to_sign} | openssl sha1 -hmac ${aws_secret_access_key} -binary | base64`
here’s a quick once-over of the configuration values being used:
- <the resource of the file to be created>: the file we are creating in s3 in the format of /<bucket name>/<file name> (note the leading slash).
- <the mime type of the file>: the mime type of the file. it must be the same as the Content-Type header in the curl call.
- <the date>: this is the exact same date string in the same format as we used in the Date header of our curl call. it looks something like “Mon, 22 Apr 2024 09:32:43 -0600”
- <your aws secret key>: this is the secret part of your amazon credentials.
once we have all those configuration values, we’re going to assemble them into a string formatted with some unix line endings. note the empty line after PUT; that’s deliberate (it’s where an optional Content-MD5 value would go). the result looks something like:
PUT

text/html
Mon, 22 Apr 2024 09:32:43 -0600
/mybucket/testfile.txt
we then take that string and use the openssl command to hash it using sha1 and sign it using our aws secret key. the hash output is set to binary and is then converted to text using base64.
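as a concrete sketch, here’s that calculation run with the sample values from above. the secret key is a made-up example, so the base64 output you get with your real key will obviously differ:

s3_resource="/mybucket/testfile.txt"
content_type="text/html"
date_value="Mon, 22 Apr 2024 09:32:43 -0600"
# example key only; substitute your own secret key
aws_secret_access_key="wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"

string_to_sign="PUT\n\n${content_type}\n${date_value}\n${s3_resource}"
echo -en "${string_to_sign}" | openssl sha1 -hmac "${aws_secret_access_key}" -binary | base64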
when aws gets our curl request, it’s going to reconstruct this hash using the values of the Content-Type and Date headers we sent and the name of the file we’re uploading. since we sent aws our public key, it can look up our private key to confirm the signature. if everything matches, aws knows that we own the bucket we’re uploading to and that the signature is unique to this request and isn’t something we re-used or just copied off the internet.
once we have our calculated signature, we can send it as the value of our Authorization header in our curl call.
putting it all together
once we know how to calculate the signature and write the curl, we can apply this anywhere we want to.
for convenience, this has been written up as a gist.
#!/bin/bash
####
# Configuration
#
# S3 bucket and location
bucket="<your bucket name>"
location="<your aws region, ie. us-west-2>"
# AWS credentials
aws_access_key_id="<your aws access key id>"
aws_secret_access_key="<your aws secret key id>"
# Do not edit below.
####
# Calculated values
# file parts
file_path=$1
file_name=`basename "$file_path"`
# Content-Type header for curl
file_mime=`file --mime-type "${file_path}"`
content_type=`echo "${file_mime}" | awk -F ": " '{print $2}'`
# Date in format for header and signature
date_value=`date -R`
# Destination file path on s3 bucket
s3_resource="/${bucket}/`basename \"$file_path\"`"
#### FUNCTION BEGIN
# Build AWS signature for api call
# GLOBALS:
# -
# ARGUMENTS:
# s3_resource
# content_type
# date_value
# aws_secret_access_key
# OUTPUTS:
# null
# RETURN:
# String. The signature
### FUNCTION END
function build_sig() {
s3_resource=$1
content_type=$2
date_value=$3
aws_secret_access_key=$4
string_to_sign="PUT\n\n${content_type}\n${date_value}\n${s3_resource}"
signature=`echo -en "${string_to_sign}" | openssl sha1 -hmac "${aws_secret_access_key}" -binary | base64`
echo "$signature"
}
#### FUNCTION BEGIN
# PUT file to S3 bucket
# GLOBALS:
# -
# ARGUMENTS:
# file_path
# bucket
# location
# date_value
# content_type
# aws_access_key_id
# signature
# OUTPUTS:
# null
# RETURN:
# void
### FUNCTION END
function put_s3() {
file_path=$1
bucket=$2
location=$3
date_value=$4
content_type=$5
aws_access_key_id=$6
signature=$7
file_name=`basename "$file_path"`
curl -s -X PUT -T "${file_path}" \
-H "Host: ${bucket}.s3.amazonaws.com" \
-H "Date: ${date_value}" \
-H "Content-Type: ${content_type}" \
-H "Authorization: AWS ${aws_access_key_id}:${signature}" \
"https://${bucket}.s3-${location}.amazonaws.com/${file_name}"
}
# entry point
# Build signature for this API call
signature=`build_sig "$s3_resource" "$content_type" "$date_value" "$aws_secret_access_key"`
# PUT the file to the s3 bucket
put_s3 "$file_path" "$bucket" "$location" "$date_value" "$content_type" "$aws_access_key_id" "$signature"
exit 0
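to use it, save the script somewhere, make it executable, and pass it the file you want to upload. the filename s3put.sh below is just an example, and the cron line is a sketch of the ‘throw it in a cron job’ idea from the top of the post; adjust the path and schedule to taste.

# one-off upload from the shell
chmod +x s3put.sh
./s3put.sh ./testfile.txt

# or, in a crontab, push a nightly backup to the bucket at 02:00
0 2 * * * /usr/local/bin/s3put.sh /var/backups/nightly.tar.gz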
🔎 this post was originally written in the grant horwood technical blog