Do you need to send batches of emails, synchronised to go at a set time? Are you unsure whether to develop your own campaign management tools, or buy off-the-shelf? Have you been through our Getting Started Guide, and are inspired to send your first campaign, but are feeling a bit nervous about writing your own code?
A customer of ours recently had this same need - sending out email batches of a few million emails each morning. They looked at some great fully-featured campaign management tools from SparkPost partners such as Iterable, Ongage, and Cordial that cover this need, and lots more besides. When you have many different campaign types, complex ‘customer journey’ campaigns, and integrated WYSIWYG editors - they're a good option.
If, however, you're looking for something simple, there is another way - and you're in the right place! SparkPost’s Python library and our built-in scheduled sending feature make it easy to put something together.
We’ll put ourselves in the shoes of our friendly fictional company, Avocado Industries, and follow them through setting up a campaign. This article takes you through various features of SparkPost’s Python client library, and links to the final code here.
So what do I need?
You’re sending out a newsletter to your subscribers. You’ve created a nice looking template and uploaded it to SparkPost. You have your recipient list at hand, or can export it easily enough from your database. You want SparkPost to mail-merge the recipient personalization details in, and get your awesome send out.
These needs translate into the following design goals for the code we’re going to write:
- Specify everything about your send using parameters and a small text file. You don’t want to change the code for each campaign.
- Leverage the SparkPost stored template and personalization features, without doing any programming yourself.
- Use local flat files for the recipient lists. The same format used by SparkPost’s stored recipient list uploads is a good fit.
- Gather recipients from your list into API-call batches for efficiency - with no upper limits to the overall size of your send.
- Support timezones for the scheduled send, and also support ‘just send it now’.
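These goals all boil down to one core API call. Here’s a minimal sketch of the shape of that call with the SparkPost Python library — the template and campaign names are placeholders, and the scheduleBatch helper is hypothetical (in real use you’d create sp = SparkPost(apiKey) and call it once per batch):

```python
# Transmission options that stay the same for every batch in the send
txOpts = {
    'template': 'avocado-goodness',             # placeholder: a stored template in your account
    'campaign': 'avocado-saladcopter',
    'start_time': '2017-05-08T19:10:00+01:00',  # ISO 8601 with a timezone offset
}

def scheduleBatch(sp, recipients, txOpts):
    # sp is a sparkpost.SparkPost() instance; one API call schedules the whole batch
    return sp.transmissions.send(recipients=recipients, **txOpts)
```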
Data guacamole
The SparkPost recipient-list format looks like this, with all the fields populated. Here we see Jerome’s details for the Avocado Industries loyalty scheme. We can see he’s a gold-card member, lives in Washington State, and likes several different avocado varieties.
email,name,return_path,metadata,substitution_data,tags
jerome.russell@example.com,Jerome Russell,bounce@avocado-industries.com,"{""custID"": 60525717}","{""memberType"": ""gold"", ""state"": ""WA""}","[""hass"", ""pinkerton"", ""gwen"", ""lamb hass"", ""fuerte"", ""bacon""]"
Everything apart from the email address is optional, so it would be nice to have the tool also accept just a plain old list of email addresses. It would also be nice if the tool is happy if we omit the header line. That’s easily done.
Taco me to the start
The tool is written for python3. The SparkPost library installs via pip, and we’ll need git to obtain the tool. You can check whether you already have these tools, using the following commands.
$ python3 -V
Python 3.5.1
$ pip3 -V
pip 9.0.1 from /usr/local/lib/python3.5/site-packages (python 3.5)
$ git --version
git version 2.7.4
If you already have them, continue to “Add SparkPost Python Library sauce” below. Otherwise here is a simple install sequence for Amazon EC2 Linux:
sudo su -
yum update -y
yum install -y python35
yum install -y wget
wget https://bootstrap.pypa.io/get-pip.py
python3 get-pip.py
yum install -y git
If you are using another platform, check out installation instructions for your platform here.
Add SparkPost Python Library sauce
We use pip3 to install, as follows.
$ sudo /usr/local/bin/pip3 install sparkpost
Collecting sparkpost
Using cached sparkpost-1.3.5-py2.py3-none-any.whl
Collecting requests>=2.5.1 (from sparkpost)
Using cached requests-2.13.0-py2.py3-none-any.whl
Installing collected packages: requests, sparkpost
Successfully installed requests-2.13.0 sparkpost-1.3.5
Get the sparkySched code from GitHub using:
$ git clone https://github.com/tuck1s/sparkySched.git
Cloning into 'sparkySched'...
remote: Counting objects: 55, done.
remote: Compressing objects: 100% (3/3), done.
remote: Total 55 (delta 0), reused 0 (delta 0), pack-reused 52
Unpacking objects: 100% (55/55), done.
Checking connectivity... done.
$ cd sparkySched
Gotta be .ini to win it
We now set up some attributes, such as your API key, campaign, and certain substitution data, in a text file, as they will be the same each time you send. An example is provided in the project, called sparkpost.ini.example. Rename this to sparkpost.ini, and replace <YOUR API KEY> with a key you’ve created in your own SparkPost account.
[SparkPost]
Authorization = <YOUR API KEY>
#Campaign setup
Campaign = avocado-saladcopter
GlobalSub = {"subject": "Fresh avocado delivered to your door in 30 minutes by our flying saladcopter"}
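The GlobalSub line carries a JSON object as plain text. A small sketch of how a .ini file like this can be read and the JSON unpacked — using an inline string here instead of a real file, and hypothetical variable names:

```python
import configparser
import json

iniText = """
[SparkPost]
Authorization = MY_API_KEY
Campaign = avocado-saladcopter
GlobalSub = {"subject": "Fresh avocado delivered to your door"}
"""

config = configparser.ConfigParser()
config.read_string(iniText)
cfg = config['SparkPost']

globalSub = json.loads(cfg.get('GlobalSub', '{}'))   # default to an empty object if absent
print(globalSub['subject'])
```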
And, Send!
There’s a sample file of 1000 safe test recipients included in the project that we can send to. Change the template name below from avocado-goodness to one you have in your account, and set the sending time to suit you:
$ ./sparkySched.py recips_1k_sub_n_tags.csv avocado-goodness 2017-05-08T19:10:00+01:00
Opened connection to https://api.sparkpost.com
Injecting to SparkPost:
To 1000 recips: template "avocado-goodness" start_time 2017-05-08T19:10:00+01:00: OK - in 1.62 seconds
$
If all is well, you should see the “OK” line, and your mailing is sent. That’s all you need to do. Happy sending!
Code salsa
In this section, we take a deeper look inside the code. You can skip this if you just want to use the tool instead of changing it. Here’s how we call the SparkPost API to send messages, using the SparkPost Python library:
# Inject the messages into SparkPost for a batch of recipients, using the specified transmission parameters
def sendToRecips(sp, recipBatch, sendObj):
    print('To', str(len(recipBatch)).rjust(5, ' '), 'recips: template "' + sendObj['template'] + '" start_time ' + sendObj['start_time'] + ': ', end='', flush=True)
    # Compose in additional API-call parameters
    sendObj.update({
        'recipients': recipBatch,
        'track_opens': True,
        'track_clicks': True,
        'use_draft_template': False
    })
    startT = time.time()
    try:
        res = sp.transmissions.send(**sendObj)
        endT = time.time()
        if res['total_accepted_recipients'] != len(recipBatch):
            print(res)
        else:
            print('OK - in', round(endT - startT, 3), 'seconds')
    except SparkPostAPIException as err:
        print('error code', err.status, ':', err.errors)
        exit(1)
After some helpful on-screen output about what we’re trying to send, the function composes the recipients with the other passed-in ingredients and mixes in some sensible defaults, using sendObj.update().
The SparkPost library call is wrapped in a try/except block, as it can return errors at the application level (such as an incorrect API key) or at the transport level (such as your Internet connection being down). This is generally good practice with any code that communicates with a remote service, and follows the examples packaged with our library.
We use startT, endT, and the time() function to measure how long the API call actually takes. While not strictly necessary, it’s interesting to see how performance varies with batch size, routing distance from client to server, and so on.
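If you adapt the timing code, note that time.monotonic() is a slightly safer stopwatch than time.time(), because it can never jump backwards if the system clock is adjusted mid-call. A minimal sketch of the same pattern:

```python
import time

startT = time.monotonic()
time.sleep(0.05)                      # stand-in for the real API call
elapsed = time.monotonic() - startT
print('OK - in', round(elapsed, 3), 'seconds')
```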
We will now craft the code to read parameters from the .ini file and use them in the API sends. Let’s read and check the mandatory parameters:
# Get parameters from .ini file
configFile = 'sparkpost.ini'
config = configparser.ConfigParser()
config.read_file(open(configFile))
cfg = config['SparkPost']
apiKey = cfg.get('Authorization', '')       # API key is mandatory
if not apiKey:
    print('Error: missing Authorization line in ' + configFile)
    exit(1)
baseUri = 'https://' + cfg.get('Host', 'api.sparkpost.com')     # optional, default to public service
The Python standard-library module configparser does the heavy lifting. You’ve got to have an API key, so we exit if it’s unset. baseUri defaults to the sparkpost.com API endpoint if it’s unset. The other parameters from the .ini file are read in the same way, and are described in the project README file.
There are other ways to set things up in Python, such as using environment variables. My preference is for .ini files, because the file is right there, staring at you. It’s easy to store, communicate, change and check, right there in your project.
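That said, an environment-variable fallback costs only a couple of lines if you want both. A sketch — SPARKPOST_API_KEY is the variable name the SparkPost Python library itself recognises, while the getApiKey helper is hypothetical:

```python
import configparser
import os

config = configparser.ConfigParser()
config.read_string('[SparkPost]\n')     # imagine an .ini file missing its Authorization line
cfg = config['SparkPost']

def getApiKey(cfg):
    # Prefer the .ini file setting, fall back to the environment
    return cfg.get('Authorization', '') or os.environ.get('SPARKPOST_API_KEY', '')
```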
Chickens go in, pies come out ...
Let’s look at how to read that .csv format recipient list. Python provides a nice library, csv. All the reading of double-quoted material that .csv files need in order to carry JSON objects like "{""custID"": 60525717}" is taken care of for us.
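To see that in action, here is Jerome’s line from above pushed through the csv module — the doubled quotes come back out as ordinary JSON, ready for json.loads():

```python
import csv
import io
import json

line = ('jerome.russell@example.com,Jerome Russell,bounce@avocado-industries.com,'
        '"{""custID"": 60525717}","{""memberType"": ""gold"", ""state"": ""WA""}","[""hass"", ""pinkerton""]"')
row = next(csv.reader(io.StringIO(line)))
metadata = json.loads(row[3])     # the doubled quotes are unescaped for us
print(metadata['custID'])         # 60525717
```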
We could use csv to read the whole recipient list into a Python array object - but that’s not a great idea if we have squillions of addresses in our list. The client will be perfectly fast enough for our purposes, and we’ll use less client memory, if we read in just enough to give us a nice-sized batch to cook each time around.
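The batching idea can be pictured as a small generator — a sketch of the approach, not the tool’s own code:

```python
def batches(rows, batchSize):
    # Yield lists of up to batchSize rows, never holding the whole file in memory
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) >= batchSize:
            yield batch
            batch = []
    if batch:                     # final, possibly short batch
        yield batch

sizes = [len(b) for b in batches(range(7), 3)]
print(sizes)                      # [3, 3, 1]
```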
We’ll also handle line 1 of the file specially, to meet our ‘go easy on the optional file header’ requirement. Recall that a well-formed header should look like this:
email,name,return_path,metadata,substitution_data,tags
If it’s got the single word ‘email’ somewhere on line 1, let’s assume it really is a header, and we’ll take our field layouts from that line. The tool will be happy if you omit optional fields, or have them in a different order on the line. The only one you absolutely need is the email field.
recipBatch = []
f = csv.reader(fh_recipList)
for r in f:
    if f.line_num == 1:                         # Check if header row present
        if 'email' in r:                        # we've got an email header-row field - continue
            hdr = r
            continue
        elif '@' in r[0] and len(r) == 1:       # Also accept headerless format with just email addresses
            hdr = ['email']                     # line 1 contains data - so continue processing
        else:
            print('Invalid .csv file header - must contain "email" field')
            exit(1)
We then check if it’s really a headerless file with just a bunch of email addresses in it, by checking for a single entry with an @ sign.
In the main loop for i,h in enumerate(hdr) we use some nice Python language features to conform the data to the JSON object that SparkPost is expecting. The name field needs to be put inside the address.name JSON attribute. Return_path is added, if present. Metadata, substitution_data, and tags all come in to us as JSON-formatted strings, so we unpack them using json.loads().
    # Parse values from the line of the file into a row object
    row = {}
    for i, h in enumerate(hdr):
        if r[i]:                                    # Only parse non-empty fields from this line
            if h == 'email':
                row['address'] = {h: r[i]}          # begin the address
            elif h == 'name':
                row['address'].update(name = r[i])  # add into the existing address structure
            elif h == 'return_path':
                row[h] = r[i]                       # simple string field
            elif h == 'metadata' or h == 'substitution_data' or h == 'tags':
                row[h] = json.loads(r[i])           # parse these fields as JSON text into dict objects
            else:
                print('Unexpected .csv file field name found: ', h)
                exit(1)
    recipBatch.append(row)
    if len(recipBatch) >= batchSize:
        sendToRecips(sp, recipBatch, txOpts)
        recipBatch = []                             # Empty out, ready for next batch

# Handle the final batch remaining, if any
if len(recipBatch) > 0:
    sendToRecips(sp, recipBatch, txOpts)
All that’s left to do is to chew through the list, sending each time we gather a full-sized batch. We send any final batch at the end, and we’re done.
Command-line garnish - a pinch of thyme time
The last part we need is some command-line argument parsing. The recipient list, the template ID, and the sending date/time are the things you might want to vary each time the tool is run. Python exposes your arguments via sys.argv in much the same way as other languages.
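If you extend the tool, argparse is the standard-library route to friendlier argument handling — a sketch, not the tool’s current code, with illustrative argument names:

```python
import argparse

parser = argparse.ArgumentParser(description='Schedule a SparkPost template send to a recipient list')
parser.add_argument('recipFile', help='.csv recipient list, or one email address per line')
parser.add_argument('templateId', help='stored template ID in your SparkPost account')
parser.add_argument('startTime', help='ISO 8601 date/time with timezone offset')

# Parse a sample command line here instead of sys.argv, for illustration
args = parser.parse_args(['recips_1k_sub_n_tags.csv', 'avocado-goodness', '2017-05-08T19:10:00+01:00'])
print(args.templateId)
```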
There are all kinds of nonsense possible with input date and time - such as February 30th, 24:01 and so on. Mix in timezone offsets, so the user can schedule in their local time, and no-one would seriously want to write their own time parsing code! SparkPost’s API will of course be the final arbiter on what’s good, and what’s not - but it’s better to do some initial taste-tests before we try to send.
Python’s strptime() function does mostly what we want. The format string can be made to match the SparkPost format, except that Python has no : separator in the %z timezone offset. Python’s elegant negative indexing into strings (working backwards from the end of the string) makes it easy to write a small checking function.
# Validate SparkPost start_time format, slightly different to Python datetime (which has no : in timezone offset format specifier)
def isExpectedDateTimeFormat(timestamp):
    format_string = '%Y-%m-%dT%H:%M:%S%z'
    try:
        colon = timestamp[-3]
        if not colon == ':':
            raise ValueError()
        colonless_timestamp = timestamp[:-3] + timestamp[-2:]
        datetime.strptime(colonless_timestamp, format_string)
        return True
    except ValueError:
        return False
Plat du jour
If you don’t want to schedule a future start_time, you can just give today’s date and time. Times in the past are sent immediately.
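If you’d rather compute the start_time than type it, datetime can build a SparkPost-style timestamp. A sketch for “five minutes from now” at a fixed +01:00 offset:

```python
from datetime import datetime, timedelta, timezone

tz = timezone(timedelta(hours=1))                   # +01:00
when = datetime.now(tz) + timedelta(minutes=5)
stamp = when.strftime('%Y-%m-%dT%H:%M:%S%z')        # e.g. ...+0100 - no colon yet
stamp = stamp[:-2] + ':' + stamp[-2:]               # insert the colon SparkPost expects
print(stamp)
```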
Depending on where your code is running (I happen to be using an AWS virtual machine), you should see each batch get sent in a few seconds. Even though it’s single-threaded, you can schedule a million emails in around ten minutes. The actual send will proceed (at the scheduled start_time) as fast as it can.
$ ./sparkySched.py recips_100k_sub_n_tags.csv avocado-goodness 2017-04-11T23:55:00+01:00
Opened connection to https://demo.sparkpostelite.com
Injecting to SparkPost:
To 10000 recips: template "avocado-goodness" start_time 2017-04-11T23:55:00+01:00: OK - in 4.97 seconds
To 10000 recips: template "avocado-goodness" start_time 2017-04-11T23:55:00+01:00: OK - in 4.92 seconds
To 10000 recips: template "avocado-goodness" start_time 2017-04-11T23:55:00+01:00: OK - in 4.783 seconds
And that’s pretty much it. The full code, which is just over 100 actual lines, is here with easy installation instructions.
A small digestif ...
What’s the best way to test this out for real? A tool to generate dummy recipient lists with sinkhole addresses could be handy - keep an eye out for a follow-up blog post. I’ve included a couple of ready-made recipient files in the GitHub project to get you started.
Is this your first dining experience with Python and SparkPost? Did I add too much seasoning? Should the author be pun-ished for these bad jokes? Let us know!
P.S. Want to talk more Python with us? Join us in our Community Slack.
This post was originally posted on the SparkPost blog.