DEV Community

Arun Kumar for AWS Community Builders


Python Best Practices

What are the goals here?

Define a set of advice for working with Python in the context of AWS.

In what context is this guide written?

Best practices with Python…

  • As a scripting tool: Boto3 calls to AWS APIs.
  • As a language for Lambda in a “Serverless” consumable: Lambda function code, with third-party dependencies.
  • As a language for Lambda in an infra pipeline: housekeeping, CFN Custom Resources, compilers, runners, deployment.

Required Reading
Documentation

Python space
[https://www.python.org/dev/peps/pep-0008/]
[https://www.python.org/doc/sunset-python-2/]

AWS Space
[https://docs.aws.amazon.com/lambda/latest/dg/best-practices.html]
[https://docs.aws.amazon.com/lambda/latest/dg/lambda-python.html]
[https://docs.aws.amazon.com/lambda/latest/dg/gettingstarted-limits.html]

Best Practices for Python Versions

  • Python 2 has been sunset: the sunset date, January 1, 2020, has passed.
  • Be mindful of external dependencies that differ between Python 2 and 3. Conclusion: use Python 3 unless you can’t.

Best Practices for Tooling

a. SublimeText3

b. VSCode

Best Practices for CLI (i.e. Bash)

Use the “if main” statement

  • Put your script’s logic into a “main” method.
  • Then use the “if main” convention to call that method if invoked via the CLI (i.e. python3 myscript.py)
if __name__ == '__main__':
    main(__get_args())

Best Practices for AWS APIs

a. Use Boto3

  • Has its own built-in retry logic
  • Configurable

b. Cater for failure scenarios
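One way to cater for failures on top of Boto3's own retries is a small application-level wrapper. The sketch below is an assumption, not part of Boto3: `call_with_retry` is a hypothetical helper, and the error codes in `retryable_codes` are examples only (check the documentation of the service you call). It relies on the `response` dictionary that botocore's `ClientError` carries, so it needs no botocore import:

```python
import time


def call_with_retry(operation, retryable_codes=("Throttling", "ThrottlingException"),
                    max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call operation(); retry with exponential backoff on retryable error codes.

    Any exception without a matching error code is re-raised untouched, and so
    is the last retryable error once max_attempts is reached.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception as exc:
            # botocore's ClientError exposes {"Error": {"Code": ..., "Message": ...}}.
            response = getattr(exc, "response", None)
            code = response.get("Error", {}).get("Code") if isinstance(response, dict) else None
            if code not in retryable_codes or attempt == max_attempts:
                raise
            # Delays grow as base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

A call would look like `call_with_retry(lambda: sts_client.get_caller_identity())`, where `sts_client` is whatever client object you have created. Errors such as access denials are not in the retryable list, so they surface immediately instead of being retried.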

Configuring Boto3

import boto3
from botocore.config import Config

__sts_client = boto3.client(
    'sts',
    config=Config(connect_timeout=15, read_timeout=15, retries=dict(max_attempts=10))
)

Best Practices for AWS Lambda

You can write Python code that works either from the CLI or via AWS invocation

a. Can put a “cli.py” alongside main.py, and invoke that way

That simulates an AWS Step Function (which usually invokes the Lambda)

b. Can put an “if main” statement for running via CLI
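A minimal sketch of option b, assuming the handler lives in a single file. The `name` payload field and the idea of passing the event as a JSON file path are hypothetical choices for this example:

```python
import json
import sys


def handler(event, context):
    """Lambda entry point; `event` is the JSON payload as a dict."""
    # "name" is a hypothetical payload field used only for this sketch.
    name = event.get("name", "world")
    return {"message": "Hello, {}".format(name)}


def main(argv):
    """CLI entry point: read the event from a JSON file path, if one is given."""
    event = {}
    if argv:
        with open(argv[0]) as f:
            event = json.load(f)
    # No Lambda context object exists locally, so pass None.
    result = handler(event, None)
    print(json.dumps(result))
    return result


if __name__ == "__main__":
    main(sys.argv[1:])
```

Lambda calls `handler` directly, while `python3 main.py event.json` goes through `main`. Both paths run the same logic, so a saved sample event can be replayed locally.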

Be mindful of limits

a. Account level

  • 250 ENIs per Account (soft limit, talk to your AWS TAM)
  • 1,000 concurrent executions

b. Use provisioned concurrency to avoid cold-start workarounds

  • Needs Hyperplane

Best Practices for Serverless

Packaging code

  • Create a self-contained zip for each lambda, AFTER installing pip modules
Code:
  S3Key:
    Fn::Pipeline::FileS3Key:
      Path: consumer.zip

Using common-pip-* modules

  • A way to share modules between lambdas in the SAME repo
  • Can reference 3rd party modules
  • Can reference common-pip modules

Example:

dnspython
git+https://git-codecommit.ap-southeast-1.amazonaws.com/v1/repos/common-pip-log@master#egg=common-pip-log

And the usage in your main.py code:

  • Set the first import path to be the "lib" subfolder of the lambda
from os.path import dirname, abspath
from os import environ
import sys
sys.path.insert(0, "{}/lib".format(dirname(abspath(__file__))))

import log

log.info('env', dict(environ))  # environ is not serializable by itself, so cast it to a dictionary.

Best Practices for Code and Design

The Zen of Python

  • “Write with future developers in mind” — they have to clean up your messes.

  • Note that “future developer” might very well be you, 6 months later after you’ve forgotten everything you did.

Here are Python’s own guiding principles, printed by “import this”:

TODO — Go through each one and give reference examples.

C:/Users/ak>python
Python 3.8.1 (tags/v3.8.1:1b293b6, Dec 18 2019, 23:11:46) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import this

The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
>>>

Write your code as modules

  • Be mindful of what gets executed when your Python file is imported
  • This has a direct impact on Sphinx, which imports your code to generate its documentation
  • This allows you to share code, perhaps even creating a “common” module that other programs can import.
  • Be careful — that makes you the de-facto maintainer of that module!
  • It needs examples
  • It needs unit tests
  • It needs a CICD process
  • It needs a todo list of enhancements (I use README.md to start with)

Reduce the scope of your module interface

  • Use “non-public” (aka “private”) naming convention for internal attributes and methods NOT intended for use outside the module
  • This should be your default position, then you slowly refactor stuff to public, as needed over time

[https://www.python.org/dev/peps/pep-0008/#method-names-and-instance-variables]

  • Use a single leading underscore for “non-public” method names E.g. _get_file_contents

Dealing with strings as booleans

  • You might be passed a string property that is supposed to be a boolean. The value might be boolean-ish — true/True/TRUE/1/Yes/On/etc.

Use something like this:

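def _s_to_bool(value):
    """Return True for boolean-ish strings; anything else (including None) is False."""
    # str() guards against None or non-string input; "value" avoids shadowing the builtin input().
    return str(value).strip().lower() in ('1', 'true', 'yes', 'on')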

When writing comments, focus on the “why” rather than the “what”

Nothing is more frustrating than code that doesn’t explain why something has been done. You need context!

Example, see “ScanIndexForward” below:

response_iterator = dynamodb_paginator.paginate(
    TableName='core-automation-master-api-db-items',
    IndexName='parent-created_at-index',
    ExpressionAttributeNames=expression_attribute_names,
    ExpressionAttributeValues=expression_attribute_values,
    KeyConditionExpression='parent_prn = :v1',
    ProjectionExpression="#p,#n,#s,#c,#u",
    ScanIndexForward=False  # Process newer builds first (descending order) - important for logic!
)

Without that comment, a future developer might not realize there’s a reason for ScanIndexForward=False, and could introduce a bug by changing it.

Consider the strategy pattern for running code in different contexts

  • I.e. maybe you use a strategy with your log module so that you don’t output logs locally in JSON, but in Cloud you do, for CloudWatch
  • Another example — in AWS Lambda context, you get credentials from AWS Secrets Manager, or Parameter Store. Locally, you rely on environment variables instead.

Best Practices for Data Modelling and Access

  • “Upsert” is a good feature at the low level
  • For DynamoDB: for scripts I generally don’t bother with an ORM etc.; I just write Boto3 API calls
  • For a modelling layer, one example is Marshmallow + PynamoDB.

Python libraries to help

Best Practices for Testing

  • Use pytest for unit-testing

Good feature set:

  • Auto-discovery of tests
  • Fixtures
  • Plugins
  • Coverage reporting etc
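A minimal sketch of a pytest test file showing auto-discovery and a built-in fixture. The function under test, `s_to_bool`, is hypothetical and inlined only to keep the example self-contained; in a real project it would be imported from your module:

```python
# test_s_to_bool.py: pytest discovers files named test_*.py automatically.


def s_to_bool(value):
    """Function under test (hypothetical; inlined to keep the sketch self-contained)."""
    return str(value).strip().lower() in ("1", "true", "yes", "on")


def test_truthy_strings():
    for text in ("1", "true", "TRUE", "Yes", "on"):
        assert s_to_bool(text)


def test_falsy_strings():
    for text in ("0", "false", "", "off", "maybe"):
        assert not s_to_bool(text)


def test_flag_read_from_file(tmp_path):
    # tmp_path is a built-in pytest fixture: a fresh temporary directory
    # (a pathlib.Path) injected by argument name.
    flag_file = tmp_path / "flag.txt"
    flag_file.write_text("Yes\n")
    assert s_to_bool(flag_file.read_text())
```

Running `pytest` in the directory picks up every `test_*` function without any registration. Plain `assert` statements are enough, and pytest reports the failing values.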

Use Selenium Bindings for Python in CodeBuild

Best Practices for Dependency Management

Use pipenv to explicitly manage and validate dependencies

  • Helps to keep your dependencies consistent via lockfile (i.e. repeat builds of same code on different days)
  • Lockfile also has checksum feature to ensure the correct package is downloaded in future (i.e. can detect future compromises)
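To make the checksum idea concrete, the sketch below shows the kind of comparison a lockfile hash enables. It is a conceptual illustration only, not pipenv's actual code; `verify_artifact` is a hypothetical helper, and the `sha256:<hex>` form mirrors how hashes are stored in Pipfile.lock:

```python
import hashlib


def sha256_of(data):
    """Return the hex SHA-256 digest of a bytes payload."""
    return hashlib.sha256(data).hexdigest()


def verify_artifact(data, allowed_hashes):
    """Accept the artifact only if its digest is one of the pinned values.

    allowed_hashes uses the 'sha256:<hex>' form found in Pipfile.lock.
    """
    return "sha256:" + sha256_of(data) in allowed_hashes
```

If a package with the same name and version were later replaced upstream, its bytes would hash to a different value and the check would fail, which is how the compromise gets detected.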

Best Practices for security

  • Use the “safe” methods for YAML/JSON/XML parsers. Example:
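import sys

import yaml  # Provided by the PyYAML package.

client = sys.argv[1]

with open('../../{}-config/hosted-zones.yaml'.format(client)) as f:
    client_vars = yaml.safe_load(f)  # safe_load never constructs arbitrary Python objects.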

Best Practices for User Documentation

a. Use Sphinx
Use an editor plugin to help with formatting, especially tables

b. Focus on a few key areas:

Goals / context
High level design
Use cases
Working examples for people to pull apart and re-use
