Below are five steps you can take to build production-ready Python applications.
1. Use a Virtual Environment to Isolate the Program.
The development environment is in most cases different from the production environment. For instance, the development environment for a program is typically the programmer's machine (laptop, desktop, or tablet), while the production environment may be a virtual machine instance (an AWS instance, Azure Virtual Machine, or Linode) or a containerized instance (Docker, Kubernetes). To isolate the program, together with its Python version and modules, use a Python virtual environment in both environments.
The example below shows how to create and activate an isolated environment.
# Create the virtual environment.
> python -m venv ./env
# Activate the virtual environment on Windows
> env\Scripts\activate
# Activate the virtual environment on macOS/Linux
$ source env/bin/activate
Furthermore, required Python packages (with their required versions) can be installed and then written out to a file that records the program's dependencies.
# Install the Scrapy module via pip
(env) > pip install scrapy==2.4.1
# Write the installed packages to a file
(env) > pip freeze > requirements.txt
The example above installs a specific package with the required version. The installed packages are then written to a requirements.txt file using the pip freeze command. The requirements.txt file specifies the Python packages the program requires, while the pip freeze command outputs the names of the installed packages with their exact versions.
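A generated requirements.txt contains one package==version line per installed package. For the Scrapy install above it might look roughly like this (the exact dependency set and versions are illustrative and will vary with your environment):

```text
Scrapy==2.4.1
Twisted==20.3.0
lxml==4.6.2
...
```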
In a different environment, the virtual environment can be recreated, and the packages pinned in the requirements.txt file can be installed. This works the same way on macOS, Linux, and Windows: run the pip install command with the -r switch on the requirements.txt file.
# Installing the required packages
(env) > pip install -r requirements.txt
2. Use Config Files to Define the Deployment Environment.
The deployment environment is where the app runs, either in development or in production; an app is said to be in production once it leaves the development stage. Each environment has to maintain the proper configuration for the app to work correctly.
The configuration of the production environment differs from that of the development environment. To modify a program running on a production server, you'd reproduce and change it in the development environment, then push the new changes to production.
The issue with these environments is the difference in configuration. The production environment requires settings that may be hard to reproduce in the development environment. For instance, suppose you have a program running on a web server with access to external data via an API. To modify the code, you'd have to start the web server container and set up the API configuration with the keys needed to access the external data. These steps are unnecessary and time-consuming if all you want to do is modify part of your program and make sure everything still works.
A workaround is to adapt your program at startup time to provide different functionality depending on the deployment environment. Keep a dedicated configuration for the program in both the development and production environments:
# settings.py

# Don't run with 'TESTING' turned on in production
TESTING = True

# API credentials should be kept secret in production
API_CREDENTIALS = {
    "consumer_key": "XXXXXXXXXXXXXX",
    "consumer_secret": "XXXXXXXXXXXXXX",
    "access_token": "XXXXXXXXXXXXXX",
    "access_secret": "XXXXXXXXXXXXXX",
}
The TESTING constant is set to True by default; it determines how the app behaves in development versus production. The API_CREDENTIALS constant is a dictionary of the required API keys.
Depending on the value of TESTING, other modules in the app can import the settings.py file and decide how to implement their attributes.
# main.py
import settings

class TestingAPI:
    """Uses mock data in development."""
    # ...

class RealAPI:
    """Uses real data via the API in production."""
    def __init__(self, api_credentials: dict) -> None:
        self.api_credentials = api_credentials
        # ...

if settings.TESTING:
    api = TestingAPI()
else:
    api = RealAPI(settings.API_CREDENTIALS)
With the above pattern, modules can be customized to behave differently in each deployment environment. Doing so makes it easy to skip unnecessary setup, such as API or database connections, when it's not needed. Mock API data can be generated and injected into the program when testing or developing.
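As a concrete sketch of this idea (the class bodies and the fetch_records method are illustrative, not taken from the original code), development can run entirely on canned data while production would hit the real service:

```python
# Inlined here for a self-contained sketch; in a real app this
# would live in settings.py.
TESTING = True

class TestingAPI:
    """Returns canned data in development, so no network access is needed."""
    def fetch_records(self):
        # Hypothetical mock payload standing in for a real API response.
        return [{"id": 1, "value": 42}]

class RealAPI:
    """Would call the real service in production."""
    def __init__(self, api_credentials: dict) -> None:
        self.api_credentials = api_credentials

    def fetch_records(self):
        raise NotImplementedError("real HTTP call goes here")

# The rest of the program only sees the `api` object and never cares
# which implementation is behind it.
api = TestingAPI() if TESTING else RealAPI({"consumer_key": "..."})
records = api.fetch_records()
print(records)  # mock data in development
```

Because both classes expose the same method, switching environments is a one-flag change rather than a code change.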
Another similar case is making the app work differently based on the operating system. Say the host server used in production runs one OS, such as Linux, while development happens on Windows; the differences between the two might break the app.
The Python sys module can be used to inspect the platform and determine the OS type.
# main.py
import sys

class LinuxEnv:
    # ...
    pass

class WindowsEnv:
    # ...
    pass

if sys.platform.startswith('linux'):
    config = LinuxEnv
elif sys.platform.startswith('win32'):
    config = WindowsEnv
else:
    # ...
    pass
3. Debug Using the repr Built-in Function
Basic debugging in Python is often done with the print function, which outputs a human-readable string version of whatever argument is passed to it. If there is an error in a Python program, print can be used to show how the program's state changes as it runs and to narrow down where the error occurred.
The issue with this way of debugging is that print only outputs the human-readable string version of a value; it doesn't reveal the value's type.
For instance, the print output below doesn't make clear whether the value is a number or a string.
# Number
print(1024)
# String
print("1024")
# Human-readable output
>>>
1024
1024
The print function is meant for human-readable output, which doesn't help much when debugging.
Python provides the repr built-in function to return a printable representation of an object. It can be used in conjunction with print to make the type of a value unambiguous when debugging.
# Number
print(repr(1024))
# String
print(repr("1024"))
# printable representation output
>>>
1024 # number
'1024' # string
The same result can be achieved with the C-style %r format string and the % operator.
# Number
print("%r" % 1024)
# String
print("%r" % "1024")
# printable representation output
>>>
1024 # number
'1024' # string
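On modern Python, the same representation is available through f-strings with the !r conversion, which is often more convenient than %-formatting (a small sketch, not from the original post):

```python
number = 1024
text = "1024"

# !r applies repr() to the interpolated value.
print(f"{number!r}")  # 1024
print(f"{text!r}")    # '1024'

# Python 3.8+ also supports the = specifier, which prints
# both the expression and its repr -- handy for debugging.
print(f"{text=}")     # text='1024'
```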
Debugging Dynamic Objects
When debugging plain object instances, the human-readable output and the repr output are the same unhelpful default: both fall back to the object's default representation, which only shows the class name and memory address. Neither print nor repr helps here on its own.
class Person(object):
    def __init__(self, name, age, height):
        self.name = name
        self.age = age
        self.height = height

obj = Person('John', 25, "6'5")
print(obj)
repr(obj)
>>>
<__main__.Person object at 0x0000011671395160> # human-readable string
'<__main__.Person object at 0x0000011671395160>' # object representation
There are two ways to resolve this problem:
- Define the __repr__ special method.
- Use the object's instance dictionary when you don't control the class.
The __repr__ special method can only be defined in classes that you control. It should build and return a string representation of the created object.
class Person(object):
    def __init__(self, name, age, height):
        self.name = name
        self.age = age
        self.height = height

    def __repr__(self):
        return f'Person({self.name}, {self.age}, {self.height})'

obj = Person('John', 25, "6'5")
repr(obj)
>>>
"Person(John, 25, 6'5)"
When you don't have control over the class, use the __dict__ special attribute to access the object's instance dictionary. The __dict__ attribute returns a dictionary of the instance's attributes and their values.
class Person(object):
    def __init__(self, name, age, height):
        self.name = name
        self.age = age
        self.height = height

obj = Person('John', 25, "6'5")
print(obj.__dict__)
>>>
{'name': 'John', 'age': 25, 'height': "6'5"}
4. Use Reusable Components
Write functions or classes that can be reused in other parts of the program to create a flow. For instance, consider a program that reads data from a data source (an API, a database, or AWS S3), loads a model from a pickle file, uses the model to generate predictions from the dataset, and saves the predictions to a database.
To achieve this, the code responsible for the process can be divided into components rather than a single function or class. Each component implements a different step and can be assembled with other components into a pipeline for the required flow.
def read_data_from_api(args, info):
    # ...
    return data

def load_model(info):
    # ...
    return model

def run_predictions(data, model):
    # ...
    return predictions

def save_predictions_to_db(args, predictions):
    # ...

def main():
    """Prediction pipeline"""
    data = read_data_from_api(args, info)
    model = load_model(info)
    predictions = run_predictions(data, model)
    save_predictions_to_db(args, predictions)
    # ...
Benefits of the component approach:
- Components can be reused in other pipelines.
- Components are easy to improve and modify over time.
In the code sample above, all four components are assembled into a pipeline in the main() function.
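To make the reuse benefit concrete, here is a runnable sketch in which the same components serve two different pipelines. The function bodies are simplified stand-ins (a real load_model would unpickle a model, for example); the names batch_pipeline and single_item_pipeline are hypothetical:

```python
def load_model(path):
    # Stand-in: a real implementation might unpickle a model from `path`.
    return lambda row: row * 2

def run_predictions(data, model):
    # Apply the model to every record in the dataset.
    return [model(row) for row in data]

def batch_pipeline(data):
    """Original flow: predict over a full dataset."""
    model = load_model("model.pkl")
    return run_predictions(data, model)

def single_item_pipeline(item):
    """A second pipeline reusing the same components for one record."""
    model = load_model("model.pkl")
    return run_predictions([item], model)[0]

print(batch_pipeline([1, 2, 3]))  # [2, 4, 6]
print(single_item_pipeline(10))   # 20
```

Because load_model and run_predictions know nothing about how they are assembled, a new flow costs only a few lines of glue code.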
5. Test the Program with unittest
Python is a dynamically typed programming language, which means it has no static type checker by default. The absence of static type checking often results in runtime errors.
All programs should be tested regardless of the programming language used, but Python programs especially benefit, since Python has no built-in type checking. Fortunately, the built-in unittest module can be used to test Python programs, and Python's dynamism makes tests easy to write.
Testing ensures good code quality. It gives the programmer assurance that the program will work as expected when deployed. The responsible programmer always builds with testing in mind.
To use the built-in unittest module on your code, import it in a separate Python file. For instance, say you have a utility function defined in utils.py:
# utils.py
def to_str(data):
    if isinstance(data, str):
        return data
    elif isinstance(data, bytes):
        return data.decode("utf-8")
    else:
        raise TypeError("Must supply str or bytes. Found %r" % data)
To write the tests, create a second file whose name is the prefix test_ followed by the name of the file you want to test. Since the file here is utils.py, the test file will be named test_utils.py.
# test_utils.py
from unittest import TestCase, main

from utils import to_str

class UtilsTestCase(TestCase):
    def test_to_str_bytes(self):
        # Verifies bytes are decoded to a string
        self.assertEqual("hello", to_str(b'hello'))

    def test_to_str_str(self):
        # Verifies strings pass through unchanged
        self.assertEqual("hello", to_str('hello'))

    def test_to_str_bad(self):
        # Verifies the exception is raised for other types
        self.assertRaises(TypeError, to_str, object())

if __name__ == "__main__":
    main()
Each test method begins with the word test; if a test method runs without raising an exception, the test is successful. The tests above are organized into test cases as TestCase subclasses. These subclasses include helper methods for making assertions, such as assertEqual, assertNotEqual, assertRaises, and assertTrue.
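assertRaises can also be used as a context manager, which many find more readable than the positional form above because it lets you inspect the raised exception. A small self-contained sketch (the to_str stand-in mirrors the utility from utils.py):

```python
import unittest

# Stand-in for the to_str utility described above.
def to_str(data):
    if isinstance(data, str):
        return data
    elif isinstance(data, bytes):
        return data.decode("utf-8")
    else:
        raise TypeError("Must supply str or bytes. Found %r" % data)

class UtilsContextTestCase(unittest.TestCase):
    def test_to_str_bad_context(self):
        # Context-manager form: the raised exception is captured on `ctx`.
        with self.assertRaises(TypeError) as ctx:
            to_str(object())
        self.assertIn("Must supply", str(ctx.exception))

# Run the case programmatically; normally you'd use `python -m unittest`.
result = unittest.TestResult()
unittest.TestLoader().loadTestsFromTestCase(UtilsContextTestCase).run(result)
print(result.wasSuccessful())  # True
```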
NOTE:
To learn more about testing and the unittest module, see the Python documentation on testing: https://docs.python.org/3/library/unittest.html
Conclusion
So there you have it: 5 guidelines for building production-ready Python apps. Don't forget to check out the full article to see all 7 steps.
Top comments (9)
Forget venv and requirements.txt; use pipenv instead and you'll never have to look back again.
While unittest will serve you well for simple test suites, you'll find it short of giving you what you want after a certain limit. At this time pytest will serve you much better while still having compatibility with already existing unittest test cases. pytest also supports a plugin mechanism, allowing you to integrate many other checks(like code style, static code analysis and even type checking with python hints).
I don't think we should forget venv. At least that's the conclusion I get from stackoverflow.com/questions/415735... which documents the mess in Python's package management / "virtual envs" solutions...
I don't see the referenced stackoverflow question is saying anything against pipenv.
Agree wholeheartedly on both pipenv and pytest. Thanks for this post; repr is new for me, the rest I already knew. I would recommend some changes to the post: pipenv and pytest might be better options.
Thank you for your feedback Jan. Much appreciated!
Great post
Thank you!
I for one will really find this helpful, as I started my Python journey not too long ago. Nice 👌