
Jürgen Hermann

Originally published at jhermann.github.io

Bundling Python Environments in a ZIP Archive

Shipping dependencies for your scripts as a single file, built with ‘shiv’.

The Basic Idea

If you have a set of Python scripts that all use the same set of required packages, you can distribute those dependencies in the form of a zipapp, i.e. in a single executable file. See Building Zipapps (PEP 441) for details if you're new to the concept of zipped Python application bundles.
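To make the zipapp concept concrete, here is a minimal sketch that uses only the standard library's zipapp module, not shiv. It builds a tiny archive with a __main__.py and runs it. The names build_demo_zipapp, run_zipapp, demo_app, and demo.pyz are made up for this illustration.

```python
import subprocess
import sys
import tempfile
import zipapp
from pathlib import Path


def build_demo_zipapp(workdir: Path) -> Path:
    """Create a tiny zipapp whose __main__.py prints a greeting."""
    source = workdir / "demo_app"          # hypothetical source directory
    source.mkdir()
    (source / "__main__.py").write_text("print('hello from a zipapp')\n")

    target = workdir / "demo.pyz"          # hypothetical archive name
    # The interpreter argument becomes the shebang line of the archive.
    zipapp.create_archive(source, target, interpreter=sys.executable)
    return target


def run_zipapp(archive: Path) -> str:
    """Run the archive with the current interpreter and return its output."""
    result = subprocess.run(
        [sys.executable, str(archive)],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(run_zipapp(build_demo_zipapp(Path(tmp))))
```

A plain zipapp like this only carries pure Python code. Bundling third-party dependencies and handling native extensions is the part that shiv adds on top.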

Unlike shipping a script in a virtualenv built within a single project, you can have a project for the base libraries and other projects for the scripts, including scripts written by end users who are just using your dependencies.

You can also deploy any PyPI package that way, with a simple call of shiv, as shown in the next section using Pandas.

A Practical Example

The following example uses the well-known Pandas data science library, but this works for any project built with setuptools or any other build tool creating Python packages that declare their requirements.

So, to create your base library release artifact, install and call shiv like this:

python3.8 -m pip install --user shiv
python3.8 -m shiv -p '/usr/bin/python3.8 -IS' \
                  -o ~/bin/_lib-pandas pandas==1.0.1

Do this in a virtualenv and leave out the --user option if you want to keep your account's home directory clean.

Note that we do not provide an entry point here, which means this zipapp drops into the given Python interpreter and is thus usable as an interpreter, with the contained packages available for import.

Now we can exploit this to write a script using the zipapp as its interpreter:

cat >script <<'EOF'
#! /usr/bin/env _lib-pandas
import re
import sys
from pathlib import Path
import pandas as pd

print('Using Pandas from',
      Path(pd.__file__).parent.relative_to(Path.home()),
      '\n\nPython path:')
df = pd.DataFrame(sys.path, columns=['Path'])
df.Path = df.Path.str.replace(f'^{ re.escape(str(Path.home())) }/', '~/', regex=True)
print(df)
EOF
chmod +x script
./script

Calling the script produces the following output:

Using Pandas from .shiv/_lib-pandas_23b2…d2/site-packages/pandas 

Python path:
                                                Path
0 ~/bin/_lib-pandas
1 /usr/lib/python38.zip
2 /usr/lib/python3.8
3 /usr/lib/python3.8/lib-dynload
4 ~/.shiv/_lib-pandas_23b2bb7d64c26139950435a64d...

If you're familiar with Pandas, you'll instantly recognize the Python path output as coming from a Pandas data frame. 🎉
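The path-shortening step in the script can also be tried in isolation. The following sketch performs the same anchored, escaped regex replacement with only the standard library. The helper name shorten_home is made up for this illustration.

```python
import re
from pathlib import Path


def shorten_home(path: str, home: str) -> str:
    """Replace a leading '<home>/' prefix with '~/' (hypothetical helper)."""
    # re.escape() keeps characters such as '.' in the home path literal,
    # and '^' anchors the match to the start of the string.
    pattern = f"^{re.escape(home)}/"
    return re.sub(pattern, "~/", path)


if __name__ == "__main__":
    home = str(Path.home())
    for entry in (f"{home}/bin/_lib-pandas", "/usr/lib/python3.8"):
        print(shorten_home(entry, home))
```

Paths outside the home directory pass through unchanged, which is why the system library entries in the output above keep their absolute form.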

The first execution is a bit slow on startup, because the cache directory you see at the end of the Python path has to be populated first. shiv's bootstrap code unpacks extension packages containing native code into the file system, so the OS can load them.
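If you want to see what has accumulated in that cache, a small sketch like the following lists the unpacked environments and their sizes. It assumes the cache lives in ~/.shiv, as in the output above, unless the SHIV_ROOT environment variable points elsewhere. The function names are made up for this illustration.

```python
import os
from pathlib import Path


def shiv_cache_root() -> Path:
    """Return the assumed cache root: $SHIV_ROOT if set, else ~/.shiv."""
    return Path(os.environ.get("SHIV_ROOT", str(Path.home() / ".shiv")))


def cache_sizes(root: Path) -> dict:
    """Map each directory directly below root to its total size in bytes."""
    sizes = {}
    if not root.is_dir():
        return sizes  # nothing unpacked yet
    for entry in sorted(root.iterdir()):
        if entry.is_dir():
            sizes[entry.name] = sum(
                f.stat().st_size for f in entry.rglob("*") if f.is_file()
            )
    return sizes


if __name__ == "__main__":
    for name, size in cache_sizes(shiv_cache_root()).items():
        print(f"{size / 1e6:8.1f} MB  {name}")
```

Each rebuilt zipapp gets its own cache directory, so a listing like this helps to spot stale ones.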

The underscore prefix in the zipapp name indicates this is not a command humans would normally use. Alternatively, and especially in production, you can deploy the zipapp into e.g. /usr/local/lib/python3.8/ and then use an absolute path instead of an env call as the script's interpreter.
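Switching a script from the env call to an absolute interpreter path only means changing its first line. As a rough sketch, a deployment step could do it like this. The helper with_interpreter is made up for this illustration, and the target path simply combines the directory mentioned above with the zipapp name from the example.

```python
def with_interpreter(script_text: str, interpreter: str) -> str:
    """Return script_text with its shebang line set to the given path."""
    lines = script_text.splitlines(keepends=True)
    shebang = f"#!{interpreter}\n"
    if lines and lines[0].startswith("#!"):
        lines[0] = shebang        # replace the existing shebang
    else:
        lines.insert(0, shebang)  # no shebang yet: add one
    return "".join(lines)


if __name__ == "__main__":
    source = "#! /usr/bin/env _lib-pandas\nimport pandas as pd\n"
    # Assumed deployment location, adjust to your own layout.
    print(with_interpreter(source, "/usr/local/lib/python3.8/_lib-pandas"))
```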

Top comments (2)

Demian Brecht

Thanks for sharing! Would you happen to know what the differences between this and pex are? I've only used pex while researching this kind of packaging, but I'm keen to learn about the new hotness if this is it.

Jürgen Hermann

See "Motivation & Comparisons" in the shiv documentation.

PEX does multi-platform even with extension packages, though I did not try that yet.