Do you ever store files alongside your Python files, and want to read them from within a running Python script?
Using import works great on Python modules and packages, but import will not work on non-Python data files, such as text files (including JSON, HTML, CSV, etc.) or binary files (such as images).
importlib.resources vs __file__
You could hack something together with the __file__ variable, which holds the filesystem path of the current Python module. Many developers do this.
However, importlib.resources has been part of the standard library since Python 3.7, and presents a more reliable and better-looking way to load data files.
An example
To demonstrate, let's experiment with a plain text file alongside a Python module, in a package.
We'll start with a simple project with Poetry (feel free to use the tool of your choice instead, or read my intro to installing and using Poetry).
poetry new --src hello
cd hello
poetry install
In src/hello, we can create two files, greeting.txt and greet.py.
greeting.txt:
Hello, {recipient}!
greet.py:
"""Tools for greeting others."""
import importlib.resources
def greet(recipient):
    """Greet a recipient."""
    template = importlib.resources.read_text("hello", "greeting.txt")
    return template.format(recipient=recipient)
Now, you can launch a Python console with poetry run python, and try the following:
>>> from hello import greet
>>> greet.greet("World")
'Hello, World!\n'
>>> greet.greet("Universe")
'Hello, Universe!\n'
>>>
Note the call to importlib.resources.read_text(). This is what reads the contents of the specified file. There are other goodies available, such as importlib.resources.read_binary(). See the importlib.resources docs for further explanation and examples.
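The same module also offers importlib.resources.files(), added in Python 3.9, which returns a path-like object for a package. Here is a minimal sketch of the greeting example written that way. It builds a throwaway package named hello_demo (a hypothetical stand-in for hello, so the snippet runs anywhere) and reads the resource both as text and as bytes.

```python
import importlib.resources
import sys
import tempfile
from pathlib import Path

# Build a throwaway package so the sketch runs anywhere.
# "hello_demo" is a hypothetical stand-in for the article's "hello" package.
root = Path(tempfile.mkdtemp())
package_dir = root / "hello_demo"
package_dir.mkdir()
(package_dir / "__init__.py").write_text("", encoding="utf-8")
(package_dir / "greeting.txt").write_text("Hello, {recipient}!\n", encoding="utf-8")
sys.path.insert(0, str(root))  # make hello_demo importable


def greet(recipient):
    """Greet a recipient via importlib.resources.files() (Python 3.9+)."""
    resource = importlib.resources.files("hello_demo").joinpath("greeting.txt")
    template = resource.read_text(encoding="utf-8")
    return template.format(recipient=recipient)


def greeting_bytes():
    """Return the same resource as raw bytes, as you would for an image."""
    resource = importlib.resources.files("hello_demo").joinpath("greeting.txt")
    return resource.read_bytes()


print(greet("World"))
```

In a real project you would skip the setup at the top and pass your own package name, for example "hello", to files().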
An alternative, and why it isn't as good
Of course, this also works:
import pathlib
def greet2(recipient):
    """Greet a recipient, hackily."""
    template = pathlib.Path(__file__).parent.joinpath("greeting.txt").read_text()
    return template.format(recipient=recipient)
There is no shame in this implementation, but it just feels a bit more hacky, using a dunder (double underscore) variable, getting the parent directory, etc. Not a big deal though.
What is a big deal is that Python packages can be bundled together in different ways. One of these ways is in a zip file. The __file__-based path will not work properly inside such a zip file, while importlib.resources will work fine.
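To make the zip case concrete, here is a rough sketch that bundles a tiny package into a zip archive, puts the archive on sys.path, and tries both approaches. The package name zipped_hello and the archive name bundle.zip are made up for the demo, and the sketch uses importlib.resources.files(), which needs Python 3.9 or later.

```python
import importlib
import importlib.resources
import sys
import tempfile
import zipfile
from pathlib import Path

# Bundle a hypothetical package, "zipped_hello", into a zip archive.
archive = Path(tempfile.mkdtemp()) / "bundle.zip"
with zipfile.ZipFile(archive, "w") as bundle:
    bundle.writestr("zipped_hello/__init__.py", "")
    bundle.writestr("zipped_hello/greeting.txt", "Hello, {recipient}!\n")

# Python can import packages straight from a zip file on sys.path.
sys.path.insert(0, str(archive))
package = importlib.import_module("zipped_hello")

# The __file__ approach: the computed path points *inside* the archive,
# so the filesystem has no such file.
file_based_path = Path(package.__file__).parent / "greeting.txt"
file_approach_works = file_based_path.exists()

# The importlib.resources approach: the import system locates the resource.
resource = importlib.resources.files("zipped_hello").joinpath("greeting.txt")
template = resource.read_text(encoding="utf-8")

print("__file__ path exists:", file_approach_works)
print("importlib.resources read:", template.format(recipient="World"))
```

The path built from __file__ looks plausible, but it names a location inside the archive, which ordinary file operations cannot open.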
In addition, importlib.resources makes convenient use of the import path. Let Python find the package and file in question; there is no need to work out paths relative to __file__ when you are loading data from files that are located in other packages. In other words, if you can import hello, then you can call importlib.resources.read_text("hello", ...) no matter what script or module you are currently running.
Continue to dunder
There is one clever hack I should still mention. The line in greet() could be written:
template = importlib.resources.read_text(__package__, "greeting.txt")
The difference is that instead of naming the package explicitly with "hello", we use the dunder variable __package__, which holds the name of the package that contains the current module.
This, of course, only works if you are attempting to load resources from the same package that contains the call to importlib.resources. If you move the function to a different package, it will fail.
So, while not recommended (explicit is better than implicit), at least you are aware of this option.
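If you are curious what __package__ actually holds, a quick way to check is to inspect it on modules you already have. This small sketch uses two standard library modules rather than the hello package, purely for illustration.

```python
import email.mime.text
import json.decoder

# Each module records the name of the package it lives in.
print(json.decoder.__package__)     # json
print(email.mime.text.__package__)  # email.mime

# The import system keeps the same value on the module's spec.
print(json.decoder.__spec__.parent)  # json
```

This is why the trick breaks when the function moves: __package__ follows the module that contains the code, not the package that contains the data file.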
The backport: importlib_resources
Want to use importlib.resources with older versions of Python, such as 3.5 or 3.6?
Thankfully, the importlib_resources package exists.
You may poetry add importlib-resources or pip install importlib-resources, then use importlib_resources in place of any importlib.resources in this article's examples.
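If the same code has to run on both old and new Python versions, one common pattern is to try the standard library module first and fall back to the backport. This is a sketch of that pattern; the alias resources is just a name chosen here.

```python
try:
    # Python 3.7+: the standard library module.
    import importlib.resources as resources
except ImportError:
    # Older Python: the importlib_resources backport from PyPI.
    import importlib_resources as resources

# Show which implementation was picked up.
print(resources.__name__)
```

The rest of the code can then call functions on resources instead of importlib.resources, assuming the installed backport version provides the function you need.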
Other reading
As you design your modules and packages, you may find it helpful to read my brief intro to package/module structure in Python.
Happy developing!