You've written your script, and then you notice that it is a single file with 30 functions, 5 global variables, and some top-level instructions. This is difficult to skim and, as a consequence, difficult to read and edit in the future. You should separate your code into sections. You can start by adding some big comments, but if the code still looks dense after that, it is time to create some namespaces.
A namespace is an environment in which symbols have a unique meaning. In Python and many other programming languages in the C tradition, each function creates a namespace (for most purposes, also called a "scope"): the variables declared in its body won't be accessible outside of it unless you explicitly allow that, which is considered bad style. There are other ways to create a namespace, and making use of them is a great way to organize your code; and it is not me saying that, it is Tim Peters:
Sparse is better than dense.
(...)
Namespaces are one honking great idea -- let's do more of those!
-- The Zen of Python, by Tim Peters
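As a minimal sketch of the function-scope behavior described before the quote (the function count_vowels and its contents are made up for illustration), a variable defined inside a function body is not visible from the outside:

```python
def count_vowels(text):
    vowels = "aeiou"  # local: exists only inside count_vowels
    return sum(1 for char in text.lower() if char in vowels)


total = count_vowels("Namespaces")

try:
    vowels  # the name was never defined outside the function
except NameError as error:
    message = str(error)
```

The function can use vowels freely, while the surrounding code only ever sees the returned value.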
Let's see some ways to create namespaces. Aiming for modularity is a good practice in any programming language; the following examples are in Python, but I can see them being applied very similarly in JavaScript and other languages too.
The file system way: modules
This is what people usually think about first when planning on creating namespaces.
If you import any module (except when using from ... import *, which is not recommended most of the time), the objects exported by that module will be available in dot notation:
import itertools
for letter in itertools.cycle('ABCD'):
    ...
cycle('ABCD') # NameError: name 'cycle' is not defined
In the code snippet, the itertools module was imported, and its contents are only accessible through the itertools symbol. You could write your own function called cycle, and it would be distinct from itertools.cycle. Most importantly, though, you can split your project into different files and reorganize code like this:
def get_movie_posters():
    ...

def download_poster():
    ...

def parse_poster_response():
    ...

def get_movie_ratings():
    ...

movie_posters = get_movie_posters()
movie_ratings = get_movie_ratings()

def analyze_movies(posters, ratings):
    ...

analyze_movies(movie_posters, movie_ratings)
Split across modules, the same code becomes:
import posters
import ratings
import analyzer
analyzer.run(posters.get(), ratings.get())
For that, you need to distribute the code into different files. Assuming the snippet above is in a main.py file you want to run, your folder structure could be:
project/
    main.py
    analyzer.py
    ratings.py
    posters.py
The modules you create must not have the same name as a module in the standard library; otherwise an import may pick up the wrong one. It is also possible to create subfolders, as long as each of them has an __init__.py (it can just be an empty file with that name), as well as to execute the project folder as a whole, provided it contains a __main__.py (we would call that a "package"). For more details, check the documentation.
It may happen, however, that you are not free to create extra files. Even so, it is still possible to have multiple namespaces in a single script by using the solutions below. The code may not become shorter, but it will at least be more organized.
The OOP-inspired way: classes
Python classes work pretty much as plain namespaces:
from abc import ABC
from typing import final

@final
class PostersNamespace(ABC):
    @staticmethod
    def get():
        ...

    @staticmethod
    def _download_poster():
        ...

    @staticmethod
    def _parse_poster_response():
        ...
You can have objects shared between the functions in the class, but not necessarily with the rest of the code:
from abc import ABC
from typing import final

@final
class PostersNamespace(ABC):
    url = 'https://www.postersite.com'

    @classmethod
    def get(cls):
        return cls.url

PostersNamespace.get() # https://www.postersite.com
PostersNamespace.url # https://www.postersite.com
url # NameError: name 'url' is not defined
I call this "OOP-inspired" (and not just "OOP") because we are using classes only for their namespace-like behavior in Python. They are not being used as a template for an object, so they should not be instantiated (inheriting from ABC in the snippets above signals that intent, although ABC only blocks instantiation when the class declares abstract methods) or be used as a parent class (that's why we are annotating the classes as @final [1]).
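If you want the "do not instantiate" rule enforced at runtime rather than just signaled, one possible sketch is to raise from __new__. This guard is an addition for illustration, not something the snippets above do, and the names base_url, _slugify and poster_url are hypothetical:

```python
from typing import final


@final
class PostersNamespace:
    base_url = "https://www.postersite.com"  # shared by every function below

    def __new__(cls, *args, **kwargs):
        # Assumption: instantiation is blocked explicitly, since this class is only a namespace.
        raise TypeError(f"{cls.__name__} is a namespace and cannot be instantiated")

    @staticmethod
    def _slugify(title):
        # Hypothetical helper: "The Matrix" -> "the-matrix"
        return "-".join(title.lower().split())

    @classmethod
    def poster_url(cls, title):
        # Helpers are reached through cls, never through a bare name.
        return f"{cls.base_url}/{cls._slugify(title)}.jpg"


matrix_url = PostersNamespace.poster_url("The Matrix")

try:
    PostersNamespace()
except TypeError as error:
    instantiation_error = str(error)
```

Everything is still accessed through PostersNamespace with dot notation, and the helper keeps its leading underscore to mark it as internal to the namespace.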
You could achieve a similar effect with dictionaries, but it is not as easy/organized to insert functions with multiple statements into them:
def download_poster():  # defined outside of the dictionary
    ...

postersNamespace = {
    "get": lambda: ...,  # You either rely on lambda functions only...
    "download_poster": download_poster,  # ...or use a function defined elsewhere,
    # but then the namespace won't be very delimited visually
}
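To see that trade-off in something runnable, here is a small sketch of a dictionary used as a namespace. The helper _download_poster is a made-up stand-in that only builds a string instead of downloading anything:

```python
def _download_poster(movie):
    # Hypothetical stand-in for a real download: just returns a label.
    return f"poster for {movie}"


posters_namespace = {
    "url": "https://www.postersite.com",
    # A multi-statement function has to be defined outside of the dictionary...
    "download_poster": _download_poster,
    # ...while a single expression fits in a lambda.
    "get": lambda movie: _download_poster(movie).upper(),
}

label = posters_namespace["get"]("Alien")
```

Note that _download_poster remains reachable as a bare name outside of the dictionary, which is exactly why the namespace ends up less clearly delimited than with a class.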
Of course, you can also go full OOP and make those classes more than mere namespaces and actually instantiate objects from them:
class Poster:
    url = 'https://www.postersite.com'

    def __init__(self, movie):
        ...

movies = [...]
posters = [Poster(movie) for movie in movies]
This, however, requires you to manage the state of more objects -- and take care so that the code doesn't become spaghetti again.
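A small sketch of what "managing the state of more objects" can look like in practice. The attribute downloaded and the method mark_downloaded are invented for this example:

```python
class Poster:
    url = "https://www.postersite.com"  # class attribute: shared by all instances

    def __init__(self, movie):
        self.movie = movie        # instance attribute: one value per object
        self.downloaded = False   # hypothetical piece of per-object state

    def mark_downloaded(self):
        self.downloaded = True


movies = ["Alien", "Heat"]
posters = [Poster(movie) for movie in movies]

posters[0].mark_downloaded()  # only the first object changes
states = [(poster.movie, poster.downloaded) for poster in posters]
```

Each Poster now carries its own state, so any code that touches the list has to keep track of which objects have already been changed.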
The functional way: closures
In Python, you can create functions inside of functions. Make use of that to separate sections in your code:
def posters():
    url = 'https://www.postersite.com'

    def _download_poster():
        ...

    def _parse_poster_response():
        ...

    ...  # the elided code is assumed to build a local `posters` result
    return posters
Unlike the OOP-style solution, this limits the access you have to the different objects in the namespace, as you're restricted to the returned object. One workaround is to make functions return other functions (which will still have access to the original scope):
import requests

def make_get_poster_function():
    url = 'https://www.postersite.com'

    def _download_poster():
        ...

    def _parse_poster_response():
        ...

    def get_poster(movie):
        r = requests.get(url)  # url is read from the enclosing scope
        ...

    return get_poster
In the snippet above, the function returned by make_get_poster_function() can still be executed successfully after make_get_poster_function() has finished, because it is a closure: the reference to url inside of it goes along with it when it is returned.
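The same mechanism can be observed without any network access. The sketch below swaps the request for plain string building (the names make_get_poster_url, base_url and _slugify are illustrative) and then inspects what the returned function carries with it:

```python
def make_get_poster_url():
    base_url = "https://www.postersite.com"  # lives in the enclosing scope

    def _slugify(title):
        # Hypothetical helper: "The Matrix" -> "the-matrix"
        return "-".join(title.lower().split())

    def get_poster_url(movie):
        # base_url and _slugify are looked up in the enclosing scope
        return f"{base_url}/{_slugify(movie)}.jpg"

    return get_poster_url


get_poster_url = make_get_poster_url()
poster_url = get_poster_url("The Matrix")

# Map each captured variable name to the value stored in its closure cell.
captured = dict(
    zip(
        get_poster_url.__code__.co_freevars,
        (cell.cell_contents for cell in get_poster_url.__closure__),
    )
)
```

Here captured contains both base_url and _slugify, even though neither name exists at module level: they are only reachable through the function that was returned.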
Conclusion
These are tips to make a single script more organized, maybe by splitting it into multiple scripts, maybe by separating the code it contains into different namespaces defined inside of it. Managing a large number of namespaces (something that invariably occurs in large applications) requires you to go some steps further and carefully plan the architecture you will follow. This is a separate topic that raises questions about the constraints posed by the stack you want to use and about personal preferences. As a starting point to one architectural style I personally like, see this post by Dan Haiduc.
Cover image by engin akyurt on Unsplash
[1] The @final decorator does not prevent inheritance during runtime, only during static type checking (when using Mypy, for example). Python does not provide a way to totally prevent a class from inheriting from another one; you can only simulate that with custom dunder methods (see here and here).