Anden Acitelli

Posted on Oct 11, 2022 • Updated on May 23, 2023 • Originally published at andenacitelli.com

The Zen of Python, Explained

#python #programming

Overview

Python is famous for its emphasis on writing "Pythonic" code, which roughly means code that is readable, straightforward, and concise. These ideals are summarized in a "manifesto" of sorts called "The Zen of Python" (PEP 20), which you can print in any Python interpreter by running import this:

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!

I'll outline what each of these means and why it's worth following.

I work for Akkio, where I'm building out a no-code predictive AI platform. If you're looking to harness the power of AI without needing a data scientist, give us a try!

Beautiful is better than ugly.

Python's syntax was designed specifically to be as readable and self-descriptive as possible. You could try to squeeze everything into one line, but the result is hard to read and a nightmare to maintain. My favorite example is this LeetCode answer, where reducing the solution to a one-liner makes it utterly incomprehensible:

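# Good
from functools import reduce  # reduce() is not a builtin in Python 3

def minimumTotal(self, triangle):
    # Fold rows from the bottom up, keeping the cheapest path sum at each position.
    def combine_rows(lower_row, upper_row):
        return [upper + min(lower_left, lower_right)
                for upper, lower_left, lower_right in
                zip(upper_row, lower_row, lower_row[1:])]
    return reduce(combine_rows, triangle[::-1])[0]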

# Bad
return reduce(lambda a,b:[f+min(d,e)for d,e,f in zip(a,a[1:],b)],t[::-1])[0]

Even though the first version takes more lines, it has clearer variable names and easier-to-follow logic, and it reads more like English. All of that makes it far more readable.

Explicit is better than implicit.

If your code needs a comment to explain a detail that the code itself doesn't make obvious, it can usually be written better. Implicit code hides the details a reader needs, so it's rarely self-descriptive. Include all the relevant detail in the code itself, even if that makes it a bit verbose.
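As a small illustration, here is a minimal sketch of the difference. The hypotenuse function and its parameter names are hypothetical and exist only for this example; the point is that the explicit import and the keyword-only arguments make the call site describe itself.

```python
# Implicit: a wildcard import hides where names come from.
# from math import *   # a reader can't tell which module sqrt() belongs to

# Explicit: the module name travels with every call.
import math


def hypotenuse(*, side_a: float, side_b: float) -> float:
    """Hypothetical helper; the bare * forces callers to name each argument."""
    return math.sqrt(side_a ** 2 + side_b ** 2)


# The call site documents itself without needing a comment.
length = hypotenuse(side_a=3.0, side_b=4.0)
print(length)  # 5.0
```

Calling hypotenuse(3.0, 4.0) positionally raises a TypeError, so the explicit form is enforced rather than merely encouraged.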

Simple is better than complex.

Don't introduce more logic than needed to solve a problem. Python has a lot of really cool language features that let you state your logic in a more concise but still very readable way.

Here's a great example of this, where we filter a list of dictionaries based on whether a given key is set to a given value:

# Bad (explicit loop)
from typing import Dict, List

def get_elements_with_attribute(dicts: List[Dict[str, str]], key: str, val: str) -> List[Dict[str, str]]:
    output = []
    for obj in dicts:
        if obj[key] == val:
            output.append(obj)
    return output

A more concise and readable way to accomplish this is to use Python's list comprehension:

# Good (list comprehension)
def get_elements_with_attribute(dicts: List[Dict[str, str]], key: str, val: str) -> List[Dict[str, str]]:
    return [x for x in dicts if x[key] == val]

You can also do this very cleanly with the filter function.

def get_elements_with_attribute(dicts: List[Dict[str, str]], key: str, val: str) -> List[Dict[str, str]]:
    return list(filter(lambda x: x[key] == val, dicts))

map(), filter(), and reduce() are staples of functional programming. Its declarative style tends to leave less room for mistakes than the hand-written loop in the first example above, where you manage the output list yourself.
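For reference, here is a minimal sketch of the three functions working together. The sample data and variable names are made up for illustration.

```python
from functools import reduce

prices = [4, 10, 25, 3]  # made-up sample data

# filter() keeps the items for which the predicate returns True.
affordable = list(filter(lambda price: price < 20, prices))

# map() applies a function to every item.
doubled = list(map(lambda price: price * 2, affordable))

# reduce() folds the items into a single value, starting from 0.
total = reduce(lambda running, price: running + price, doubled, 0)

print(affordable)  # [4, 10, 3]
print(doubled)     # [8, 20, 6]
print(total)       # 34
```

In everyday Python, a comprehension or a built-in such as sum() often reads more naturally than map() or reduce(), so treat this as a demonstration of the functions rather than a recommendation to use them everywhere.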

Complex is better than complicated.

Despite sounding similar to the last one, this is slightly distinct. This means that it's okay for things to be done in several steps as long as those several steps are well thought out and are legitimately the most straightforward way to do something. Some things genuinely are complicated and make more sense when the logic is spread out a bit.
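One way to picture this is a task that genuinely needs several steps, written so that each step is simple and named. The sketch below is only an illustration; normalize and most_common_word are hypothetical helpers, not part of any library.

```python
from collections import Counter


def normalize(words):
    """Step 1: lowercase every word and strip surrounding punctuation."""
    return [word.strip(".,!?").lower() for word in words]


def most_common_word(text: str) -> str:
    """Several steps (complex), but each one is easy to follow."""
    words = normalize(text.split())          # Step 2: split, then clean up
    counts = Counter(words)                  # Step 3: tally occurrences
    word, _count = counts.most_common(1)[0]  # Step 4: pick the top entry
    return word


print(most_common_word("The cat saw the dog. The dog ran!"))  # the
```

The logic is spread over a few lines and two functions, yet nothing in it is tangled: you can read it top to bottom and check each step on its own.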

Flat is better than nested.

Especially in callback-heavy languages like JavaScript prior to widespread async/await, you end up with a lot of nesting in your code. Something like this is common:

a(function(resultsFromA) {
  b(resultsFromA, function(resultsFromB) {
    c(resultsFromB, function(resultsFromC) {
      console.log(resultsFromC)
    })
  })
})

It's difficult to look at this at first glance and figure out the exact flow of everything. Compare this to what an async/await version might look like:

const resultOfA = await a()
const resultOfB = await b(resultOfA)
const resultOfC = await c(resultOfB)
console.log(resultOfC)

Another place you may see this is the "early return" pattern (often called "guard clauses"), where instead of introducing nesting, you return or raise early when a condition fails and can then assume the condition holds for the rest of the function. Here's an example:

# Before
def a(condition1, condition2):
    if condition1:
        if condition2:
            print("Both passed!")
        else:
            raise Exception("Issue with condition 2")
    else:
        raise Exception("Issue with condition 1")


# After
def a(condition1, condition2):
    if not condition1:
        raise Exception("Issue with condition 1")
    if not condition2:
        raise Exception("Issue with condition 2")
    print("Both passed!")

Idiomatic Python favors the second (flatter) option because its logic paths are much easier to follow. Making these checks early in the function lets you rely on them for the rest of the function, which cuts down on the number of paths you have to reason about.

This "rule" also fits naturally with Python's significant whitespace: every level of nesting costs a visible level of indentation. That keeps developers aware of the cognitive complexity and logic paths in their functions, and nudges them toward simpler, more readable code.

Sparse is better than dense.

Sparse code doesn't try to cram too much information into very few lines. In other words, there's a point where making your code more and more concise actually detracts from its quality. The same LeetCode question I mentioned further up the page is a perfect example of this:

# Good
from functools import reduce  # reduce() is not a builtin in Python 3

def minimumTotal(self, triangle):
    def combine_rows(lower_row, upper_row):
        return [upper + min(lower_left, lower_right)
                for upper, lower_left, lower_right in
                zip(upper_row, lower_row, lower_row[1:])]
    return reduce(combine_rows, triangle[::-1])[0]

# Bad
return reduce(lambda a,b:[f+min(d,e)for d,e,f in zip(a,a[1:],b)],t[::-1])[0]

In the first example, each line has more space to breathe, and has a little less going on. The first would be much easier to debug; if I had to debug the second, I'd probably need to refactor it to something sparser before I even started.

Readability counts.

Code is written once but read many, many times: during a pull request, whenever newer team members are trying to understand it, whenever someone builds on it later, and on plenty of other occasions.

"Readability Counts" means you should put special care into optimizing for the reader rather than the writer. Google notably has a very strong affirmative stance on this (I'd recommend the book Software Engineering at Google a thousand times over).

Special cases aren't special enough to break the rules. Although practicality beats purity.

This essentially means that you should have a very good reason to break best practice, as breaking best practice generally means you're doing something wrong.

However, if you do have a good reason, valid exceptions exist. Just make sure whatever you're doing is well-justified and documented.

Errors should never pass silently. Unless explicitly silenced.

The default should be to make every error visible, and only explicitly silence something after you've logically reasoned out that it isn't an issue. The logic behind this is that it's better to have too much output but have the issue still stated, rather than have too little output and then struggle to figure out what's going on.

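In practice, this means handling exceptions deliberately. try-except statements are the tool for this, but catch only the specific exceptions you expect; a bare except: pass hides exactly the errors you most need to see.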
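Here is a minimal sketch of both halves of this rule. The two helper functions and the file name are hypothetical; contextlib.suppress is part of the standard library.

```python
import contextlib
import os


def remove_if_present(path: str) -> None:
    """Hypothetical helper: delete a file, ignoring only the 'already gone' case."""
    # Explicitly silenced: we name the one exception we decided is harmless.
    with contextlib.suppress(FileNotFoundError):
        os.remove(path)


def parse_port(raw: str) -> int:
    """Hypothetical helper: let bad input surface instead of hiding it."""
    # No try/except here on purpose. A non-numeric value raises ValueError,
    # so the caller finds out immediately.
    return int(raw)


remove_if_present("definitely-not-a-real-file.tmp")  # no error, by design
print(parse_port("8080"))  # 8080
```

Any other failure in remove_if_present, such as a PermissionError, still propagates, because only FileNotFoundError was silenced.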

In the face of ambiguity, refuse the temptation to guess.

When something breaks, it's all too easy to fall into the trap of guessing at fixes until one happens to work. Python encourages raising and catching errors consistently, which helps you methodically narrow down the precise issue and fix it rather than wander aimlessly.
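Python's own behavior gives a concrete picture of this principle. In the short sketch below, the interpreter raises an error instead of guessing what an ambiguous expression should mean, and the programmer then states the intent explicitly.

```python
# Python refuses to guess whether "1" + 1 should mean "11" or 2.
try:
    result = "1" + 1
except TypeError as error:
    print(f"TypeError: {error}")

# The programmer resolves the ambiguity explicitly instead.
as_number = int("1") + 1   # numeric addition
as_text = "1" + str(1)     # string concatenation
print(as_number, as_text)  # 2 11
```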

There should be one-- and preferably only one --obvious way to do it. Although that way may not be obvious at first unless you're Dutch.

A common distinction across software engineering is whether a framework or tool is opinionated or unopinionated. An opinionated system generally has one "correct" way to do something, whereas an unopinionated system leaves the door open to several "valid" ways.

A great example of this is Angular vs. React. Angular is a largely opinionated framework with official packages that are usually the best way to do something, which keeps everyone on the same page. React is largely unopinionated, and while there are generally still "correct" ways to do something, the framework itself doesn't inherently push any "correct" way to do things.

Leaving this door open leads to harder-to-maintain code. You're less likely to find applicable examples on Google and elsewhere in your codebase. For this reason, Python recommends that you minimize the valid ways to do something, which introduces more consistency and readability across your codebase.
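To bring this back to Python itself, here is a small sketch of two ways to label items with their position. Both work, but most Python readers would consider the second the obvious one. The sample data is made up.

```python
names = ["Ada", "Grace", "Linus"]  # made-up sample data

# Possible, but not the obvious way in Python: manual index bookkeeping.
labels_manual = []
index = 0
while index < len(names):
    labels_manual.append(f"{index}: {names[index]}")
    index += 1

# The way most Python readers expect: enumerate() in a comprehension.
labels_idiomatic = [f"{index}: {name}" for index, name in enumerate(names)]

print(labels_idiomatic)  # ['0: Ada', '1: Grace', '2: Linus']
```

When a codebase consistently uses the second form, a reader never has to stop and check whether a hand-rolled loop is doing something subtly different.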

Now is better than never. Although never is often better than right now.

The first part essentially means that it's better to get something out there that works, and then improve it, rather than focus on getting it absolutely perfect the first time.

The second part is a counterpoint: this doesn't mean you should rush everything. Your first implementation should still be good; just don't be a perfectionist.

If the implementation is hard to explain, it's a bad idea. If the implementation is easy to explain, it may be a good idea.

If you have trouble explaining your implementation to coworkers, that's a "code smell" suggesting it needs a rewrite. Even if you have to sacrifice a bit of performance or conciseness, implementing something in an easy-to-understand way helps with readability and consistency.

Namespaces are one honking great idea -- let's do more of those!

Namespacing your code is conceptually similar to the push for modularity as a whole, which helps keep your code:

Cohesive, meaning everything in one file has a single unified purpose.
Loosely coupled, meaning files interface with each other via as few "touch points" as possible.

Namespacing also helps you be more deliberate about what functions are in scope, helping avoid name conflicts.
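The standard library offers a quick illustration of namespaces preventing a name conflict. Both math and cmath define a function called sqrt, and the module prefix keeps them apart.

```python
import cmath
import math

# Both modules define sqrt(), but the module namespace keeps them separate.
print(math.sqrt(4))    # 2.0
print(cmath.sqrt(-4))  # 2j

# A wildcard import would flatten both into one namespace, and the
# later import would silently shadow the earlier one:
# from math import *
# from cmath import *   # sqrt now refers to cmath.sqrt
```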
