Jerry Ng

Posted on Jul 14, 2021 • Edited on Mar 22, 2022 • Originally published at jerrynsh.com

How To Write Clean Code in Python

#python #programming #codequality #webdev

What exactly is “clean code”? Generally speaking, clean code is code that is easy to understand and easy to change or maintain.

Since code is read more often than it is written, practicing writing clean code is crucial in our careers.

Today, I am sharing some tips that I have gathered over the years while also giving some examples in Python.

With that said, these principles should generally apply to most programming languages.

TL;DR

Be consistent when naming things
Avoid room for confusion when naming things
Avoid double negatives
Write self-explanatory code
Do not abuse comments

1. Name Things Properly

Avoid any room for confusion

Despite being the oldest trick in the book, this is the simplest rule that we often forget. Before naming a folder, function, or variable, always ask: “If I name it like this, could it mean something else or confuse other people?”

The general idea here is to always remove any room for confusion while naming anything.

# For example, you're naming a variable that represents the user’s membership:

# Example 1
# ^^^^^^^^^
# Don't
expired = True

# Do
is_expired = True

# Example 2
# ^^^^^^^^^
# Don't
expire = '2021-04-17 03:25:37.403283'

# Do
expiration_date = '2021-04-17 03:25:37.403283' # OR
expiration_date_string = '2021-04-17 03:25:37.403283'

The name expired is less than ideal because it is ambiguous on its own. A new developer working on the project wouldn’t know whether expired is a date or a boolean.
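If the value really is a point in time, one option is to store it as a datetime object rather than a string, so that the name and the type tell the same story. A minimal sketch, assuming a fixed, made-up "today" so that the result is deterministic:

```python
from datetime import datetime

# The annotation and the name both say "this is a date", not a flag.
expiration_date: datetime = datetime.fromisoformat("2021-04-17 03:25:37.403283")

# Assumption: a hard-coded "today" used only for this illustration.
today = datetime(2021, 7, 14)

# The is_ prefix signals a boolean.
is_expired: bool = expiration_date < today
print(is_expired)
```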

Be consistent with naming

Maintaining consistency throughout a team project is crucial to avoid confusion and doubts. This applies to variable names, file names, function names, and even directory structures.

Nothing should be named solely based on your individual preferences. Always check what other people already wrote and discuss it before changing anything.

# For example if the existing project names a Response object as "res" already:

# Existing functions
# ^^^^^^^^^^^^^^^^^^
def existing_function(res, var): 
  # Do something...
  pass 

def another_existing_function(res, var): 
  # Do something...
  pass 

# Example 1
# ^^^^^^^^^
# Don't
def your_new_function(response, var): 
  # Do something...
  pass 

# Do
def your_new_function(res, var): 
  # Do something...
  pass

Extra tips when choosing names

Variables are nouns (e.g. product_name).
Functions that do something are verbs (e.g. def compute_user_score()).
Boolean variables or functions returning a boolean are questions (e.g. def is_valid()).
Names should be descriptive but not overly verbose (e.g. def compute_fibonacci() rather than def compute_fibonacci_with_dynamic_programming()).
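Here is a rough sketch of how these naming tips might look together. The Product class and both functions are hypothetical, made up purely for illustration:

```python
from dataclasses import dataclass


@dataclass
class Product:
    # Variables are nouns.
    product_name: str
    unit_price: float
    stock_count: int


def is_in_stock(product: Product) -> bool:
    # Functions returning a boolean read like questions.
    return product.stock_count > 0


def compute_total_price(product: Product, quantity: int) -> float:
    # Functions that do something are verbs.
    return product.unit_price * quantity


keyboard = Product(product_name="Keyboard", unit_price=25.0, stock_count=3)
if is_in_stock(keyboard):
    total_price = compute_total_price(keyboard, quantity=2)
    print(f"{keyboard.product_name}: {total_price}")
```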

2. Avoid Double Negatives

“Can you make sure that you do not forget to not switch off the lights later?”

Ugh. So, should I switch the lights off or not? Hang on, let me read that again.

Let’s agree that a double negative is plain confusing.

# Example to check if a user's membership is valid or not:

# Don't
is_invalid = False
if not is_invalid:
    print("User's membership is valid!")

# Do
is_valid = True
if not is_valid:
    print("User's membership is invalid!")

If you have to read it more than once to be sure, it smells.
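Sometimes the negative flag comes from somewhere you cannot rename, such as a third-party payload. One possible approach is to convert it into a positive name once, at the boundary, and use only the positive name afterwards. A minimal sketch, assuming a hypothetical membership dict with an is_invalid key:

```python
def is_membership_valid(membership: dict) -> bool:
    # Flip the negative flag into a positive one in a single place.
    # Assumption: a missing flag counts as "valid" in this sketch.
    return not membership.get("is_invalid", False)


membership = {"user": "jerry", "is_invalid": False}

if is_membership_valid(membership):
    print("User's membership is valid!")
else:
    print("User's membership is invalid!")
```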

3. Write Self-Explanatory Code

In the past, I remember being told that engineers should sprinkle comments everywhere to “improve code quality.”

Those days are long gone. Instead, engineers need to write self-explanatory code that makes sense to people. For instance, we should try to capture a complicated piece of logic in a descriptively named variable.

# Don't write long conditionals
if meeting and (current_time > meeting.start_time) and (user.permission == 'admin' or user.permission == 'moderator') and (not meeting.is_cancelled):
    print('# Do something...')

# Do capture them in variables that read like English
is_meeting_scheduled = bool(meeting) and not meeting.is_cancelled
# Guard on is_meeting_scheduled so start_time is never read from a missing meeting
has_meeting_started = is_meeting_scheduled and current_time > meeting.start_time
has_user_permission = user.permission in ('admin', 'moderator')
if is_meeting_scheduled and has_meeting_started and has_user_permission:
    print('# Do something...')
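Another option is to move the whole condition into a small, well-named function. The sketch below assumes hypothetical Meeting and User classes that mirror the attributes used above; the function name is also made up for illustration:

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Meeting:
    start_time: datetime
    is_cancelled: bool = False


@dataclass
class User:
    permission: str


def can_manage_ongoing_meeting(
    meeting: Optional[Meeting], user: User, current_time: datetime
) -> bool:
    # Bail out early so we never touch attributes of a missing meeting.
    if meeting is None or meeting.is_cancelled:
        return False

    has_meeting_started = current_time > meeting.start_time
    has_user_permission = user.permission in ("admin", "moderator")
    return has_meeting_started and has_user_permission


meeting = Meeting(start_time=datetime(2021, 7, 14, 9, 0))
admin = User(permission="admin")
now = datetime(2021, 7, 14, 9, 30)

if can_manage_ongoing_meeting(meeting, admin, now):
    print("# Do something...")
```

The early return keeps each remaining line short, and the call site reads like a single question.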

Do not abuse comments

Like code itself, comments can go out of date too.

People often forget to update comments when the code gets refactored. When this happens, the comments themselves become a source of confusion.

Whenever you feel the need to write a comment, you should always re-evaluate the code you have written to see how it could be made clearer.

Examples of when to write comments

One of the scenarios where I would consider using comments is when I have to use slicing or indexing. This would raise questions like “Why do we do it this way? Why not other indexes?” and so on.

# Example of getting an email returned from a 3rd party API:

# Example 1
# ^^^^^^^^^
# Do
raw_string = get_user_info()
email = raw_string.split('|', maxsplit=2)[-1]  # NOTE: raw_string e.g. "Magic Rock|jerry@example.com"

Another example:

# Example of a function calling a random time.sleep():

# Example 2
# ^^^^^^^^^
# Don't
def create_user(user_ids):
    for user_id in user_ids:  # user_id avoids shadowing the built-in id()
        make_xyz_api_request(user_id)
        time.sleep(2)

Imagine you’re a new developer looking at the code above for the first time.

The first thing that would cross my mind is “Why are we randomly waiting two seconds for every request that we make?”

It turns out the original developer who wrote the code just wanted us to limit our number of requests sent to the third-party API.

# Do
def create_user(user_ids):
    for user_id in user_ids:  # user_id avoids shadowing the built-in id()
        make_xyz_api_request(user_id)
        time.sleep(2)  # NOTE: service 'xyz' has a rate limit of 100 requests/min, so we should slow our requests down

Always put yourself in others’ shoes (e.g. “How would others interpret my code?”). If you’re slicing or using a specific index from a list (e.g. array[3]), no one will know exactly why you are doing it unless you tell them.
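When the shape of the data is known, unpacking into named variables can sometimes remove the need for a bare index altogether. A minimal sketch, assuming the raw string always has exactly the "name|email" format shown earlier; parse_user_info is a hypothetical helper, not part of any real API:

```python
from typing import Tuple


def parse_user_info(raw_string: str) -> Tuple[str, str]:
    # Assumption: raw_string looks like "Magic Rock|jerry@example.com".
    # Unpacking raises ValueError if the separator is missing, which
    # surfaces bad input early instead of silently returning the wrong part.
    display_name, email = raw_string.split("|", maxsplit=1)
    return display_name, email


display_name, email = parse_user_info("Magic Rock|jerry@example.com")
print(email)
```

A short comment describing the expected format is still worth keeping, since the names alone do not say where the string comes from.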

How Do I Apply This Knowledge?

No one is capable of writing clean code from day one. As a matter of fact, everyone starts by writing “bad” or “ugly” code.

Like most things in life, to be good at something, you have to keep practicing over and over again. You have to put in the hours.

Besides practicing, here are the things that work for me:

Keep asking yourself questions like “Is there a better way of writing it? Is this confusing for others to read?”
Take part in code reviews.
Explore other well-written code bases. If you want some examples of well-written, clean, and Pythonic code, check out the Python requests library.
Talk to people, discuss or exchange opinions, and you will learn a lot more.

Final Thoughts

The value of writing clean code is hard to explain to a lot of non-technical people because, to them, it seems to have little to no immediate impact on the business.

Writing clean code also takes up a lot of extra time and attention, and these two factors translate to costs for businesses.

Yet, over time, a clean codebase pays off for engineers. With a cleaner codebase, engineers will be able to deliver code and deploy applications faster to meet business objectives.

On top of that, having clean code is crucial so that new collaborators or contributors can hit the ground running faster as they start on a new project.

References

Google Python Style Guide


Top comments (9)

Naufan Rusyda Faikar • Jul 14 '21 • Edited

Python "has" a maximum line length of 79 characters (per PEP 8), even though I don't always agree. Often, it is a great challenge for me to choose between writing self-explanatory code (or, as some call it, verbose naming) and following the consensus (which is all about the Python linter). But anyway, most of the time, I prefer the former to the latter. For that reason, I started relaxing the rule and expanding the limit to 100 or 120.

Edit: I forgot to mention, it's hard to move on from being a big fan of one-liners. Ha-ha-ha ...

Andres 🐍 in 🇨🇦 • Jul 14 '21

Instead of:

expiration_date = '2021-04-17 03:25:37.403283' # OR
expiration_date_string = '2021-04-17 03:25:37.403283'

I suggest this, which I think is better:

expiration_at = "2021 ..."

Arvind Padmanabhan • Jul 15 '21

Many beginners don't give importance to naming. They don't realize that many others will be reading and maintaining their code. More about naming conventions is at devopedia.org/naming-conventions

Jerry Ng • Jul 15 '21

I personally find it hard to adhere to the PEP8 max line length guideline at times, especially when dealing with relatively long strings where the line breaks could potentially look odd.

But then again this is a personal/team preference I suppose.