Writing clean, Pythonic code is all about making it as understandable, yet concise, as possible. This is the first part of a series on Python refactorings, based on those that can be done automatically by Sourcery. Here I am focusing on why these changes are good ideas, not just on how to do them.
1. Merge nested if conditions
Too much nesting can make code difficult to understand, and this is especially true in Python, where there are no brackets to help out with the delineation of different nesting levels.
Reading deeply nested code is confusing, since you have to keep track of which conditions relate to which levels. We therefore strive to reduce nesting where possible, and the situation where two `if` conditions can be combined using `and` is an easy win.
Before:
if a:
    if b:
        return c
After:
if a and b:
    return c
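As a rough check that the two forms behave the same, here is a minimal sketch. The function names `nested` and `merged` are made up for illustration. Note that the merge is only safe when the outer `if` contains nothing but the inner `if`, and neither of them has an `else` branch.

```python
from itertools import product


def nested(a, b, c):
    # Original shape: two levels of nesting guarding a single outcome.
    if a:
        if b:
            return c
    return None


def merged(a, b, c):
    # Merged shape: `and` short-circuits, so `b` is only evaluated
    # when `a` is truthy, just as in the nested version.
    if a and b:
        return c
    return None


# Compare both versions over every combination of True/False inputs.
results_match = all(
    nested(a, b, "c") == merged(a, b, "c")
    for a, b in product([True, False], repeat=2)
)
```

If there is any code between the two `if` lines, or an `else` attached to either one, the conditions can't simply be joined with `and`.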
2. Hoist repeated code outside conditional statement
We should always be on the lookout for ways to remove duplicated code, and hoisting is a nice way of doing so.
Sometimes code is repeated on both branches of a conditional. This means that the code will always execute. The duplicate lines can be hoisted out of the conditional and replaced with a single line.
if sold > DISCOUNT_AMOUNT:
    total = sold * DISCOUNT_PRICE
    label = f'Total: {total}'
else:
    total = sold * PRICE
    label = f'Total: {total}'
By taking the assignment to `label` outside of the conditional we have removed a duplicate line of code, and made it clearer what the conditional is actually controlling, which is the `total`.
if sold > DISCOUNT_AMOUNT:
    total = sold * DISCOUNT_PRICE
else:
    total = sold * PRICE
label = f'Total: {total}'
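To try the hoisted version out, it can be wrapped in a function. This is only a sketch: the constant values and the `make_label` name are assumptions chosen for illustration, not taken from any real pricing code.

```python
# Assumed values, purely for illustration.
DISCOUNT_AMOUNT = 10  # units sold above which the discount applies
DISCOUNT_PRICE = 8    # price per unit with the discount
PRICE = 10            # regular price per unit


def make_label(sold):
    # The conditional now controls only the thing that differs: the total.
    if sold > DISCOUNT_AMOUNT:
        total = sold * DISCOUNT_PRICE
    else:
        total = sold * PRICE
    # Hoisted: built once, whichever branch ran.
    label = f'Total: {total}'
    return label
```

With these assumed values, `make_label(5)` takes the regular-price branch and `make_label(20)` takes the discounted one, and both go through the same single line that builds the label.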
3. Replace yield inside for loop with yield from
One little trick that often gets missed is that Python's `yield` keyword has a corresponding `yield from` for collections and other iterables, so there's no need to write a `for` loop just to yield each item in turn. This makes the code slightly shorter and removes the mental overhead and extra variable used by the `for` loop. Eliminating the `for` loop also makes the `yield from` version about 15% faster.
Before:
def get_content(entry):
    for block in entry.get_blocks():
        yield block
After:
def get_content(entry):
    yield from entry.get_blocks()
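Here is a small self-contained sketch showing that both generators produce the same items. The `Entry` class is a hypothetical stand-in for whatever `entry` is in real code, and `get_content_loop` is just the "before" version renamed so the two can sit side by side.

```python
class Entry:
    """Hypothetical stand-in for the `entry` object used above."""

    def __init__(self, blocks):
        self._blocks = blocks

    def get_blocks(self):
        return self._blocks


def get_content_loop(entry):
    # Before: an explicit loop that re-yields every block.
    for block in entry.get_blocks():
        yield block


def get_content(entry):
    # After: delegate straight to the iterable.
    yield from entry.get_blocks()


entry = Entry(["intro", "body", "footer"])
loop_result = list(get_content_loop(entry))
delegated_result = list(get_content(entry))
```

Both lists end up holding the same three blocks in the same order.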
4. Use any() instead of for loop
A common pattern is that we need to find if some condition holds for one or all of the items in a collection. This can be done with a for loop such as this:
found = False
for thing in things:
    if thing == other_thing:
        found = True
        break
A more concise way, that clearly shows the intentions of the code, is to use Python's `any()` and `all()` built-in functions.
found = any(thing == other_thing for thing in things)
`any()` will return `True` when at least one of the elements evaluates to `True`; `all()` will return `True` only when all the elements evaluate to `True`.
These will also short-circuit execution where possible. If the call to `any()` finds an element that evaluates to `True` it can return immediately. This can lead to performance improvements if the code wasn't already short-circuiting.
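The short-circuiting can be observed with a small sketch. The `is_match` helper and the `seen` list are made up here purely to record which elements actually get checked.

```python
def is_match(thing, other_thing, seen):
    # Record every element that gets compared, then do the comparison.
    seen.append(thing)
    return thing == other_thing


things = [1, 2, 3, 4, 5]
seen = []

# The generator expression is consumed lazily, so any() stops
# asking for elements as soon as one comparison is True.
found = any(is_match(thing, 3, seen) for thing in things)

# all() works the same way, stopping at the first False.
all_positive = all(thing > 0 for thing in things)
```

After this runs, `seen` holds only `[1, 2, 3]`: the elements `4` and `5` were never compared. This relies on passing a generator expression. Wrapping the condition in square brackets would build the whole list first, and every element would be checked before `any()` even started.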
5. Replace list() with []
The most concise and Pythonic way to create a list is to use the `[]` notation.
x = []
This fits in with the way we create lists with elements, saving a bit of mental energy that might be taken up with thinking about two different ways of creating lists.
x = ['first', 'second']
Doing things this way has the added advantage of being a nice little performance improvement.
Here are the timings before and after the change:
$ python3 -m timeit "x = list()"
5000000 loops, best of 5: 63.3 nsec per loop
$ python3 -m timeit "x = []"
20000000 loops, best of 5: 15.8 nsec per loop
Similar reasoning and performance results hold for replacing `dict()` with `{}`.
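To reproduce the comparison from inside Python rather than the shell, a sketch along these lines should work with the standard `timeit` module. The repeat count is an arbitrary choice, and the numbers will vary with your machine and Python version, so treat the output as indicative only.

```python
import timeit

# Arbitrary number of repetitions for this sketch.
REPEATS = 100_000

# Total seconds taken to run each statement REPEATS times.
timings = {
    "list()": timeit.timeit("x = list()", number=REPEATS),
    "[]": timeit.timeit("x = []", number=REPEATS),
    "dict()": timeit.timeit("x = dict()", number=REPEATS),
    "{}": timeit.timeit("x = {}", number=REPEATS),
}

for statement, seconds in timings.items():
    # Convert the total into nanoseconds per single execution.
    print(f"{statement:>6}: {seconds / REPEATS * 1e9:.1f} nsec per loop")
```

The difference comes from the fact that `list()` and `dict()` involve looking up a name and calling it, while the literal forms are built directly.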
6. Hoist statements out of for/while loops
Another type of hoisting is pulling invariant statements out of loops. If a statement just sets up some variables for use in the loop, it doesn't need to be inside it.
Loops are inherently complex, so making them shorter and easier to understand should be on your mind while writing them.
In this example the `city` variable gets assigned inside the loop, but it is only read and not altered.
for building in buildings:
    city = 'London'
    addresses.append((building.street_address, city))
It's therefore safe to hoist it out, and this makes it clearer that the same `city` value will apply to every `building`.
city = 'London'
for building in buildings:
    addresses.append((building.street_address, city))
This also improves performance - any statement in a loop is going to be executed every time the loop runs. The time spent on these multiple executions is being wasted, since it only needs to be executed once. This saving can be significant if the statements involve calls to databases or other time-consuming tasks.
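A runnable sketch of the hoisted loop is below. The `Building` dataclass and the sample addresses are hypothetical, and the street address and city are stored together as a tuple, which is an assumption about the intended shape of `addresses` (a list's `append()` accepts exactly one argument).

```python
from dataclasses import dataclass


@dataclass
class Building:
    # Hypothetical minimal model with just the attribute the loop reads.
    street_address: str


buildings = [Building("1 High Street"), Building("2 Station Road")]
addresses = []

# Hoisted: assigned once, since the value never changes inside the loop.
city = 'London'
for building in buildings:
    # Each entry is a (street_address, city) tuple.
    addresses.append((building.street_address, city))
```

With the setup out of the way, the loop body is a single line, which makes it easier to see what actually varies from one iteration to the next.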
Conclusion
As mentioned, each of these is a refactoring that Sourcery can automatically perform for you. We're planning on expanding this blog series and linking the posts in as additional documentation, with the aim of turning Sourcery into a great resource for learning how to improve your Python skills. If you have any thoughts on how to improve Sourcery or its documentation please do email us or hit me up on Twitter.