Python beyond PEP8

Eddy Ernesto del Valle Pino ・3 min read

If you write Python and don't know what PEP8 is, go and check it out now.

PEP8 is the style guide for Python code, and I think it is quite good. I strongly encourage people to run a linter and a static analyzer as the first step in CI/CD systems, before the tests: it works as a smoke test and keeps things tidy (at the micro level; I'm not talking about architecture here).

But the guideline falls short in some places, where it promotes styles that are fragile and could be improved:

Fragile indentation

This style is accepted by PEP8, but it is fragile.

# Aligned with opening delimiter.
foo = long_function_name(var_one, var_two,
                         var_three, var_four=4)

Something is fragile when it breaks easily, and this indentation is quite easy to break: change the name of foo or long_function_name and the parameters on the second line are instantly misaligned.

# No longer aligned with the opening delimiter after the rename.
foo = refactoring(var_one, var_two,
                         var_three, var_four=4)

Some editors will correct this for you automatically, but not everybody has one. I also sometimes find code abandoned in this state, as if someone had to escape a building on fire.

It is also not very pleasant when several long function calls sit together, looking something like:

# Aligned with opening delimiter.
foo = long_function_name(var_one, var_two,
                         var_three, var_four=4)
barf = another_long_function_name(var_one, var_two,
                                  var_three, var_four=4)

The number of spaces is irregular even though both continuation lines are semantically at the same level. It is just visually misleading.

Instead, we can list the parameters of the function one per line, with trailing commas:

# One argument per line, with trailing commas.
foo = long_function_name(
    var_one,
    var_two,
    var_three,
    var_four=4,
)

In this case, when you rename the function, add or remove parameters, or rename the parameters, the rest of the indentation does not break. If you add or remove a parameter, the diff in code review shows a single-line change per modified parameter:

 # One argument per line, with trailing commas.
 foo = long_function_name(
     var_one,
     var_two,
-    var_three,
     var_four=4,
+    another_var=5,
 )

The changes are easy to identify in a side-by-side comparison.
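
To make the diff argument concrete, here is a minimal sketch that counts changed lines with the standard library's difflib. The helper name changed_lines and the sample snippets are made up for illustration, and the exact numbers depend on how your diff tool pairs lines.

```python
import difflib

# Style aligned with the opening delimiter, before and after a rename.
ALIGNED_BEFORE = """\
foo = long_function_name(var_one, var_two,
                         var_three, var_four=4)
"""
ALIGNED_AFTER = """\
foo = refactoring(var_one, var_two,
                  var_three, var_four=4)
"""

# One argument per line, before and after the same rename.
HANGING_BEFORE = """\
foo = long_function_name(
    var_one,
    var_two,
    var_three,
    var_four=4,
)
"""
HANGING_AFTER = """\
foo = refactoring(
    var_one,
    var_two,
    var_three,
    var_four=4,
)
"""


def changed_lines(before: str, after: str) -> int:
    """Count the lines added or removed in a unified diff (hypothetical helper)."""
    diff = list(
        difflib.unified_diff(
            before.splitlines(),
            after.splitlines(),
            lineterm="",
        )
    )
    body = diff[2:]  # skip the two "---" / "+++" file header lines
    return sum(1 for line in body if line.startswith(("+", "-")))


print(changed_lines(ALIGNED_BEFORE, ALIGNED_AFTER))  # 4: both lines are rewritten
print(changed_lines(HANGING_BEFORE, HANGING_AFTER))  # 2: only the first line changes
```

With the aligned style the rename touches every continuation line; with one argument per line it touches only the line that actually changed.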

Line continuation and breaks

In Python, if a line is too long you can use \ to continue on the next line. To my taste this looks kind of ugly, and you always have to keep all those breaks in mind, in imports for example:

from django.db.models.expressions import Case, Exists, Expression, \
    ExpressionList, ExpressionWrapper, F, Func, OuterRef, RowRange, Subquery, \
    Value, ValueRange, When, Window, WindowFrame

If you have to delete or rename things there, the line lengths will look funny and you may have to re-wrap the whole text and rearrange all the backslashes.

So why not use parentheses as a grouping mechanism, like in math:

from django.db.models.expressions import (
    Case, Exists, Expression, ExpressionList, ExpressionWrapper, F, Func,
    OuterRef, RowRange, Subquery, Value, ValueRange, When, Window, WindowFrame,
)

Or also:

from django.db.models.expressions import (
    Case,
    Exists,
    Expression,
    ExpressionList,
    ExpressionWrapper,
    F,
    Func,
    OuterRef,
    RowRange,
    Subquery,
    Value,
    ValueRange,
    When,
    Window,
    WindowFrame,
)

I don't like this last one because it is too long and imports usually don't change that often, but it is also acceptable and has all the advantages of the first example I showed above: it is refactoring-proof.
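
The Django imports need Django installed, so here is a small sketch of the same idea using only the standard library. The names total_seconds and word_counts are assumptions for illustration. Inside parentheses Python joins the lines implicitly, so no backslashes are needed, for imports or for any other expression.

```python
# Parenthesized import: everything up to the closing parenthesis
# is a single logical line.
from collections import (
    Counter,
    OrderedDict,
    defaultdict,
)

# The same implicit line joining works for ordinary expressions.
total_seconds = (
    2 * 60 * 60  # hours
    + 30 * 60    # minutes
    + 15         # seconds
)

word_counts = Counter(["spam", "eggs", "spam"])

print(total_seconds)        # 9015
print(word_counts["spam"])  # 2
```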

When calling something too long:

posts = (
    Posts.objects
    .exclude(author=user)
    .filter(title__startswith='hello')
    .first()
)
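
The query above needs a Django project to run, so here is the same layout applied to plain string methods, as a sketch you can paste into a REPL. The title and slug variables are assumptions for illustration.

```python
title = "  Python beyond PEP8  "

# One method call per line; the parentheses make the line breaks legal.
slug = (
    title
    .strip()
    .lower()
    .replace(" ", "-")
)

print(slug)  # python-beyond-pep8
```

Adding, removing, or reordering a step in the chain is again a one-line change in the diff.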

When writing a long if:

if (not previous_is_grouped
        and not prev.has_two_images
        and nextone
        and not nextone.has_two_images):
    do_evil_stuff()

Here I don't like the visual break between the first line of the if and the continuation lines, but it's fine.
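
One possible way around that visual break, offered only as a sketch and not as the article's recommendation, is to name the condition first, using the same parenthesized layout as the other examples. The Image class and the should_group function are hypothetical stand-ins, and the check on nextone is written as "is not None" here so that the result is always a bool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Image:
    """Hypothetical stand-in for the objects in the example above."""
    has_two_images: bool


def should_group(
    previous_is_grouped: bool,
    prev: Image,
    nextone: Optional[Image],
) -> bool:
    # Every operand sits at the same indentation level.
    can_group = (
        not previous_is_grouped
        and not prev.has_two_images
        and nextone is not None
        and not nextone.has_two_images
    )
    return can_group


print(should_group(False, Image(False), Image(False)))  # True
print(should_group(False, Image(False), None))          # False
```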

Also when writing a long list comprehension:

double_of_evens = [
    x * 2
    for x in range(100)
    if x % 2 == 0
]

The three lines are, in order: mapping, iteration, and filtering.
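
As a quick check of that reading, here is a sketch that labels each line of the comprehension and compares it with an explicit loop. The expected list is only there for the comparison.

```python
double_of_evens = [
    x * 2                # mapping: what each result looks like
    for x in range(100)  # iteration: where the values come from
    if x % 2 == 0        # filtering: which values are kept
]

# The same computation as an explicit loop, for comparison.
expected = []
for x in range(100):
    if x % 2 == 0:
        expected.append(x * 2)

print(double_of_evens == expected)  # True
print(double_of_evens[:5])          # [0, 4, 8, 12, 16]
```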

Closing

General style guidelines for a language and a whole community are amazing, and I love them. But don't follow them blindly: if you think there is a real reason to do things a little bit better, take that step.

Think about all the abandoned code of people that had to flee buildings in flames. LOL 😂.

Posted by:

Eddy Ernesto del Valle Pino

@edelvalle

Cuban software writer living in Berlin, working in the renewable energy space.

Discussion


Yeah, PEP8 is a little limited. Fortunately there are automated ways to go beyond that.

My favorite tool is black. I think it's better than the pep8 formatter (which as you highlighted is limited) and yapf (which is sometimes weird).

You can integrate it with pretty much any tool (editor, CI, git hooks...) and forget about most of those issues. Give it a try!

 

Worst disadvantage of black? It forces double quotes on you unless you disable quote management altogether.

 

Surrender to the robots :-)

Prettier for JS defaults to double quotes
Rubocop for Ruby defaults to double quotes
black for Python defaults to double quotes and gives a good reason for it:

The main reason to standardize on a single form of quotes is aesthetics. Having one kind of quotes everywhere reduces reader distraction. It will also enable a future version of Black to merge consecutive string literals that ended up on the same line (see #26 for details).

Why settle on double quotes? They anticipate apostrophes in English text. They match the docstring standard described in PEP 257. An empty string in double quotes ("") is impossible to confuse with a one double-quote regardless of fonts and syntax highlighting used. On top of this, double quotes for strings are consistent with C which Python interacts a lot with.

Rubocop defaults to single quotes because double quotes do more escaping and interpolation.

docs.rubocop.org/en/latest/cops_st...

Thanks Clayton, I remembered it wrongly!

 

I didn't know... Will take a look into it.. even if it enforces double quotes... hahahaha

 

Does this apply to string literals?
Let's say you have long strings like:

err_obj = {
               "status": 400,
               "error": "images should be in uri format(http://img.png)"
         }
 

I would write that as:

error = {
    "status": 400,
    "error": "images should be in uri format(http://img.png)"
}

Considerations:

  1. 4-space indentation.
  2. Closing bracket at the same indentation level as the line containing the opening bracket.
  3. The name err_obj: everything in Python is an object, so naming an object "object" does not explain its intention or meaning, just like naming a view "view" or a class "class". That leaves us with err, and it is not much to ask to write the full word: "err" -> "error", especially when the word is this short.

Remember, we think we spend most of our time power typing, but that's not true: in reality we spend most of the time reading code, staring into the abyss. Make your code clear and explicit, avoid useless redundancy, and leave no room for misinterpretation.

 

Thanks I'll make changes to my code.

However, I was asking about really long strings, e.g. the one in my error object, or a regex string. Is there a way to handle exceptionally long strings without using \ and a newline?

Yes, you can do something like:

some_string = (
    "Bla bla bla bla. "
    "More bla bla bla bla. "
    "More bla bla bla bla"
)

Or use multi-line strings, but they are usually inconvenient because they pick up the spaces from the indentation.

But keep in mind that this is mostly subjective and a matter of taste, so do whatever you feel fits you and your team.
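
A small sketch of the two options mentioned in this reply, using only the standard library. The message text and variable names are assumptions for illustration. It shows the indentation that a triple-quoted string picks up, and one way to strip it with textwrap.dedent.

```python
import textwrap


def build_messages():
    # Adjacent string literals are joined into one string at compile time.
    concatenated = (
        "images should be in uri format "
        "(http://img.png)"
    )

    # A triple-quoted string keeps the indentation of the source code.
    raw_multiline = """
        images should be in uri format
        (http://img.png)
    """

    # dedent removes the common leading whitespace; strip trims the
    # blank first and last lines.
    dedented = textwrap.dedent(raw_multiline).strip()
    return concatenated, raw_multiline, dedented


concatenated, raw_multiline, dedented = build_messages()
print(concatenated)         # images should be in uri format (http://img.png)
print(repr(raw_multiline))  # shows the leading spaces on every line
print(dedented)
```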

 

Recently I was writing a contributing guide for a medium-sized project at work. There are some Python newbies (I consider myself a newbie as well), and I need to explain some coding guidelines to them. I will definitely send them this article. Greetings from Holguín, Cuba.

 

A pleasure, Ozkar. Long time no hear from you!

 

That's some general good style practice, but I'm going to have to disagree on the mega long import. What's wrong with just importing the module you want to use (from django.db.models import expressions)? That way you can more easily avoid name clashes.

 

I'm fine also with that... this was just to illustrate
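
For readers curious about the name-clash point raised in this last exchange, here is a minimal sketch using two standard library modules that both define sqrt. It stands in for the Django example, which needs Django installed.

```python
import cmath
import math

# Both modules define sqrt. Importing the modules and qualifying the
# calls keeps the two functions apart, where
# "from math import sqrt" followed by "from cmath import sqrt"
# would silently leave only the second one bound to the name sqrt.
real_root = math.sqrt(4)
complex_root = cmath.sqrt(-4)

print(real_root)     # 2.0
print(complex_root)  # 2j
```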