Christian Selig

My (Least) Favorite Python Quirks

A few language ergonomics issues that have given me trouble in my few years of writing Python.

Declarations are executed

This is one that makes sense after you think about what's happening, but I think it's easy to get wrong the first time.

One example is the following:

class MyClass:
  def __init__(self, data={}):
    self._data = data

  def set(self, k, v):
    self._data[k] = v

  def get(self, k):
    return self._data.get(k)

  def remove(self, k):
    self._data.pop(k, None)

  def get_count(self):
    return len(self._data)

A = MyClass()
A.get_count() # 0
A.set('a', 1)
A.get_count() # 1

B = MyClass()
B.get_count() # 1

A.remove('a')
A.get_count() # 0
B.get_count() # 0

If you experimented further, you would see that the contents of _data are somehow being kept in sync between both instances of MyClass.

But interestingly:

C = MyClass({})
C.get_count() # 0

What happened is that Python creates classes and functions by executing the file from top to bottom, and default argument values are evaluated once, when the def statement runs, not on each call. The {} default for data in __init__ is therefore a single dict object created at definition time. Every instance that relies on the default reads from and mutates that same object, effectively making it a class variable.

This is confirmed by:

A._data is B._data # True
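The same behavior can be seen without a class at all. The following is a minimal sketch; append_to is a hypothetical function used only for illustration. It shows that the default list lives on the function object and is reused across calls:

```python
def append_to(item, bucket=[]):
    # The default list is created once, when `def` runs, and is stored
    # on the function object rather than rebuilt on each call.
    bucket.append(item)
    return bucket

first = append_to(1)
second = append_to(2)

print(first is second)         # True
print(append_to.__defaults__)  # ([1, 2],)
```

Inspecting `__defaults__` is a handy way to check whether a default has been mutated.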

Avoiding this involves a slightly clunky workaround like:

class MyClass:
  def __init__(self, data=None):
    if data is None:
      self._data = {}
    else:
      self._data = data

...
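As a rough check that the None sentinel fixes the sharing, here is a trimmed-down, runnable version of the class. Copying a caller-supplied mapping with dict(data) is an extra assumption of this sketch and not part of the fix above; it keeps instances from sharing a dict that was passed in explicitly.

```python
class MyClass:
    def __init__(self, data=None):
        # A fresh dict is created on every call. A caller-supplied
        # mapping is copied (an extra choice in this sketch) so that
        # it is not shared between instances either.
        self._data = {} if data is None else dict(data)

    def set(self, k, v):
        self._data[k] = v

    def get_count(self):
        return len(self._data)

a = MyClass()
b = MyClass()
a.set("a", 1)

print(a.get_count(), b.get_count())  # 1 0
print(a._data is b._data)            # False
```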

Scope of loop variables

Unlike many languages, "loop variables" in Python stay in the local scope after the loop is exited. This can lead to some unexpected behavior:

def fun():
  i = 0
  arr = [1, 2, 3]

  # lots of code

  for i in range(len(arr)):
    # do some stuff
    pass

  # more code

  return arr[i]

fun() # returns 3

While this example is pretty contrived, the general idea is that this behavior of loop variables can lead to weird and hard-to-track-down bugs (here, the loop silently rebinds a local variable that was assigned earlier in the function).
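For contrast, comprehensions in Python 3 do get their own scope. The sketch below uses a hypothetical function, leak_demo, to put the two side by side: a for loop rebinds the outer name, while a list comprehension leaves it alone.

```python
def leak_demo():
    i = "original"
    for i in range(3):
        pass
    # The loop rebound `i`; the earlier value is gone.
    after_loop = i

    j = "original"
    squares = [j * j for j in range(3)]
    # The comprehension's `j` is scoped to the comprehension,
    # so the outer `j` is untouched.
    after_comprehension = j

    return after_loop, after_comprehension, squares

print(leak_demo())  # (2, 'original', [0, 1, 4])
```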

Another complication: if there's nothing to loop over, the loop variable is never assigned at all. For example, the first call in main() below runs without error, but the second raises UnboundLocalError (a subclass of NameError) because x is referenced before it has been assigned.

def fun(l):
  for x in l:
    print(x)
  return x

def main():
  fun([1, 2, 3])
  fun([])
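One common guard is to bind the name before the loop so that it exists even when the iterable is empty. This is a minimal sketch; last_item, unguarded and the default parameter are hypothetical names chosen for illustration.

```python
def last_item(items, default=None):
    # Bind the name before the loop so it exists even when
    # `items` is empty.
    last = default
    for last in items:
        pass
    return last

def unguarded(items):
    for x in items:
        pass
    return x  # never assigned if `items` was empty

print(last_item([1, 2, 3]))  # 3
print(last_item([]))         # None

try:
    unguarded([])
except UnboundLocalError as exc:
    # UnboundLocalError is a subclass of NameError.
    print(type(exc).__name__, isinstance(exc, NameError))  # UnboundLocalError True
```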

Other interpreted, non-curly-brace languages avoid these troubles. In Ruby, for example, block parameters are local to the block:

3.times do |x|
  print x
end

x # NameError (undefined local variable or method `x' for main:Object)

Function chaining is hard

When it comes to working with data structures, Python's built-in functions and methods aren't very consistent. Take lists, for example: the list class has methods that change it in place, but for many operations you'll need list comprehensions, which return new lists and have their own syntax. For other operations you might want filter and map, which are built-in functions rather than list methods. There are other scattered inconsistencies too, such as the fact that join, which is mostly used to combine a list of strings, is a string method rather than a list method.

As an example:

def slugify(string, bad_words):
  words = string.split()
  words = [w.lower() for w in words if w.isalpha() and w not in bad_words]
  return "-".join(words)

slugify("My dang test 1 string", {"dang", "heck"}) # "my-test-string"

For people familiar with Python this probably looks natural, but even so the code requires careful reading to understand each transformation the function performs (and this is a simple example).
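It is possible to approximate a pipeline style in Python with a small helper. The sketch below assumes a hypothetical pipe function built on functools.reduce; nothing like it ships in the standard library under that name, which is part of the point.

```python
from functools import reduce

def pipe(value, *steps):
    """Feed `value` through each function in `steps`, left to right."""
    return reduce(lambda acc, step: step(acc), steps, value)

def slugify(string, bad_words):
    return pipe(
        string.split(),
        lambda ws: (w for w in ws if w.isalpha()),         # keep alphabetic words
        lambda ws: (w for w in ws if w not in bad_words),  # drop bad words
        lambda ws: (w.lower() for w in ws),                # lowercase
        "-".join,                                          # join with dashes
    )

print(slugify("My dang test 1 string", {"dang", "heck"}))  # my-test-string
```

Each step is now on its own line, but the lambdas and generator expressions add noise that a method chain would not need.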

In other languages this kind of code is more natural. Ruby for example has more consistent methods that allow for chaining:

def slugify(string, bad_words)
  string.split()
    .map(&:downcase)
    .select { |w| w.match(/^[[:alpha:]]+$/) }
    .select { |w| !bad_words.include? w }
    .join("-")
end

slugify("My dang test 1 string", Set["dang", "heck"]) # "my-test-string"

As an example from the functional world, Clojure supports function chaining through consistent sequence APIs and the threading macros -> and ->>:

(require '[clojure.string :as str])

(defn slugify [string bad-words]
  (->> (str/split string #" ")
       ;; + rather than * so that empty tokens are dropped too
       (filter #(re-matches #"^[a-zA-Z]+$" %))
       (remove bad-words)
       (map str/lower-case)
       (str/join "-")))

(slugify "My dang test 1 string" (set '("dang" "heck"))) ; "my-test-string"