UponTheSky

Posted on Apr 16, 2023

[Python]Tips for writing robust Python code: functions

#python #webdev #productivity #machinelearning

Tips for writing robust Python code: functions

Remark: This article is a collection of short tips and may be updated in the future.

Aside: Why Robustness?

This is my personal opinion on why we need robustness. It may seem obvious, so if you're not interested or have heard such stories too often, feel free to skip ahead and read only the tips.

Many people coming from statically-typed, (relatively) hard-to-learn languages such as C/C++ or Java may think of Python as a simple tool for menial tasks, or as just a wrapper around lower-level libraries such as PyTorch or NumPy. Even if that is true, the community is already as big as that of JavaScript/TypeScript, and in the era of machine learning, many people now have to write large-scale projects in Python.

This is where "robustness" comes into play. You'll know why TypeScript is now a de facto standard in the web industry: everything is on the Internet nowadays, and people even write backend applications with it (using frameworks like Nest.js), so codebases have grown large enough to need that safety.

The same story applies to Python, since we're living in the era of ChatGPT. Many people now need to run machine learning algorithms on their servers, and Python is the language that has been chosen as the scripting layer abstracting the hard parts written in C and C++.

But at the same time, we also know that most tutorials on the Internet for first-time programmers don't care even a little bit about "robustness". And since many people choose Python as their first language these days, Python is still widely perceived as easy to write, suited only to hobbies or education (although this is what Guido intended), and good only for simple things. These preconceptions pull people further away from writing robust Python code.

However, if you're like me and are trying to become a Python expert, I think you should know at least the following tips, so that you won't suffer from maintaining big, messy chunks of Python that almost always produce bugs.

Small tips for robust Python functions

So how do we write robust Python code? I won't cover any design patterns, architectures, or OOP design principles. There are lots of good books and Internet materials out there. But here I'd like to introduce a few tips that are Python-specific and easy to follow without spending too much time reading.

1. Type Annotation is unavoidable.

I have already mentioned TypeScript as a substitute for plain JavaScript. Although we don't have such a thing as "Tython", since Python 3.5 we have had type hints in the form of type annotations, which look much like the type declarations in statically-typed languages such as Java.

However, be aware that they are just "hints": even if your IDE complains about a type mismatch, the Python interpreter will still run your code without raising any error because of it. Nevertheless, the hints at least let us recognize possible problems early, which is a huge benefit when maintaining a project.

def type_wrong(arg: int) -> int:
  return f"Arg {arg} is not int, and the return value is not int,\
 but this still produces results"

if __name__ == '__main__':
  print(type_wrong(1))

This function still returns a string without any error.
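To make the "just hints" point concrete, here is a minimal sketch showing that annotations are stored as plain metadata on the function object and are not checked when the function is called. The function body and the variable names `hints` and `result` are only for illustration.

```python
from typing import get_type_hints

def type_wrong(arg: int) -> int:
  # Deliberately returns a str even though the annotation says int.
  return f"got {arg}"

# The annotations are kept as metadata that tools can inspect...
hints = get_type_hints(type_wrong)

# ...but the interpreter does not enforce them at call time.
result = type_wrong("not an int")

print(hints)
print(type(result).__name__)
```

A static type checker such as mypy reads the same annotations before the program runs, which is where the mismatch would normally be reported.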

Some people will argue that Python already has a kind of type hint: the docstring. Yes, I used to adopt this technique myself (I loved the NumPy style). However,

you have to choose a certain docstring style among lots of options,
static type checkers generally won't check types written in docstrings, and IDE support for them varies by tool and docstring style (do you know of any tool that reliably parses docstrings to tell you the types of a function?),
it is way harder to read docstrings to find the types, whereas with annotations you can see them immediately, with help from your IDE.

Thus, I strongly suggest using type annotations whenever you use Python as more than a substitute for shell scripts.

def type_annotation(arg1: str, arg2: int) -> str:
  return f"{arg1} is a string and {arg2} is an integer"

def docstring(arg1, arg2):
  """
  Args:
    - arg1: str
    - arg2: int

  Returns: str
  """
  return f"{arg1} is a string and {arg2} is an integer"

Obviously, the type annotation is much easier to read.

2. ONLY use keyword arguments

This is another point that many people neglect. In statically-typed compiled languages such as C++ or Java, the compiler reports an error if you pass a value of the wrong type in the wrong position. In some modern languages such as Swift, you are by default required to write the name of each argument when you call a function. So in either case, you are pushed to make sure you are providing values of the correct types.

In Python, you can pass either positional or keyword arguments to your functions. But since Python will not raise any error before you actually run that function (unlike compiled languages), and does not force you to name the arguments when calling a function (unlike languages such as Swift), your code is prone to mistakes. If your function has only two or three arguments, you probably won't make a big mistake. But say you've got six or more arguments: then it is really hard to remember the position of every argument and its expected type.

That is not entirely the case, though. What I meant by "does not force" is that you can still write a function using only positional arguments. However, you can force a function to accept keyword arguments only: simply put a bare asterisk * in the parameter list, and every argument after it must be passed by name.

def keywords_only(*, arg1: int, arg2: str) -> int:
  # ...
  return arg1  # placeholder return value so that the example runs


if __name__ == '__main__':
  keywords_only(arg1=42, arg2="is good")

This is actually not my idea but Sandi Metz's, which I read in her OOD book. Not only does it reduce mistakes when calling functions, but it also reveals useful information about the function at the call site (acting as documentation).
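As a rough illustration of what the bare asterisk buys you, the sketch below calls a keyword-only function both ways. The function `resize` and its parameters are hypothetical names chosen for this example.

```python
def resize(*, width: int, height: int) -> tuple[int, int]:
  # Both parameters come after the bare *, so they are keyword-only.
  return (width, height)

# Naming the arguments works, and the call site documents itself.
ok = resize(width=640, height=480)

# Passing them positionally is rejected with a TypeError at call time.
try:
  resize(640, 480)
except TypeError as error:
  message = str(error)

print(ok)
print(message)
```

Note that the error still appears only when the call is executed, but it appears immediately instead of silently swapping width and height.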

Remark: there are some Python-specific techniques involved in positional/keyword arguments (including /, *args, and **kwargs). If you are someone who prefers getting information from books, I strongly recommend this book's Python function part.
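For readers who have not met these markers yet, here is a small sketch of how they fit together, assuming Python 3.8 or later (where the / marker is available). The functions `describe` and `collect` are made up for this illustration.

```python
def describe(name, /, greeting="Hello", *, punctuation="!"):
  # name        -> positional-only (it appears before the /)
  # greeting    -> may be passed positionally or by keyword
  # punctuation -> keyword-only (it appears after the *)
  return f"{greeting}, {name}{punctuation}"

def collect(*args, **kwargs):
  # *args gathers extra positional arguments into a tuple,
  # **kwargs gathers extra keyword arguments into a dict.
  return args, kwargs

a = describe("Ada")
b = describe("Ada", greeting="Hi", punctuation="?")
c = collect(1, 2, key="value")

print(a)
print(b)
print(c)
```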

3. Make your function "functional"

Here, the word "functional" doesn't carry any deep meaning. If you recall what you learned about functions in middle school, you'll agree that a function is just a simple input -> output mechanism: we put objects into the function, and in return we get the output we expect from it.

Although we can derive many principles of writing good functions from this simple definition of a function, the following is what I try to emphasize the most:

A function must try to follow this "input -> output" structure, i.e. it should not have any side effects. However, if that is unavoidable, try to restrict such occurrences to specific locations.

Why avoid side effects? Because side effects are often beyond our control. Unless we're aware of them, they can make changes in unexpected places in our product, which can lead to serious bugs and damage.

This is especially important for Python programmers, because every object in Python is passed to a function as a reference to the same object, so a mutable argument (a custom class instance, list, set, etc.) can be changed in place by the function. That means it is really easy to cause side effects inside a Python function (the same is true in other languages without explicit pointers, such as Ruby or JavaScript).
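A minimal sketch of this behavior with a plain list, before moving on to a more realistic case. The function names `append_in_place` and `append_copy` are hypothetical.

```python
def append_in_place(items: list[int]) -> list[int]:
  # Mutates the very list object the caller passed in.
  items.append(99)
  return items

def append_copy(items: list[int]) -> list[int]:
  # Builds and returns a new list, leaving the argument untouched.
  return [*items, 99]

original = [1, 2, 3]
same = append_in_place(original)

untouched = [1, 2, 3]
fresh = append_copy(untouched)

print(original, same is original)
print(untouched, fresh is untouched)
```

The first function changes the caller's data as a side effect, while the second follows the input -> output structure.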

For example, this type of mistake happens a lot in Python:

Suppose you keep price data of a few fruits.

class Fruit:
  def __init__(self, *, name: str, price: float):
    self._name = name
    self._price = price

  @property
  def name(self):
    return self._name

  @property
  def price(self):
    return self._price

  @price.setter
  def price(self, new_price: float):
    self._price = new_price

fruits_database = {
  "banana": Fruit(name="banana", price=10),
  "apple": Fruit(name="apple", price=5)
}

But for some reason, you're asked to double the original price of the banana. You then write a function called update_fruit_price.

def update_fruit_price(*, fruit: Fruit, new_price: float) -> Fruit:
  fruit.price = new_price
  return fruit

if __name__ == "__main__":
  banana = fruits_database.get("banana")
  assert banana is not None

  updated_banana = update_fruit_price(
    fruit=banana, 
    new_price=banana.price * 2
  )

So far so good? I guess not. If you check whether updated_banana and fruits_database.get("banana") are the same object, you will get the following result:

print(updated_banana is fruits_database.get("banana"))
# True

So unless you really intended to change the database itself, you're now in big trouble, since the original price is gone and there is no way to roll back your change. What if you're asked to return to the previous price?

All of these unexpected consequences occurred because the fruit object is passed by reference to the update_fruit_price function, and we didn't recognize its side effect: changing the object in the database itself.

Now that we know what's happening here, we will revise our code so that we stick to this input -> output structure.

make update_fruit_price "functional": no side effect
implement another function save_fruit: restrict the place where you make side effects

Now the code is improved as below:

def update_fruit_price(*, fruit: Fruit, new_price: float) -> Fruit:
  return Fruit(name=fruit.name, price=new_price)

FruitsDatabase = dict[str, Fruit] # type definition of our database for better code readability

def save_fruit(*, fruit: Fruit, database: FruitsDatabase) -> None:
  database[fruit.name] = fruit

if __name__ == "__main__":
  banana = fruits_database.get("banana")
  assert banana is not None

  updated_banana = update_fruit_price(
    fruit=banana, 
    new_price=banana.price * 2
  )
  assert banana is not updated_banana 
  # now the updated result is a temporary object

  save_fruit(fruit=updated_banana, database=fruits_database)

  assert fruits_database.get("banana") is updated_banana 
  # the updated object is now stored in the database

This practice can be said to be following the single responsibility principle as well, separating our concerns about updating and saving the banana data. However, we don't need to remember that kind of fancy name at all. Just keep in mind that a function should keep its input -> output structure as much as possible.

Remark: if you have experience in React development, this will be familiar to you as immutability of objects. You will find more sophisticated and mature principles in the functional programming paradigm.
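If you want Python itself to guard this immutability, one option (a different technique from the property-based Fruit class above, shown only as a sketch) is a frozen dataclass from the standard library. The class name `FrozenFruit` is hypothetical.

```python
from dataclasses import FrozenInstanceError, dataclass, replace

@dataclass(frozen=True)
class FrozenFruit:
  name: str
  price: float

banana = FrozenFruit(name="banana", price=10.0)

# replace() returns a new instance instead of modifying the old one.
doubled = replace(banana, price=banana.price * 2)

# Assigning to a field of a frozen instance raises an error.
try:
  banana.price = 0.0
except FrozenInstanceError:
  mutation_blocked = True
else:
  mutation_blocked = False

print(banana, doubled, mutation_blocked)
```

With this approach an accidental in-place update fails loudly, and the old object stays available if you need to roll back.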

4. Actively use decorators

The last tip is simpler than the previous ones. Many renowned Python frameworks use decorators extensively, so almost all Python programmers must be familiar with them. However, I doubt that many Python programmers write their own custom decorators in production code.

Decorators in Python are just syntactic sugar for function wrappers.

from typing import Callable, Any

def example_decorator(function: Callable) -> Callable:
  def _wrapped(*args, **kwargs) -> Any:
    # Do what you want here
    print("hey!!! I am called!!!")
    return function(*args, **kwargs)

  return _wrapped

# the @ syntax below is the same as writing: example = example_decorator(example)

@example_decorator
def example() -> None:
  print("yes!")

Of course, there are more details I haven't mentioned yet, such as functools.wraps or decorators with arguments (just like @pytest.mark.parametrize(...)). But overall, I want to emphasize that a decorator is mere syntactic sugar and nothing more.
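As a rough sketch of those two details, a decorator with arguments is just a function that returns a decorator, and functools.wraps copies the original function's name and docstring onto the wrapper. The names `repeat`, `greet`, and `calls` are invented for this example.

```python
import functools
from typing import Any, Callable

def repeat(*, times: int) -> Callable:
  # The outer function receives the arguments and returns the real decorator.
  def decorator(function: Callable) -> Callable:
    @functools.wraps(function)  # keep the original __name__ and __doc__
    def _wrapped(*args, **kwargs) -> Any:
      result = None
      for _ in range(times):
        result = function(*args, **kwargs)
      return result  # the value from the last call

    return _wrapped

  return decorator

calls: list[str] = []

@repeat(times=3)
def greet(name: str) -> str:
  """Record a greeting and return it."""
  calls.append(name)
  return f"hi {name}"

last = greet("Ada")
print(calls, last, greet.__name__)
```

Here @repeat(times=3) is equivalent to writing greet = repeat(times=3)(greet), so the syntactic-sugar view still holds.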

So why do we need to implement our own custom decorators? Although it may seem to make our code more complicated, it in fact reduces unnecessary repetition.

For example, if you want to measure the execution time of a function, you would probably do as below:

import time

def example() -> None:
  time.sleep(3)

if __name__ == "__main__":
  before = time.time()
  example()
  after = time.time()

  # kept inside the main block, where before and after are defined
  print(f"The execution time of {example.__name__} is {after - before} seconds")

But imagine you measure more than one function, or need to do the same job again in the future. Are you still going to type before = time.time() every time? And what if you change the measuring part, say to measure in milliseconds rather than seconds? We're facing a DRY (don't repeat yourself) problem: you should not repeat your code unless you really have no choice.

Here we have our decorator:

from typing import Callable, Any
import functools
import time
import logging

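# basicConfig attaches a handler to the root logger; without one, INFO messages are not shown
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)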

def measure_time(function: Callable) -> Callable:
  @functools.wraps(function)
  def _wrapped(*args, **kwargs) -> Any:
    before = time.time()
    return_value = function(*args, **kwargs)
    after = time.time()

    # adjacent string literals are joined, so no stray indentation ends up in the message
    logger.info(
      f"The function {function.__name__} is executed "
      f"for {after - before} seconds"
    )
    return return_value

  return _wrapped

@measure_time
def example() -> None:
  time.sleep(3)

if __name__ == "__main__":
  example()

Now you can use this measure_time decorator in any function you want to measure, while being able to change freely how you measure them only in this single place.

This is a simple example, but you can find other use cases as well, such as wrapping a function in an error-handling structure.
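The error-handling use case could look roughly like the sketch below. The decorator `log_errors`, its `default` parameter, and the function `ratio` are assumptions made for illustration, not part of any library.

```python
import functools
import logging
from typing import Any, Callable

logger = logging.getLogger(__name__)

def log_errors(*, default: Any = None) -> Callable:
  # Returns a decorator that logs any exception and falls back to `default`.
  def decorator(function: Callable) -> Callable:
    @functools.wraps(function)
    def _wrapped(*args, **kwargs) -> Any:
      try:
        return function(*args, **kwargs)
      except Exception:
        # logger.exception records the message together with the traceback
        logger.exception("%s failed", function.__name__)
        return default

    return _wrapped

  return decorator

@log_errors(default=0.0)
def ratio(*, numerator: float, denominator: float) -> float:
  return numerator / denominator

good = ratio(numerator=1, denominator=4)
bad = ratio(numerator=1, denominator=0)  # ZeroDivisionError is logged
print(good, bad)
```

Catching every exception and returning a default is a deliberate simplification here. In real code you would probably catch only the exceptions you expect, or re-raise after logging.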

Conclusion

The above tips are just tips. You might find a few of them useful for your own product, or maybe not. However, robustness is something your code must have. Simple, clear, readable, and maintainable code reduces your frustration with your project and provides a happier developer experience.
