Martin Heinz

Posted on Aug 6, 2019 • Edited on Jan 5, 2020 • Originally published at martinheinz.dev

Python Tips and Tricks You Haven't Already Seen

#python

Note: This was originally posted at martinheinz.dev

There are plenty of articles about cool Python features such as variable unpacking, partial functions, and enumerating iterables, but there is much more to talk about when it comes to Python. Here I will try to show some of the features I know and use that I haven't yet seen mentioned elsewhere. So here we go.

Sanitizing String Input

Sanitizing user input is a problem that applies to almost every program you might write. Often it's enough to convert characters to lower or upper case, and sometimes a regular expression can do the work, but for complex cases there might be a better way:

user_input = "This\nstring has\tsome whitespaces...\r\n"

character_map = {
    ord('\n') : ' ',
    ord('\t') : ' ',
    ord('\r') : None
}
user_input.translate(character_map)  # This string has some whitespaces...

In this example you can see that the whitespace characters "\n" and "\t" have been replaced by a single space and "\r" has been removed completely. This is a simple example, but we could take it much further and generate big remapping tables using the unicodedata module and its combining() function, producing a map that removes all accents from a string.
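The accent-removal idea could be sketched roughly as follows. This is only one possible approach: the helper name remove_accents and the NFD normalization step are assumptions that do not appear in the text above (normalization splits an accented letter into a base letter plus a combining mark, which the map can then delete).

```python
import sys
import unicodedata

# Map the code point of every combining character to None so that
# str.translate() deletes it. Building the table scans the whole Unicode range.
combining_map = dict.fromkeys(
    code_point
    for code_point in range(sys.maxunicode + 1)
    if unicodedata.combining(chr(code_point))
)

def remove_accents(text):
    # Hypothetical helper: decompose accented letters, then drop the marks.
    decomposed = unicodedata.normalize("NFD", text)
    return decomposed.translate(combining_map)

print(remove_accents("Cr\u00e8me br\u00fbl\u00e9e"))  # Creme brulee
```

The table is built once and can be reused for every string that needs cleaning.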

Taking Slice of an Iterator

If you try to take a slice of an iterator, you will get a TypeError stating that the generator object is not subscriptable, but there is an easy solution to that:

import itertools

s = itertools.islice(range(50), 10, 20)  # <itertools.islice object at 0x7f70fab88138>
for val in s:
    ...

Using itertools.islice we can create an islice object, which is an iterator that produces the desired items. It's important to note, though, that this consumes all the generator's items up to the start of the slice as well as all the items in the islice object itself.
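The consuming behavior is easy to miss, so here is a small sketch of it. The generator count_up is a made-up example and not part of the original snippet.

```python
import itertools

def count_up():
    # Hypothetical infinite generator: 0, 1, 2, ...
    n = 0
    while True:
        yield n
        n += 1

gen = count_up()

# Items 0-9 are consumed and discarded, items 10-14 are returned.
window = list(itertools.islice(gen, 10, 15))
print(window)  # [10, 11, 12, 13, 14]

# The underlying generator continues from where the slice stopped.
next_value = next(gen)
print(next_value)  # 15
```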

Skipping Beginning of Iterable

Sometimes you have to work with files that you know start with a variable number of unwanted lines, such as comments. itertools again provides an easy solution:

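import itertools

# The string starts directly with the first comment line. A leading empty
# line would not start with "//", so dropwhile would stop skipping at once.
string_from_file = """// Author: ...
// License: ...
//
// Date: ...

Actual content...
"""

for line in itertools.dropwhile(lambda line: line.startswith("//"), string_from_file.split("\n")):
    print(line)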

This code snippet produces only the lines after the initial comment section. This approach is useful when we only want to discard items (lines in this instance) at the beginning of the iterable and don't know how many of them there are.
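Note that dropwhile only skips the leading run of matching items. Once the predicate returns False for the first time, everything after that is passed through, including later lines that would match. A small sketch with made-up input lines:

```python
import itertools

# Hypothetical input: a comment also appears after the real content starts.
lines = ["// header", "// license", "content", "// trailing comment", "more content"]

kept = list(itertools.dropwhile(lambda line: line.startswith("//"), lines))
print(kept)  # ['content', '// trailing comment', 'more content']
```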

Functions with only Keyword Arguments (kwargs)

It can be helpful to create a function that only takes keyword arguments, which forces more clarity at the call site:

def test(*, a, b):
    pass

test("value for a", "value for b")  # TypeError: test() takes 0 positional arguments...
test(a="value", b="value 2")  # Works...

As you can see, this can be easily achieved by placing a single * argument before the keyword arguments. There can of course be positional arguments too, if we place them before the * argument.
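Mixing positional and keyword-only arguments might look like the sketch below. The function connect and its parameters are invented for illustration.

```python
def connect(host, port, *, timeout=10, retries=3):
    # host and port may be passed positionally;
    # timeout and retries must be passed by keyword.
    return f"{host}:{port} (timeout={timeout}, retries={retries})"

print(connect("localhost", 5432, timeout=5))

try:
    connect("localhost", 5432, 5)  # third positional argument is rejected
except TypeError as error:
    print(f"TypeError: {error}")
```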

Creating Object That Supports `with` Statements

We all know how to, for example, open a file or acquire a lock using the with statement, but can we actually implement our own? Yes, we can implement the context manager protocol using the __enter__ and __exit__ methods:

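class Connection:
    def __init__(self):
        ...

    def __enter__(self):
        # Initialize connection...
        return self  # The returned value is bound to the name after `as`

    def __exit__(self, exc_type, exc_value, traceback):
        # Close connection...
        ...

with Connection() as c:
    # c.__enter__() has already executed at this point
    ...
    # c.__exit__() executes when the block is left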

This is the most common way to implement context management in Python, but there is an easier way to do it:

from contextlib import contextmanager

@contextmanager
def tag(name):
    print(f"<{name}>")
    yield
    print(f"</{name}>")

with tag("h1"):
    print("This is Title.")

The snippet above implements the context management protocol using the contextmanager decorator. The first part of the tag function (before yield) is executed when entering the with block, then the block is executed, and finally the rest of the tag function is executed.
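One detail worth knowing: if the with block raises an exception, the code after yield is skipped unless the yield is wrapped in try/finally. The variant below is a sketch of that pattern. The extra output parameter is an assumption added here so the result can be inspected instead of printed.

```python
from contextlib import contextmanager

@contextmanager
def tag(name, output):
    output.append(f"<{name}>")
    try:
        yield
    finally:
        # Runs even if the body of the with block raises.
        output.append(f"</{name}>")

lines = []
try:
    with tag("h1", lines):
        lines.append("This is Title.")
        raise ValueError("something went wrong")
except ValueError:
    pass

print(lines)  # ['<h1>', 'This is Title.', '</h1>']
```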

Saving Memory with `__slots__`

If you have ever written a program that created a really big number of instances of some class, you might have noticed that your program suddenly needed a lot of memory. That is because Python uses dictionaries to represent the attributes of class instances, which makes attribute access fast but not very memory efficient. Usually that is not a problem. However, if it becomes a problem for your program, you might try using __slots__:

class Person:
    __slots__ = ["first_name", "last_name", "phone"]
    def __init__(self, first_name, last_name, phone):
        self.first_name = first_name
        self.last_name = last_name
        self.phone = phone

What happens here is that when we define the __slots__ attribute, Python uses a small fixed-size array for the attributes instead of a dictionary, which greatly reduces the memory needed for each instance. There are also some downsides to using __slots__: we can't add any new attributes to an instance and are restricted to the ones listed in __slots__. Multiple inheritance is also restricted: a class can't inherit from more than one base class that defines non-empty __slots__.
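The restriction on new attributes can be seen in a short sketch that reuses the Person class from above. The sample values and the email attribute are made up.

```python
class Person:
    __slots__ = ["first_name", "last_name", "phone"]

    def __init__(self, first_name, last_name, phone):
        self.first_name = first_name
        self.last_name = last_name
        self.phone = phone

person = Person("John", "Doe", "555-0100")

# Instances of a slotted class have no per-instance dictionary.
has_dict = hasattr(person, "__dict__")
print(has_dict)  # False

# Assigning an attribute that is not listed in __slots__ fails.
try:
    person.email = "john@example.com"
    rejected = False
except AttributeError:
    rejected = True
print(rejected)  # True
```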

Limiting CPU and Memory Usage

If, instead of optimizing your program's memory or CPU usage, you just want to limit it to some hard number, then Python has a library for that too:

import signal
import resource

# To Limit CPU time
def time_exceeded(signo, frame):
    print("CPU exceeded...")
    raise SystemExit(1)

def set_max_runtime(seconds):
    # Install the signal handler and set a resource limit
    soft, hard = resource.getrlimit(resource.RLIMIT_CPU)
    resource.setrlimit(resource.RLIMIT_CPU, (seconds, hard))
    signal.signal(signal.SIGXCPU, time_exceeded)

# To limit memory usage
def set_max_memory(size):
    soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    resource.setrlimit(resource.RLIMIT_AS, (size, hard))

Here we can see how to set both a maximum CPU time and a maximum memory limit. For the CPU limit we first get the soft and hard limits for that specific resource (RLIMIT_CPU) and then set the soft limit to the number of seconds given by the argument, keeping the previously retrieved hard limit. Finally, we register a handler for the SIGXCPU signal, which exits the program when the CPU time is exceeded. As for memory, we again retrieve the soft and hard limits and call setrlimit with the size argument (in bytes) and the retrieved hard limit. Note that the resource module is only available on Unix systems.
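To see what hitting the memory limit looks like, the sketch below applies set_max_memory inside a child interpreter, so the limit does not affect the parent process. The 1 GiB limit and the 4 GiB allocation are arbitrary numbers. The sketch assumes a Linux system whose hard RLIMIT_AS limit is at least 1 GiB; other platforms may enforce this limit differently or not at all.

```python
import subprocess
import sys

# Code run by the child interpreter: it lowers its own address-space limit
# and then attempts an allocation that cannot fit under that limit.
CHILD_CODE = """
import resource

def set_max_memory(size):
    soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    resource.setrlimit(resource.RLIMIT_AS, (size, hard))

set_max_memory(1024 ** 3)  # 1 GiB (arbitrary example value)
try:
    data = bytearray(4 * 1024 ** 3)  # try to allocate 4 GiB
except MemoryError:
    print("MemoryError")
else:
    print("allocated")
"""

result = subprocess.run(
    [sys.executable, "-c", CHILD_CODE],
    capture_output=True,
    text=True,
)
outcome = result.stdout.strip()
print(outcome)
```

Exceeding the memory limit surfaces as an ordinary MemoryError, whereas exceeding the CPU limit is delivered as the SIGXCPU signal.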

Controlling What Can Be Imported and What Not

Some languages have a very obvious mechanism for exporting members (variables, methods, interfaces), such as Golang, where only members starting with an upper-case letter are exported. In Python, on the other hand, everything (apart from names starting with an underscore) is exported unless we use __all__:

def foo():
    pass

def bar():
    pass

__all__ = ["bar"]

Using the code snippet above, we can limit what gets imported by from some_module import *. In this specific example, the wildcard import will only import bar. We can also leave __all__ empty, in which case a wildcard import brings in nothing. An AttributeError is raised only when __all__ lists a name that the module doesn't define. Note that __all__ affects wildcard imports only; from some_module import foo still works.
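This behavior can be checked without creating a file on disk. The sketch below builds some_module in memory, which is purely a demonstration trick; in real code the module would be an ordinary .py file.

```python
import sys
import types

MODULE_SOURCE = '''
def foo():
    return "foo"

def bar():
    return "bar"

__all__ = ["bar"]
'''

# Build the module in memory and register it so that it can be imported.
module = types.ModuleType("some_module")
exec(MODULE_SOURCE, module.__dict__)
sys.modules["some_module"] = module

# Run the wildcard import inside a separate namespace to see what it brings in.
namespace = {}
exec("from some_module import *", namespace)
print("bar" in namespace)  # True
print("foo" in namespace)  # False

# An explicit import is not affected by __all__.
from some_module import foo
print(foo())  # foo
```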

Comparison Operators the Easy Way

It can be pretty annoying to implement all the comparison operators for one class, considering there are quite a few of them: __lt__, __le__, __gt__, and __ge__. But what if there was an easier way to do it? functools.total_ordering to the rescue:

from functools import total_ordering

@total_ordering
class Number:
    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        return self.value < other.value

    def __eq__(self, other):
        return self.value == other.value

print(Number(20) > Number(3))  # True
print(Number(1) < Number(5))  # True
print(Number(15) >= Number(15))  # True
print(Number(10) <= Number(2))  # False

How does this actually work? The total_ordering decorator simplifies the process of implementing ordering for instances of our class. We only need to define __eq__ and one of __lt__, __le__, __gt__, or __ge__ (here __lt__), which is the minimum needed to derive the remaining operations. That's the job of the decorator: it fills in the gaps for us.
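Because the decorator fills in every missing operator, such a class also works with built-ins like sorted() and max(). The Version class below is a hypothetical example. Returning NotImplemented for foreign types is a common convention, not something the decorator requires.

```python
from functools import total_ordering

@total_ordering
class Version:
    def __init__(self, major, minor):
        self.major = major
        self.minor = minor

    def _key(self):
        # Tuples compare element by element, which gives the desired order.
        return (self.major, self.minor)

    def __eq__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self._key() == other._key()

    def __lt__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self._key() < other._key()

versions = [Version(3, 8), Version(2, 7), Version(3, 6)]
ordered = [(v.major, v.minor) for v in sorted(versions)]
newest = max(versions)

print(ordered)  # [(2, 7), (3, 6), (3, 8)]
print((newest.major, newest.minor))  # (3, 8)
print(Version(3, 8) >= Version(3, 6))  # True, __ge__ was generated
```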

Conclusion

Not all of these features are essential or useful in day-to-day Python programming, but some of them might come in handy from time to time, and they might also simplify tasks that would otherwise be quite lengthy and annoying to implement. I also want to note that all of these features are part of the Python standard library, even though some of them seem to me like pretty non-standard things to have in a standard library. So whenever you decide to implement something in Python, first go looking for it in the standard library, and if you can't find it, then you are probably not looking hard enough (if it's really not there, then it's surely in some third-party library). 🙂

Oldest comments (3)

Michael Morehouse • Aug 12 '19

The first example can be made slightly more friendly with str.maketrans:

character_map = str.maketrans({
    '\n': ' ',
    '\t': ' ',
    '\r': None,
})

There's also a shorter form, though it helps to read help(str.maketrans) to see why it works:

character_map = str.maketrans("\n\t", "  ", "\r")
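The three-argument form works because the first two strings are paired character by character, and every character in the third string is mapped to None. As a quick sketch, both forms can be checked against the dictionary from the article. The tables differ slightly in representation (the short form stores code points as replacement values), but they translate identically.

```python
user_input = "This\nstring has\tsome whitespaces...\r\n"

original_map = {ord("\n"): " ", ord("\t"): " ", ord("\r"): None}
dict_form = str.maketrans({"\n": " ", "\t": " ", "\r": None})
short_form = str.maketrans("\n\t", "  ", "\r")

expected = user_input.translate(original_map)
print(dict_form == original_map)                       # True
print(user_input.translate(dict_form) == expected)     # True
print(user_input.translate(short_form) == expected)    # True
```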

Rhet • Sep 15 '19

Thanks — very useful tips. FYI, the “Skipping beginning of iterable” example doesn’t behave as expected for me because the first line in the string is just a \n. To get it to behave as expected, I had to change your snippet to:

string_from_file = """// Author: ...
// License: ...
//
// Date: ...

Actual content...

"""

import itertools

for line in itertools.dropwhile(lambda line: line.startswith("//"), string_from_file.split("\n")):
    print(line)