DEV Community

Muhammad Syuqri

Why Python Generators?

When iterating over an iterable object such as a list, string, or dict, we can simply use a for loop to go through each element.

my_numbers = [ x for x in range(5) ]

for num in my_numbers:
    print(num)  # prints numbers from 0 to 4

We can also achieve a similar result using generators. To create one, we can write the same comprehension as a generator expression by changing the [] to () like so:

my_numbers_gen = ( x for x in range(5) )  # this is now a generator

print(my_numbers_gen)
# output >>> <generator object my_numbers_gen.<locals>.<genexpr> at 0x7f53ebcb1f20>

for num in my_numbers_gen:
    print(num)  # prints numbers from 0 to 4
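
As a quick check, the following sketch compares the list and the generator expression directly. It also shows one thing the loops above do not reveal: a generator can be consumed only once. The names first_pass and second_pass are illustrative and not part of the original example.

```python
my_numbers = [x for x in range(5)]
my_numbers_gen = (x for x in range(5))

# Collect everything the generator yields into a list.
first_pass = list(my_numbers_gen)
# A second pass finds the generator already exhausted.
second_pass = list(my_numbers_gen)

print(first_pass == my_numbers)  # True
print(second_pass)               # []
```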

We can see that my_numbers_gen is a generator object. Iterating over the generator and iterating over the list produce the same output. Why, then, would you want to use a generator?

Reasons to use a generator

Reduce memory usage when iterating over large objects

When we build and iterate over a large list, say with 1 million elements, the memory_profiler package shows the memory usage jumping from 13.992 MiB to 52.895 MiB.

syuqri@pop-os:~/Documents$ python3 -m memory_profiler test.py 
Filename: test.py

Line #    Mem usage    Increment   Line Contents
================================================
     2   13.992 MiB   13.992 MiB   @profile
     3                             def test_func():
     4   52.895 MiB    0.309 MiB       test_list = [x for x in range(10**6)]
     5   52.895 MiB    0.000 MiB       for i in test_list:
     6   52.895 MiB    0.000 MiB           pass

On the other hand, the generator version shows no increase in memory usage. This is because a generator does not load all of its values into memory at once. Instead, it produces one value at a time, only when the loop asks for the next one.

syuqri@pop-os:~/Documents$ python3 -m memory_profiler test_gen.py 
Filename: test_gen.py

Line #    Mem usage    Increment   Line Contents
================================================
     1   14.000 MiB   14.000 MiB   @profile
     2                             def test_func_gen():
     3   14.000 MiB    0.000 MiB       test_gen = (x for x in range(10**6))
     4   14.000 MiB    0.000 MiB       for i in test_gen:
     5   14.000 MiB    0.000 MiB           pass
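
memory_profiler is a third-party package, so as a rough stand-in the sketch below uses the standard library's tracemalloc module to compare peak allocations for the two versions. This is not the measurement method used above, peak_bytes is a hypothetical helper, and the exact numbers will differ between machines and Python versions.

```python
import tracemalloc


def peak_bytes(make_iterable):
    """Return the peak traced memory (in bytes) while iterating once."""
    tracemalloc.start()
    for _ in make_iterable():
        pass
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak


# The list version materialises all 1 million values before the loop starts.
list_peak = peak_bytes(lambda: [x for x in range(10**6)])
# The generator version produces one value at a time.
gen_peak = peak_bytes(lambda: (x for x in range(10**6)))

print(f"list peak:      {list_peak / 2**20:.3f} MiB")
print(f"generator peak: {gen_peak / 2**20:.3f} MiB")
```

The list version should report a peak of many MiB, while the generator version should stay close to zero.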

You need to use intermediate values from a function

In a typical function, you can have multiple return statements, but only one of them is executed on any given call, because the function exits as soon as it returns.

def normal_func():
    if True:
        return "hello"
    return "goodbye"

test = normal_func()
print(test)  # output >>> hello

When we run the above snippet, we can see the output is always "hello". What if we want both strings to be printed out?

Well, that is where generators come in handy. The main change needed to turn the above function into a generator is to use yield instead of return.

def gen_func():
    if True:
        yield "hello"
    yield "goodbye"

test = gen_func()
print(next(test))  # output >>> hello
print(next(test))  # output >>> goodbye

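Calling gen_func() does not run the function body; it returns a generator object. Each call to next() then runs the function until it reaches the next yield statement and hands back the yielded value, so we can step through the yield statements one at a time. Once no yield statements remain, next() raises StopIteration.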
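
To make the idea of intermediate values more concrete, here is a minimal sketch of a generator that reports a running total after each input. The function running_total is a hypothetical example and does not come from the original post.

```python
def running_total(numbers):
    """Yield the cumulative sum after each number is added."""
    total = 0
    for n in numbers:
        total += n
        yield total  # hand back an intermediate value, then pause here


totals = running_total([1, 2, 3])
print(next(totals))  # 1
print(next(totals))  # 3
print(next(totals))  # 6

# A further next() call has nothing left to yield.
try:
    next(totals)
except StopIteration:
    print("generator is exhausted")

# A for loop calls next() for us and stops cleanly at the end.
for value in running_total([10, 20]):
    print(value)  # 10, then 30
```

A plain function could only return the final sum, or would have to build a full list of sums first. The generator hands each partial result to the caller as soon as it is available.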
