DEV Community

Dandy Vica

A gentle introduction to Python generators

When I first met Python generators, I found them quite obscure and hard to understand. I didn't find a clear introduction to them, or maybe I didn't search hard enough. That's why I wrote this article: to get straight to the gist of those Python beasts.

Python generator definition

A Python generator is:

  • a Python function or method
  • which returns an iterator (a generator object) when called
  • which remembers where it left off between calls (stateful)
  • and hands data back to its caller using the yield keyword

A simple example to start

Consider this function:

def generator1():
    yield 1
    yield 2
    yield 3

Calling this function directly simply returns a generator object:

>>> generator1()
<generator object generator1 at 0x7fac361d8bf8>

The __iter__() and __next__() methods are automatically implemented:

>>> gen1 = generator1()
>>> dir(gen1)
['__class__', '__del__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__iter__', '__le__', '__lt__', '__name__', '__ne__', '__new__', '__next__', '__qualname__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'close', 'gi_code', 'gi_frame', 'gi_running', 'gi_yieldfrom', 'send', 'throw']

and usable:

>>> next(gen1)
1
>>> next(gen1)
2
>>> next(gen1)
3
>>> next(gen1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

StopIteration is raised because next() has already been called 3 times on the generator object and there's nothing more to yield.
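
If you'd rather not get a traceback, here is a minimal sketch of two ways to deal with exhaustion: the built-in next() accepts an optional default value that is returned instead of raising StopIteration, and the exception can also be caught explicitly. generator1 is the function defined above, repeated so the snippet runs on its own; the "done" value is just an illustrative choice.

```python
def generator1():
    yield 1
    yield 2
    yield 3

gen1 = generator1()
values = [next(gen1), next(gen1), next(gen1)]

# the generator is now exhausted: a bare next(gen1) would raise StopIteration,
# but the two-argument form returns the default instead
after_end = next(gen1, "done")

# catching the exception explicitly works too
try:
    next(gen1)
    exhausted = False
except StopIteration:
    exhausted = True

print(values, after_end, exhausted)
```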

Being iterable, a generator object can be passed directly to built-in functions like list():

>>> gen1 = generator1()
>>> list(gen1)
[1, 2, 3]
>>> list(gen1)
[]

You can see that the first call to list() exhausted the generator object, so the second call returns an empty list.

The same goes for the tuple() and set() built-in functions:

>>> gen1 = generator1()
>>> tuple(gen1)
(1, 2, 3)
>>> gen1 = generator1()
>>> set(gen1)
{1, 2, 3}

Of course, the for ... in construct works here too:

gen1 = generator1()

# this will print 1, 2 and 3, one per line
for i in gen1:
    print(i)

Moving beyond

The above example was only meant to help you understand the yield mechanism.

We can go further and pass parameters to the generator function:

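from typing import Iterator

# yields the Fibonacci numbers <= fib_max
# (assumes fib_max >= 1, since 0 and 1 are always yielded)
def fibonacci1(fib_max: int) -> Iterator[int]: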
    # initial values
    fib_n_2 = 0
    fib_n_1 = 1
    yield fib_n_2
    yield fib_n_1

    # now general case
    fib_n = fib_n_2 + fib_n_1
    while fib_n <= fib_max:
        yield fib_n
        fib_n_2 = fib_n_1
        fib_n_1 = fib_n
        fib_n = fib_n_2 + fib_n_1

# gives: [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987]
print(list(fibonacci1(1000)))

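Note this is NOT a recursive function. It simply pauses at each yield, much like at a breakpoint, and resumes from there on the next call. Only the current state is kept around, which is efficient in terms of both memory and stack usage.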
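
To make this pause-and-resume behaviour visible, here is a small sketch. The traced() generator and the trace list are hypothetical names introduced only for this illustration: the generator records each step of its own execution so we can check what has run after every call to next().

```python
trace = []

def traced():
    # hypothetical generator that logs each step of its own execution
    trace.append("started")
    yield "a"
    trace.append("resumed")
    yield "b"
    trace.append("finished")

gen = traced()
before_first_next = list(trace)   # nothing logged: calling the function runs none of its body

first = next(gen)                 # runs the body up to the first yield
after_first_next = list(trace)

second = next(gen)                # resumes right after the first yield
after_second_next = list(trace)

rest = list(gen)                  # runs the remaining code; no more values are yielded

print(first, second, rest, trace)
```

Each call to next() runs just enough of the body to reach the following yield, which is why no recursion or growing call stack is involved.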

It's easy to modify the previous function to yield infinite values:

from itertools import islice
from typing import Iterator

# yields an infinite sequence of Fibonacci numbers
def fibonacci2() -> Iterator[int]:
    # initial values
    fib_n_2 = 0
    fib_n_1 = 1
    yield fib_n_2
    yield fib_n_1

    # now general case
    fib_n = fib_n_2 + fib_n_1
    while True:
        yield fib_n
        fib_n_2 = fib_n_1
        fib_n_1 = fib_n
        fib_n = fib_n_2 + fib_n_1

# gives: [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987]
print(list(islice(fibonacci2(), 17)))
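
islice() cuts an infinite generator after a fixed number of values. As a hedged alternative sketch, itertools.takewhile() cuts it as soon as a condition stops being true. The fib() function below is a compact, hypothetical variant of fibonacci2 written only so that the snippet runs on its own.

```python
from itertools import takewhile
from typing import Iterator

def fib() -> Iterator[int]:
    # compact, hypothetical variant of fibonacci2: same infinite sequence
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

# stop at the first value greater than 1000 instead of after a fixed count
below_1000 = list(takewhile(lambda n: n <= 1000, fib()))
print(below_1000)
```

Without islice() or takewhile(), calling list() on an infinite generator would never finish.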

Python generator expressions

Generator expressions are akin to list comprehensions, at least as far as syntax is concerned.

They are used to create generator objects with a simple expression rather than a function, but they are less flexible and less powerful:

# first 100 squares
squares_gen = (x*x for x in range(100))

# the values are only computed here, when the generator is consumed
squares = list(squares_gen)

They are lazily evaluated, meaning each value is computed only when it is actually requested.
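
One way to observe this laziness is sketched below. The square() helper and the calls list are hypothetical, introduced only to record when the work actually happens.

```python
calls = []

def square(x):
    # hypothetical helper that records every call, so we can see when work happens
    calls.append(x)
    return x * x

squares_gen = (square(x) for x in range(5))
calls_after_creation = list(calls)   # still empty: no square has been computed yet

first = next(squares_gen)            # computes only the first value
calls_after_first = list(calls)

remaining = list(squares_gen)        # computes all the remaining values

print(first, remaining, calls)
```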

Hope this helps!

Top comments (2)

orenovadia

Wish I had seen this when I was just starting with Python.

Trying to iterate twice over an exhausted generator got me many times...

Dandy Vica

Thanks, it was meant exactly for that purpose!