Data is often organized using tree like structures; JSON, XML, and File Directories are good examples of this. In this post, I would like to talk about how the keyword yield
(in combination with recursion) can be used to iterate easily through complex1 tree like structures.
1 By complex, I mean multi-typed and messy.
The yield
Statement
For those who are not familiar with yield
statements in python, here's a quick explanation.
The yield
statement allows a function to return a value and then continue running. In practice, this means turning the function into an iterator. For example:
def foo():
n = 0
while True:
yield n
n += 1
my_iter = foo()
my_iter.next()
> 0
my_iter.next()
> 1
my_iten.next()
> 2
...
We can also use a for loop.
my_iter = foo()
for n in my_iter:
print(n)
> 0
> 1
> 2
...
Recursion
Recursion occurs when a function is executed inside of itself. This idea is especially helpful in relation to self similar structures, like trees. However, that function needs an exit point in order to actually be useful. Otherwise, it would run forever and return nothing. Let's look at some examples.
This function will run forever.
def foo(x):
return 1 + foo(x)
This function will not run forever. Notice that it has an exit point, which is used when x >= 5
.
def bar(x):
if x < 5:
return var(x+1)
else:
return x
This function prints all of the items of a nested array.
def print_nested_array(x):
if isinstance(x, list):
for item in x:
print_nested_array(item)
else:
print(x)
That one's a little more complicated; let's talk about it.
The first value that's passed through this function is probably going to be an array (since we called the function print_nested_array
). In that case, the function will loop through that array and pass each of its items back into the function.
For each item that is not an array, the function will print its string representation to the console. That's our exit point.
If one of those items is also an array, it will go through the same process the initial array went through. Each of its items will be passed back into the function.
Let's test it.
arr = [ [1,2], 3, [[4], 5, 6], 7]
print_nested_array(arr)
1
2
3
4
5
6
7
Using Recursion and yield
The yield
statement allows us to pass each of the items in our nested array back to the first instance of our function (the one that was not called using recursion), making it easier for us to access those values later on.
To implement this, we'll only need to do few things to the previous function we used (shown below for convenience).
def print_nested_array(x):
if isinstance(x, list):
for item in x:
print_nested_array(item)
else:
print(x)
First, we'll replace the print statement with a yield
statement.
Then, since our function now returns an iterator, we'll also have to handle that inside our for loop. In this case, instead of just calling our function again, we'll use another for loop to iterate through the iterator it returns. Each of the items in that loop will also be yielded.
def parse_array(x):
if isinstance(x, list):
for item in x:
for result in parse_array(item):
yield result
else:
yield x
Let's test it.
arr = [
"foo",
"bar",
["asdf", 34],
[[], [1, 2]],
[3, 4, 5, 6, 7],
True
]
for item in parse_array(arr):
print(item)
foo
bar
asdf
34
1
2
3
4
5
6
7
True
We could have done that with our previous function. However, we can also cast the iterator the function returns into a list.
list( parse_array(arr) )
["foo", "bar", "asdf", 34, 1, 2, 3, 4, 5, 6, 7, True]
Let's modify the function so we can print a nicely formatted representation of the array.
def print_array(x):
if isinstance(x, list):
for item in x:
for result in print_array(item):
yield " "+str(result)
else:
yield str(x)
for row in print_array(arr):
print(row)
foo
bar
asdf
34
1
2
3
4
5
6
7
True
What if our array also had dictionaries in it? What if it was a dictionary? We can add an elif
statement to handle that.
def parse_something(x):
if isinstance(x, list):
for item in x:
for result in parse_something(item):
yield result
elif isinstance(x, dict):
for key in x:
for result in parse_something(x[key]):
yield result
else:
yield x
To print a nicely formatted version of our array/dictionary, we could use this function.
def print_something(x):
if isinstance(x, list):
for item in x:
for result in print_something(item):
yield " "+str(result)
elif isinstance(x, dict):
for key in x:
yield ""
yield key+":"
for result in print_something(x[key]):
yield " "+str(result)
else:
yield str(x)
Let's test it.
my_dict = {
"key1": 1,
"key2": [1, 2, {"foo": 42}],
"bar": {
"a": 1234,
"b": 5678
}
}
print_something(my_dict)
key1:
1
key2:
1
2
foo:
42
bar:
a:
1234
b:
5678
Conclusion
The combination of recursion and the yield
statement is pretty powerful. I'm definitely going to keep it in the back of my head when I'm programming.
I hope you've also found this post to be helpful (or at least a little interesting).
— jsimonrichard
Top comments (0)