I love ❤️ generators in PHP. They are like supercharged arrays that can save memory when used correctly. I've been type-hinting iterable instead of array ever since I learned about them.
Generators are callback iterators
Generators are simple functions. But where a regular function returns a single value (or nothing at all, with void), a generator can produce multiple results. All you have to do to turn a function into a generator is replace return with yield and call it.
A generator is an iterable, meaning you have to loop over it to retrieve the results. You can simply foreach over a generator, and it will hand you every value it yields, one at a time.
function exampleGenerator() {
    yield 1;
    yield 2;
    yield 3;
}
$generator = exampleGenerator();
foreach ($generator as $value) {
    echo $value;
}
// will echo out: 123
Notice that we actually call the function to return the generator. In this example it's pretty obvious we need to do that, but consider an anonymous function that is stored in the $generator variable. You might accidentally try to iterate over that.
$generator = function() {
    yield 1;
    yield 2;
    yield 3;
};
// Incorrect: $generator is still an uncalled function (a Closure), not a generator.
foreach ($generator as $value) { /* ... */ }
// Correct: $generator() returns a `Generator` object.
foreach ($generator() as $value) { /* ... */ }
Advantages of generators over arrays
While creating a function that yields 1, 2, 3 is very impressive, it's not really practical. So let's look at some reasons why you might consider using generators.
They are called when you start iterating
This might not seem like a big deal, but it actually is. Consider a ChoiceField object that takes array $options, where the options have to be retrieved from a database. When the field is rendered, it obviously needs to show those options. But even when the field isn't rendered in that request, the database call is still performed just to instantiate the field.
When you change array $options into iterable $options and provide the options via a generator, the database call will only ever be executed if you foreach over those options.
$options = function() {
    foreach (DB::query('retrieve the options') as $option) {
        yield $option;
    }
};
$field = new ChoiceField($options());
So calling the function only returns the generator, but it will not execute until you start iterating.
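The deferred execution is easy to verify with a small log. This is a minimal sketch: the $loadOptions closure and the hard-coded option list are stand-ins I made up for the database call, not part of any real API.

```php
<?php

$log = [];

// Hypothetical stand-in for the database call; it records when it actually runs.
$loadOptions = function () use (&$log) {
    $log[] = 'query executed';
    yield from ['red', 'green', 'blue'];
};

$options = $loadOptions();       // Only creates the Generator object.
$log[] = 'generator created';

$collected = [];
foreach ($options as $option) {  // The function body starts running here.
    $collected[] = $option;
}

// $log is now ['generator created', 'query executed'].
```

The log shows that the body of the closure did not run when it was called, only when the foreach asked for the first value.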
Tip: If you already have an iterable result set, like an array or any other iterable, you can use yield from $results. In essence, this will foreach over all the results and yield every one of them.
// Use `yield from` instead of looping the results.
$options = (function() {
    yield from DB::query('retrieve the options');
})(); // Notice we called the function directly to return the generator.
// Or shorthand
$options = (fn() => yield from DB::query('retrieve the options'))();
They preserve memory
Besides not performing any work until you iterate, a generator only yields one result at a time. That means only the current result has to be kept in memory, instead of the whole list at once.
$options = (function() {
    $results = DB::query('retrieve the options');
    foreach ($results as $result) {
        // This way there is only one `Option` in memory at all times.
        yield Option::createFromResult($result);
    }
    unset($results);
})();
In this example we retrieve a simple result set from a database query. Only when we yield a result do we build the Option model that represents it. Compared to building every Option up front and returning them all in an array, this can save a lot of memory.
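To make the difference visible without relying on memory measurements, the sketch below counts how many model instances are alive at the same time. TrackedOption, lazyOptions() and the generated rows are hypothetical names for this illustration; typed static properties assume PHP 7.4 or later.

```php
<?php

// Hypothetical model that tracks how many instances exist at the same time.
final class TrackedOption
{
    public static int $alive = 0;
    public static int $peak = 0;
    public string $label;

    public function __construct(string $label)
    {
        $this->label = $label;
        self::$alive++;
        self::$peak = max(self::$peak, self::$alive);
    }

    public function __destruct()
    {
        self::$alive--;
    }
}

function lazyOptions(array $rows): Generator
{
    foreach ($rows as $row) {
        yield new TrackedOption($row);
    }
}

$rows = array_map('strval', range(1, 1000));

// Lazy: each object can be freed as soon as the loop moves on.
foreach (lazyOptions($rows) as $option) {
    // work with $option
}
unset($option);
$lazyPeak = TrackedOption::$peak; // Stays small, no matter how many rows there are.

// Eager: every object exists at once inside the array.
TrackedOption::$peak = 0;
$eager = array_map(fn (string $row) => new TrackedOption($row), $rows);
$eagerPeak = TrackedOption::$peak; // One instance per row.
```

The lazy peak is independent of the number of rows, while the eager peak grows with it.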
Code can be executed after returning the results
You might have noticed that we casually called unset($results) after we yielded the results. This works because the generator keeps going until it no longer yields any results, unlike a return statement, where the function ends immediately. That's pretty awesome. This way you can even clean up some leftover memory consumption after your generator finishes.
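A small sketch of that order of execution. The function name numbersWithCleanup() and the $events log are made up for this illustration.

```php
<?php

$events = [];

// Hypothetical generator that logs when its cleanup code runs.
function numbersWithCleanup(array &$events): Generator
{
    $buffer = [1, 2, 3];
    foreach ($buffer as $number) {
        yield $number;
    }
    // Reached when the consumer asks for another value after the last yield.
    unset($buffer);
    $events[] = 'cleanup';
}

foreach (numbersWithCleanup($events) as $number) {
    $events[] = "got $number";
}

// $events is now ['got 1', 'got 2', 'got 3', 'cleanup'].
```

Keep in mind that the cleanup lines are only reached when the consumer iterates all the way to the end. If the loop breaks early, the code after the last yield never runs.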
Keys can be reused
When you yield a result, it implicitly gets a numeric, 0-based key. You can however yield both a key and a value by using the => arrow.
// Without keys.
function fruits() {
    yield 'apple';
    yield 'banana';
    yield 'peach';
}
foreach (fruits() as $key => $fruit) {
    // Here $key will be 0, 1, 2
}
// With keys (renamed so both functions can live in the same file).
function keyedFruits() {
    yield 'zero' => 'apple';
    yield 'one' => 'banana';
    yield 'two' => 'peach';
    yield 'two' => 'lime';
}
foreach (keyedFruits() as $key => $fruit) {
    // Here $key will be 'zero', 'one', 'two', 'two'
}
Notice how we yielded the same key twice? Unlike with an array, this is no problem during the iteration. However, if you were to change the generator back into an array by using iterator_to_array(), the key would be there only once, holding the last result for that key.
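The sketch below shows that overwrite, plus the second parameter of iterator_to_array(), which discards the yielded keys so no value gets lost. The generator name is only for illustration.

```php
<?php

function duplicateKeyFruits(): Generator
{
    yield 'zero' => 'apple';
    yield 'one' => 'banana';
    yield 'two' => 'peach';
    yield 'two' => 'lime';
}

// Keys are preserved by default, so 'lime' overwrites 'peach'.
$withKeys = iterator_to_array(duplicateKeyFruits());
// ['zero' => 'apple', 'one' => 'banana', 'two' => 'lime']

// Passing false as the second argument re-indexes the values instead.
$withoutKeys = iterator_to_array(duplicateKeyFruits(), false);
// ['apple', 'banana', 'peach', 'lime']
```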
Things to consider when using generators over arrays
While generators behave very similarly to arrays, they are not of the array type. This means you can run into the following caveats.
Array functions will not work with generators
PHP's array_ functions all require an actual array. So you cannot, for example, simply call array_map() with your generator. To remedy this, you can use iterator_to_array() to turn your generator into an array. This will however reintroduce the memory usage of arrays.
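If you want to keep things lazy, one option is to write the mapping as a generator yourself. The mapIterable() helper below is a hypothetical function sketched for this purpose, not something PHP ships with. The eager variant via iterator_to_array() is shown next to it for comparison.

```php
<?php

// Hypothetical helper: a lazy counterpart to array_map() for any iterable.
function mapIterable(callable $callback, iterable $items): Generator
{
    foreach ($items as $key => $item) {
        yield $key => $callback($item);
    }
}

function sampleNumbers(): Generator
{
    yield 1;
    yield 2;
    yield 3;
}

$double = fn (int $n): int => $n * 2;

// Lazy: nothing is computed until $doubled is iterated.
$doubled = mapIterable($double, sampleNumbers());
$lazyResult = iterator_to_array($doubled);

// Eager: convert to an array first, then use array_map() as usual.
$eagerResult = array_map($double, iterator_to_array(sampleNumbers()));
```

Both variants end up with the same values. The lazy one only materializes them when something iterates over $doubled.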
Tip: You might use iterator_apply() to perform a callback on the yielded results, but this is not recommended, as this function returns neither an iterator nor any of the results. It only invokes a callback for every iteration, and the callback doesn't receive the result. You have to provide the iterator as an argument yourself, and can then retrieve the current() value from it. It's not worth it.
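For completeness, this is roughly what that looks like. It's a sketch to show the awkwardness, with a made-up letters() generator. Note that the iterator has to be passed in again via the arguments array, and the callback has to return true to keep going.

```php
<?php

function letters(): Generator
{
    yield 'a';
    yield 'b';
    yield 'c';
}

$iterator = letters();
$seen = [];

$iterations = iterator_apply(
    $iterator,
    function (Iterator $it) use (&$seen): bool {
        // The callback does not receive the value; we fetch it ourselves.
        $seen[] = strtoupper($it->current());

        return true; // Anything other than true stops the iteration.
    },
    [$iterator] // Arguments handed to the callback.
);

// $seen holds the uppercased letters; $iterations is the number of iterations.
```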
The count of a generator is not predefined
Since we can yield as many results as we want, and the generator only holds one result at a time, it's not possible to count the results without traversing them. To ease this process you can use iterator_count(). This will loop over every result and return the actual count.
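A minimal example, using a made-up threeItems() generator:

```php
<?php

function threeItems(): Generator
{
    yield 'a';
    yield 'b';
    yield 'c';
}

// Runs the generator to the end and returns how many values it yielded.
$count = iterator_count(threeItems());
```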
A Generator instance can only be traversed once
When a generator finishes, it closes itself. Once this happens, you can't traverse it again. When you try to do so, you will run into this exception: Cannot traverse an already closed generator.
A solution to this could be to call the generator function again. However, you should probably refactor your code to prevent this.
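The sketch below triggers that exception and then shows the workaround of calling the generator function again. The onceOnly() name is only for illustration.

```php
<?php

function onceOnly(): Generator
{
    yield 1;
    yield 2;
}

$generator = onceOnly();

$first = [];
foreach ($generator as $number) {
    $first[] = $number;
}

// A second pass over the same Generator object fails.
$error = null;
try {
    foreach ($generator as $number) {
        // Never reached.
    }
} catch (Exception $exception) {
    $error = $exception->getMessage();
}

// Calling the function again creates a fresh Generator.
$second = iterator_to_array(onceOnly());
```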
Note: iterator_count() also closes the iterator, so you can't do a count and then loop. You should probably just keep a record of the count while iterating.
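Keeping that record can be as simple as a counter inside the loop. A minimal sketch, with a made-up rows() generator standing in for your real data:

```php
<?php

function rows(): Generator
{
    yield 'first';
    yield 'second';
    yield 'third';
}

$count = 0;
$processed = [];

foreach (rows() as $row) {
    $processed[] = strtoupper($row); // Do the actual work...
    $count++;                        // ...and count in the same pass.
}
```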
In conclusion
Obviously arrays have their time and place. I'd never use a generator to create a simple list. But whenever I'm working with objects or entity models, I like to use generators to limit the memory usage.