Roman Sarder

Posted on Sep 27, 2021

The saga of async JavaScript: Generators

#javascript #programming #webdev #patterns

Intro

One of the most complex things in modern JavaScript programs is asynchronicity. We have already taken a look at a couple of existing patterns such as Callbacks, Thunks, and Promises. Although they managed to solve a few key problems, all of these patterns have one major thing in common - they don't look like synchronous code. There has always been a gap between how we write our asynchronous code and how we reason about it. Closing that gap might sound like an unrealistic thing to wish for, but time proved that we can get really close to it.

What we will learn

In today's article, we will talk about Generators. It's a new type of function introduced in ES6. At first glance, it will not be immediately obvious how it has anything to do with asynchronous programming. It will most likely seem weird to many of you. But as we slowly go through explanations and examples, we will eventually get to the point where it completely makes sense why we need them in our code. You will discover what makes Generators really stand out and what problems they solve for us. In the end, hopefully, you will be able to talk about Generators with confidence and justify their usage in your code.

Run-to-completion semantics

All normal functions in JavaScript have a notable feature in common. When writing our synchronous code, we know that when our function starts executing it will always run to the end and finish before any other function gets a chance to execute. At any given moment only one function is able to actively execute. That also means that nothing can pre-emptively interrupt our functions to run something else. The academic term that describes all of the above is run-to-completion semantics. This is what lets us not worry about two functions interrupting each other or corrupting our shared memory. By having this "rule" in JavaScript we are able to reason about our code in a pure single-threaded fashion.
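To make the rule concrete, here is a minimal sketch. The `busyWork` function and the `log` array are hypothetical names introduced only for this illustration. Even though the timer is scheduled with a delay of 0, its callback cannot cut into the function that is currently running.

```javascript
const log = [];

function busyWork() {
  log.push('busyWork: start');

  // Ask for a callback "as soon as possible".
  setTimeout(() => log.push('timer callback'), 0);

  // Simulate some synchronous work. The timer cannot fire in the middle of it.
  let total = 0;
  for (let i = 0; i < 1e6; i++) total += i;

  log.push('busyWork: end');
  return total;
}

busyWork();
log.push('after busyWork');

// At this point log is:
// ['busyWork: start', 'busyWork: end', 'after busyWork']
// 'timer callback' is appended only later, once the current script has finished.
```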

Generators are not like that

Generators are a very different type of thing. They don't follow this run-to-completion rule at all. On the surface, that should have brought quite a bit of chaos into our code. But it turns out that they provide yet another way to solve our problems, although the way itself might look a bit strange. One way to explain Generators would be to say that in current JavaScript they let us define a state machine - a series of transitions from one state to another, with the ability to list those transitions declaratively. I am sure that most of you have created quite a few state machines without even knowing that this is what they are called. Previously, implementing state machines with the tools available in JavaScript took a lot of time and effort. We often used a closure to maintain the current and previous state in a function that made all of those transitions, but the code was getting complex, and writing it was time-consuming as well. Generators add syntactic sugar which lets you solve the same problem in a much easier and clearer way. But how does that help with async code? To get there, we first need to get a good grasp on the internal plumbing of Generators.
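As a rough illustration of the closure-based approach described above, here is a small hand-written state machine. The traffic light, the `createTrafficLight` function and the `transitions` table are all hypothetical and exist only for this sketch.

```javascript
// Hypothetical traffic light: green -> yellow -> red -> green ...
function createTrafficLight() {
  const transitions = { green: 'yellow', yellow: 'red', red: 'green' };
  let current = 'green'; // state kept alive by the closure

  return {
    // Report the current state and move on to the following one.
    next() {
      const previous = current;
      current = transitions[current];
      return previous;
    },
  };
}

const light = createTrafficLight();
const seen = [light.next(), light.next(), light.next(), light.next()];
// seen: ['green', 'yellow', 'red', 'green']
```

Even in this tiny case the state and the transition logic have to be managed by hand, which is the bookkeeping the paragraph above refers to.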

Pausing with yield

Generators introduce a new keyword called yield and it acts a lot like a pause button. When the generator function is running and comes across a yield keyword, it demonstrates an interesting behavior. It does not matter where this yield is encountered. It might even be in the middle of an expression, but the generator will pause. From that point nothing will happen in the generator itself, it will stay completely blocked. It literally gets frozen. The important part is that the overall program itself is not blocked and can continue running. The block caused by yield is completely localized. And the generator can stay in this "paused" state indefinitely, until somebody comes and tells it to continue running. You can think of a Generator as a function that can pause and resume as many times as necessary without losing any internal state.

An example

We now have to take a look at an example of a Generator to see how all of these concepts fit together. Here is our first generator:

function* helloWorldGenerator() {
  console.log('Hello world');
  yield; // pausing
  console.log('Hello again!')
}

On line 1, the asterisk symbol tells JavaScript that the function we are defining is indeed a generator. You will notice on line 3 we have our yield keyword which is our pause button. By using yield, the generator itself declares when, where, and in which manner it wants to pause. This is also called cooperative multitasking. Nobody on the outside can come in and interrupt its execution. Such pre-emptive interruption is what often causes catastrophes in multi-threaded languages. Fortunately, we don't have that problem here.

Calling a Generator

When we call a Generator, it behaves a bit differently from other functions. Continuing with the example above, let's illustrate how we could use that generator:

const iterator = helloWorldGenerator();

iterator.next() // Hello world
iterator.next() // Hello again!

When we call the generator function, no code gets executed inside the generator itself. What's really happening is that we are getting an iterator. You probably know what iterators are, but just in case let's recall their definition. An iterator is a way of stepping through a set of data one result at a time. In this case, the purpose of the iterator is not to step through a collection of items, but to control our generator from the outside by literally stepping through these yield statements. Think of it as a handy API that helps us to control the flow of our generator. We can't pause a generator, but using an iterator we can ask it to run until it wants to pause itself. So on the first line none of the generator's code runs, but the first iterator.next() call starts the generator's execution. It will then execute the console.log('Hello world') statement, pause itself on yield and return control back to the client's code. Whenever the next call to .next happens, it will resume the generator, execute the last console.log('Hello again!') statement and at this point, our generator is done.
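The order of events can be made visible with a small sketch. The `stepsGenerator` function and the `events` array are hypothetical names used only here. They record what runs when, so we can see that creating the iterator executes nothing and that the outside code keeps running while the generator is paused.

```javascript
const events = [];

function* stepsGenerator() {
  events.push('step 1');
  yield; // pausing
  events.push('step 2');
}

const steps = stepsGenerator();
events.push('iterator created'); // nothing inside the generator has run yet

const first = steps.next();      // runs up to the yield, then pauses
events.push('between calls');    // outside code runs while the generator is paused
const second = steps.next();     // resumes after the yield and runs to the end

// events: ['iterator created', 'step 1', 'between calls', 'step 2']
// first:  { value: undefined, done: false }
// second: { value: undefined, done: true }
```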

Yielding values

It appears that in addition to yielding control to our code, generators are also able to yield values as well. In our previous example, we yielded nothing. Let's come up with a dummy example to showcase this point:

function* authorDossierGenerator () {
  const author = {
    name: "Roman",
    surname: "Sarder",
    age: 23,
  }

  yield author.name;
  yield author.surname;
  yield author.age;
}

const iterator = authorDossierGenerator();
iterator.next() // { value: "Roman", done: false }
iterator.next() // { value: "Sarder", done: false }
iterator.next() // { value: 23, done: false }
iterator.next() // { value: undefined, done: true }

In the previous example the generator yielded undefined, but now we are yielding actual values. You will notice that each .next call gives us an object with value and done properties. The value corresponds to what we are yielding from the generator, in this case, it's a bunch of object property values. The done flag indicates whether the generator is complete or not. This might be tricky at the beginning. After our third iterator.next call it might look like the generator is already done, but it is not. Although it is the last line in the generator, what really happens is that the generator is paused on the last expression which is yield author.age. If it is paused, it can be resumed and that's why only after the fourth .next we are getting done: true. But what about the last value being undefined? As with simple functions, if there is no return statement at the end of the generator, JavaScript assumes that it returns undefined. At any point, you can return from a generator and it will immediately complete itself as well as return a value if any. Think of return as an "Exit" button.
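Here is a small sketch of the "Exit" button in action. The `earlyExitGenerator` function and its `limit` parameter are hypothetical and only serve this illustration. Once return is hit, the result carries the returned value together with done: true, and the remaining yields are never reached.

```javascript
function* earlyExitGenerator(limit) {
  for (let i = 1; i <= 3; i++) {
    if (i > limit) return 'stopped early'; // the "Exit" button
    yield i;
  }
  return 'finished';
}

const exitIterator = earlyExitGenerator(1);
const a = exitIterator.next(); // { value: 1, done: false }
const b = exitIterator.next(); // { value: 'stopped early', done: true }
const c = exitIterator.next(); // { value: undefined, done: true }
```

Calling .next on a generator that has already completed is harmless: it keeps answering with value: undefined and done: true.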

Passing values

We managed to illustrate that there is indeed a way for a generator to pass messages to the client's code. But not only can we yield messages out: when calling the .next method we can also pass a message in, and that message goes right into the generator.

function* sumIncrementedNumbers () {
  const x = 1 + (yield);
  const y = 1 + (yield);
  yield x + y
}

const iterator = sumIncrementedNumbers();

iterator.next() // { value: undefined, done: false } 
iterator.next(5) // { value: undefined, done: false }
iterator.next(2) // { value: 9, done: false }
iterator.next() // { value: undefined, done: true }

Notice that we placed our yield keywords in the middle of both expressions. From the inside perspective, think of those yields as question marks. When the generator gets to the first expression it basically asks a question: Which value should go here? Without an answer, it cannot complete an expression. At this point, it will pause itself and wait for somebody to provide this value. And we do that by calling .next and passing a value of 5. Now it can proceed to the next yield. Those yields act like placeholders for values that will at some point in time be passed to the generator and replace yield to complete an expression.
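Messages can flow in both directions through the same yield. The sketch below uses a hypothetical `runningTotal` generator that hands the current total out and waits for the next number to come in. Note that the very first .next call only starts the generator. There is no paused yield to receive a value yet, so any argument passed to it is discarded, which is why the first call in the example above takes no argument.

```javascript
function* runningTotal() {
  let total = 0;
  while (true) {
    // Hand the current total out, then wait for the next number to come in.
    const received = yield total;
    total += received;
  }
}

const totals = runningTotal();
const r0 = totals.next();   // { value: 0, done: false }  starts the generator
const r1 = totals.next(5);  // { value: 5, done: false }
const r2 = totals.next(10); // { value: 15, done: false }
```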

Converting to async

Right now, you should be ready to look at the following example without having your mind completely blown. We are going to attempt to use Generators to work with asynchronous code and convert one of our previous examples. It might look a bit awful, because getData reaches out to an iterator variable that is only declared further down, but think of it as a proof of concept. We will surely refactor it into something that looks a lot nicer.

function getData (number) {
  setTimeout(() => {
    iterator.next(number);
  }, 1000)
}

function* sumIncrementedNumbersAsync() {
  const x = 1 + (yield getData(10));
  const y = 1 + (yield getData(20))

  console.log(x + y) // 32
}

const iterator = sumIncrementedNumbersAsync();
iterator.next();

Phew, are you still there? Let's walk through each line of code to get an idea of what's happening. First, we call our generator to produce an iterator and start execution by calling .next. So far so good, no rocket science involved. Our generator starts calculating the value of x and encounters the first yield. Now the generator is paused and asks a question: What value should go here? The answer will come from the getData(10) function call. Here comes the interesting part: our homemade getData function, which is a fake async function, resumes the generator once it is done calculating the value. Here it is just a setTimeout, but it could be anything. So after 1000 milliseconds, our fake getData gives us a response and resumes the generator with the value of the response. The next yield getData(20) is processed in a similar way. What we get here is synchronous-looking asynchronous code. Our generator is now able to pause itself and resume when the async value is calculated, in exactly the same manner as it did with synchronous values. That's a huge deal.

The magic key

Because the generator employs this pause/resume mechanism, it is able to block itself, wait for some background process to finish, and then resume with the value we were waiting for. Abstract yourself from the implementation details because they will be hidden in a library most of the time. What matters is the code inside the generator itself. Compare that to what we have seen in code using Promises. Promises' flow control organizes callbacks vertically into a chain. Think about Callbacks and Thunks - they nest those same callbacks. Generators bring their own flow control as well. But the very special feature of this flow control is that it looks completely synchronous. The async and sync code sit next to each other on equal terms. We neither see any difference nor have to think about organizing our async code in a different fashion anymore. Asynchronicity itself is now an implementation detail that we do not care about. This is possible because Generators introduced a syntactic way to hide the complexity of state machines, in our case, an asynchronous state machine. You are also getting all of the benefits of synchronous code, like error handling. You are able to handle errors in your async code in the same manner, using try-catch blocks. Isn't that beautiful?
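The try-catch claim can be sketched with the standard .throw method of the iterator, which lets the code driving a generator deliver a failure at the point where the generator is paused. The `guardedGenerator` function and the error message are hypothetical. In real async code the driver would call .throw when a background operation fails.

```javascript
function* guardedGenerator() {
  try {
    const value = yield 'waiting for a value';
    return `got ${value}`;
  } catch (error) {
    // An error thrown in from the outside lands here, like a synchronous throw.
    return `recovered from: ${error.message}`;
  }
}

const guarded = guardedGenerator();
const paused = guarded.next();
// paused: { value: 'waiting for a value', done: false }

const outcome = guarded.throw(new Error('request failed'));
// outcome: { value: 'recovered from: request failed', done: true }
```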

Purging the IOC

As you look at this example more carefully, you might notice that there is one problem with this approach. Our getData function is taking control of executing our generator, which leads us to Inversion Of Control. This function could call the .next method on our generator in an unexpected way and mess everything up, and the current codebase has no solution to it. Guess what? We are not afraid of this previously terrifying problem anymore. We just need to recall which pattern has already solved this issue for us. We are going to mix Promises together with Generators! And for this union to happen, instead of yielding undefined we have to yield a promise.
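To see how an unexpected .next call can mess everything up, consider the sketch below. Everything in it is hypothetical: `untrustedGetData` is a deliberately misbehaving data source that resumes the generator twice per request, and `pendingCallbacks` is a hand-rolled queue standing in for setTimeout so that the example stays deterministic.

```javascript
// Callbacks are queued and flushed by hand instead of using real timers.
const pendingCallbacks = [];
let result;

// A misbehaving data source: it resumes the generator twice per request.
function untrustedGetData(number) {
  pendingCallbacks.push(() => {
    sumIterator.next(number);
    sumIterator.next(number); // the generator never asked for this
  });
}

function* sumWithUntrustedSource() {
  const x = 1 + (yield untrustedGetData(10));
  const y = 1 + (yield untrustedGetData(20));
  result = x + y;
}

const sumIterator = sumWithUntrustedSource();
sumIterator.next();

// Flush the queue the way an event loop would, one callback at a time.
while (pendingCallbacks.length > 0) {
  pendingCallbacks.shift()();
}

// result is 22, not the expected 32: the duplicate call answered the
// second yield with 10 before the real response (20) arrived.
```

The generator has no way to protect itself here, because whoever holds the iterator decides when and with what value it is resumed.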

The ultimate duo

Let's imagine how we could make this work. We've already said that inside of our generator we need to yield a promise. But who will take care of waiting for that promise? Well, that would be done by the code that drives the generator, the code that calls .next. Once it gets a promise, it has to wait for the promise to resolve and then resume the generator. We need an additional abstraction that will do this for us, and most likely it will be provided by a framework, a library, or JavaScript itself. Reinventing the wheel each time you want to work with promisified generators is unlikely to be practical. But for educational purposes, we will figure one out ourselves and study it.

Building our Promises Generator runner

I am going to provide you with an implementation of such a generator runner. Obviously, it lacks some of the features that are absolutely required if you want to use it in production, such as proper error handling, but it covers our needs and demonstrates the concept perfectly while keeping things rather simple.

function runner (generatorFunction) {
  const iterator = generatorFunction();

  function nextStep(resolvedValue) {
    const { value: nextIteratorValue, done } = iterator.next(resolvedValue);

    if (done) return nextIteratorValue;

    return nextIteratorValue.then(nextStep)
  }

  return Promise.resolve().then(nextStep)
}

Our runner takes a generator function and produces an iterator as usual. Then it returns a Promise that starts out resolved, and we pass our worker function nextStep to its .then method. The worker does the whole job of getting the next iterator value and checking if the generator is done. If not, we are assuming that the result of the .next call was a Promise. So we return a new Promise ourselves by waiting for the iterator value Promise to resolve and passing the resolved value back to our worker function. The worker passes that value into the iterator, if it needs one, and repeats its job until the generator is done. Nothing really complicated.
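One way the missing error handling could look is sketched below. This is an assumption about a possible extension, not the runner used in the rest of the article: `runnerWithErrors` and `recoveringGenerator` are hypothetical names. A rejected Promise is forwarded into the generator with the standard .throw method, so a try-catch inside the generator can deal with it.

```javascript
function runnerWithErrors(generatorFunction) {
  const iterator = generatorFunction();

  function handle({ value, done }) {
    if (done) return value;

    // Promise.resolve lets the generator yield plain values as well as Promises.
    return Promise.resolve(value).then(
      resolved => handle(iterator.next(resolved)),
      error => handle(iterator.throw(error)) // rethrow inside the generator
    );
  }

  // Starting inside .then turns a synchronous throw into a rejection.
  return Promise.resolve().then(() => handle(iterator.next()));
}

function* recoveringGenerator() {
  try {
    yield Promise.reject(new Error('network down'));
  } catch (error) {
    return `fallback after: ${error.message}`;
  }
}

const recovered = runnerWithErrors(recoveringGenerator);
recovered.then(message => console.log(message)); // logs "fallback after: network down"
```

If the generator does not catch the error, .throw raises it inside the rejection handler, and the Promise returned by the runner is rejected with it.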

Working with our Generator Runner

We are going to further modify our sumIncrementedNumbers example to incorporate our new runner and take a look at how we consume a promisified generator.

function getData (data) {
  return new Promise((resolve, reject) => {
    setTimeout(() => {
      resolve(data);
    }, 1000)
  })
}

function* sumIncrementedNumbersAsync () {
  const x = 1 + (yield getData(10));
  const y = 1 + (yield getData(20));
  return x + y;
}

runner(sumIncrementedNumbersAsync).then(value => {
  console.log(value) // After ~2000ms prints 32
});

Everything here should already be familiar to you. Since our runner eventually results in a Promise, from the outside world's perspective our wrapped generator is nothing more than just another Promise. We have managed to solve the non-local, non-sequential reasoning problems by using Generators to make async code look like synchronous code. We have brought in Promises to do the dirty job of solving the Inversion Of Control issue and created our simple Promises Generator runner. Finally, we ended up with the clean interface of a Promise as a result, and all of the Promises' benefits apply to our wrapped generator. That's why Generators are so powerful. They completely change the way you write your asynchronous code. They finally give you the ability to write code that is intuitive for our brains and does not contradict the way we think.

Async/await ?

In fact, this pattern proved itself so useful that ECMAScript 2017 built it into the language by introducing async functions and the async/await keywords. Don't let the new syntax fool you: conceptually it is the same pause-and-resume idea that we have just built from generators and a Promise-based runner. The difference is that now it is a first-class citizen in our language with proper syntax support and we are not required to use any helper libraries to do this job anymore. But there are some caveats with how async/await works right now.
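For comparison, here is a sketch of the earlier sumIncrementedNumbersAsync example rewritten with async/await. The `getDataLater` helper is a hypothetical stand-in for getData with a shorter delay. The function* and yield pair becomes async and await, and no runner is needed.

```javascript
// Same fake data source as before, with a shorter delay.
function getDataLater(data) {
  return new Promise(resolve => setTimeout(() => resolve(data), 10));
}

async function sumIncrementedNumbers() {
  const x = 1 + (await getDataLater(10));
  const y = 1 + (await getDataLater(20));
  return x + y;
}

const sumPromise = sumIncrementedNumbers();
sumPromise.then(value => console.log(value)); // logs 32
```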

Pure generators vs async/await

How would you cancel an async function and stop it from further execution? The thing is that there is no built-in way to do so from the outside. Currently an async function just returns a Promise. That's cool and all, but the ability to cancel is too crucial to ignore. And the current implementation just does not give you enough tools for finer control of execution. I am not the one to judge their design decisions, but my point is that the API could be further improved to, for example, return both a promise and a cancel function. At the end of the day, we are working with generators that implement a pull interface. We are in control of how to consume an iterator. You could easily imagine how we could just stop consuming it in our runner if we received a cancel signal. To prove the point we can introduce a simple change to implement a very primitive cancel mechanism. And you could imagine somebody making a more sophisticated and error-proof variant with a rollback strategy.

function runner (generatorFunction) {
  let isCancelled = false;
  const iterator = generatorFunction();

  function nextStep(resolvedValue) {
    // Once cancelled, stop before resuming the generator again
    if (isCancelled) {
      return Promise.resolve();
    }

    const { value: nextIteratorValue, done } = iterator.next(resolvedValue);

    if (done) return nextIteratorValue;

    return nextIteratorValue.then(nextStep)
  }

  return {
    cancel: () => { isCancelled = true },
    promise: Promise.resolve().then(nextStep)
  }
}

This illustrates my point above. We are returning an object with both the Promise and a cancel method. The cancel method just sets a flag variable that is held in a closure. It is pretty neat and opens up a lot of possibilities for further enhancements.
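One possible building block for a more careful variant is the standard .return method of the iterator. It closes a paused generator from the outside and still runs any finally blocks on the way out, which is a natural place for cleanup or rollback code. The sketch below is only an assumption about how that could be used. The `taskWithCleanup` generator and the `cleanupLog` array are hypothetical.

```javascript
const cleanupLog = [];

function* taskWithCleanup() {
  try {
    yield 'step 1';
    yield 'step 2';
  } finally {
    // Runs when the generator finishes or is closed early via .return().
    cleanupLog.push('resources released');
  }
}

const task = taskWithCleanup();
const started = task.next();             // { value: 'step 1', done: false }
const closed = task.return('cancelled'); // { value: 'cancelled', done: true }
const afterClose = task.next();          // { value: undefined, done: true }

// cleanupLog: ['resources released']
```

A cancel method could call iterator.return() in addition to setting the flag, so that the generator gets a chance to clean up after itself.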

Outro

That was a lot of stuff to learn and discuss this time. But the topic itself is not an easy one, and 5 minutes of reading is not enough to get a grasp on it. I don't expect any of you to become generator experts just by completing this article, but I am pretty sure that I've given you a good start that will push you to further explore the topic yourself. With generators, it seems like we've answered each of our questions about async programming. We've solved Inversion of Control, we are now able to write synchronous-looking asynchronous code, and it looks like we've combined the best features from all of the previous patterns. But, as it often happens in Software Engineering, there is usually more than one possible answer to the same problem. From this point, the next patterns we see will just offer you other ways of solving problems, and each of them might be more or less suitable for your case. It's up to you as an engineer to make the final call. It will be completely fine if you quit at this point of the series, because for most of us this could be enough to know about asynchronous programming in JavaScript for now. But if you decide to stick with me, we are going to take a look at some of the advanced patterns like CSP and Observables. We will surely have a talk about one of them next time. Thank you for the long read!

Credits

Big thanks to Kyle Simpson and his materials. I was particularly inspired by his Asynchronous JavaScript course and it pushed me to deep dive into these topics a lot harder than I would have done normally.

Latest comments (8)

LadiesMan217 • Jan 7 '22

Great article

Roman Sarder • Jan 13 '22

Thank you! Glad to hear it

Philippe Kalitine • Sep 30 '21

Well written! Explain a subject, in a clear way with simple examples, that shows a different engineering point of view of how to approach problems it's a difficult task. Usually this rises many questions while reading, but you manage to anticipate most of the questions and provide short, but yet answers, thus pulling the reader back to the topic. Thanks for the article!

Roman Sarder • Sep 30 '21

Thank you, Philippe, for a detailed explanations. Such things are more important for me at this point than anything else

I am glad that you found my article valuable. Hopefully, I will be able to deliver more of those in future :)

Antonello Ceravola • Sep 29 '21 • Edited

Very nice article! Very good explanation of generators. If you like to see an alternative approach you may look at JSEN: github.com/HRI-EU/JSEN
Or a paper on that: ronpub.com/OJWT_2021v8i1n01_Ceravo...

Roman Sarder • Sep 29 '21

Thank you for the feedback, Antonello!
And for the resources :)
I appreciate that

Mike Talbot ⭐ • Sep 27 '21

Nice article :) I use Generators in js-coroutines in conjunction with idle time processing to provide async versions of common functions like JSON parse, sorts, filters etc.

Roman Sarder • Sep 27 '21

Hey, Mike
Thanks for the link and feedback. The library looks pretty interesting, never have seen it before.
Gonna take a look at source code when I have spare time!
You will probably like the article about CSP that I am going to write at some point in this series :)
