DEV Community

Hleb Bandarenka


Delving into the Black Magic of GraphQL DataLoader! 🌌✨

Who should read this?

If you've got GraphQL experience, encountered the N+1 problem, and used DataLoader to solve it but are still unclear on how it works, you're in the right spot.

Prerequisites:

  • Familiarity with the DataLoader pattern
  • Understanding of the Event Loop

Let's rock 🚀🤘🎸

When I began working with GraphQL, I had concerns about the N+1 query problem. In my research, I came across the DataLoader pattern and its implementation on GitHub. While I explored various examples of its usage, I still struggled to grasp how it operates internally. Join me in delving a bit deeper into GraphQL DataLoader! :)

I hope my readers are already familiar with how the Event Loop works. If not, I highly recommend checking out this insightful series of articles with excellent visualizations here.

For our current discussion, pay special attention to Part 2 - Bonus experiment. This experiment demonstrates that the nextTick operation triggered inside a Promise will execute after all other Promises have been completed.
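To make that ordering concrete, here is a minimal sketch that runs on plain Node.js without any libraries. The labels pushed into `order` are made up for illustration; the point is only that a `process.nextTick` callback scheduled inside a Promise callback runs after every pending Promise callback, including nested ones.

```javascript
// Records the order in which callbacks run.
const order = [];

Promise.resolve().then(() => {
  order.push("promise 1");
  // Scheduled from inside a Promise callback: waits for the microtask queue to drain.
  process.nextTick(() => order.push("nextTick inside promise 1"));
  // A nested Promise callback still runs before that nextTick.
  Promise.resolve().then(() => order.push("promise inside promise 1"));
});

Promise.resolve().then(() => order.push("promise 2"));

// setImmediate fires in a later event-loop phase, after both queues are empty.
setImmediate(() => console.log(order.join(" -> ")));
// promise 1 -> promise 2 -> promise inside promise 1 -> nextTick inside promise 1
```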

Why is this crucial? ❗️❗️❗️

Because DataLoader relies on exactly this ordering to perform its magic in the enqueuePostPromiseJob function. 🎩✨

code

// `enqueuePostPromiseJob` function
...
if (!resolvedPromise) {
  resolvedPromise = Promise.resolve();
}
resolvedPromise.then(() => {
  process.nextTick(fn);
});
...

When you create a DataLoader, you provide a BatchLoadFn as a constructor argument. The enqueuePostPromiseJob function serves as the default batchScheduleFn: it decides when the batch is dispatched, and therefore when the BatchLoadFn is invoked (the fn argument is dispatchBatch, which in turn calls the BatchLoadFn).

This process kicks off when you first invoke the load method on DataLoader (more precisely, on the first load of each new batch). That's when enqueuePostPromiseJob starts its job. 🤯
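The mechanism can be sketched with a toy loader. This is a simplified illustration, not the real DataLoader source: the `TinyLoader` class, its fields, and the `user-` values are hypothetical, and caching and error details are left out. Only the scheduling trick mirrors the `enqueuePostPromiseJob` snippet above.

```javascript
// Same scheduling trick as the snippet above: wait for pending Promise
// callbacks, then run fn on the next tick.
let resolvedPromise;
function enqueuePostPromiseJob(fn) {
  if (!resolvedPromise) {
    resolvedPromise = Promise.resolve();
  }
  resolvedPromise.then(() => {
    process.nextTick(fn);
  });
}

// Hypothetical, stripped-down loader (no caching, no options).
class TinyLoader {
  constructor(batchLoadFn) {
    this.batchLoadFn = batchLoadFn;
    this.batch = null; // batch currently collecting keys
  }

  load(key) {
    if (!this.batch) {
      // First load of a new batch: create it and schedule its dispatch.
      const newBatch = { keys: [], callbacks: [] };
      this.batch = newBatch;
      enqueuePostPromiseJob(() => this.dispatchBatch(newBatch));
    }
    const batch = this.batch;
    return new Promise((resolve, reject) => {
      batch.keys.push(key);
      batch.callbacks.push({ resolve, reject });
    });
  }

  dispatchBatch(batch) {
    this.batch = null; // later loads start a fresh batch
    new Promise((resolve) => resolve(this.batchLoadFn(batch.keys))).then(
      (values) => batch.callbacks.forEach(({ resolve }, i) => resolve(values[i])),
      (error) => batch.callbacks.forEach(({ reject }) => reject(error))
    );
  }
}

// Usage: one synchronous load and one load inside a Promise callback.
const batches = [];
const loader = new TinyLoader(async (keys) => {
  batches.push([...keys]);
  return keys.map((key) => `user-${key}`);
});

loader.load(1);
Promise.resolve().then(() => loader.load(2));

setImmediate(() => console.log(batches)); // [ [ 1, 2 ] ]
```

Both keys land in one batch because the dispatch is pushed behind every Promise callback that is already queued.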

Does that make sense? If it still sounds a bit unclear, I hope the diagram below helps clarify what I wanted to say.

Image description

What does it mean for us? 🤔

It means that DataLoader collects into a single batch all IDs loaded synchronously, inside nextTick callbacks, and inside Promise callbacks, even when a Promise callback is nested within another Promise. However, the batch doesn't include IDs loaded in a nextTick callback that was scheduled inside a Promise; those end up in the next batch.

P.S. This is true if the initial load was invoked synchronously; otherwise, all IDs will be collected.

Image description
Code on Github

index.js

const DataLoader = require("dataloader");
const db = require("./database");

// Create a batch loading function
async function batchLoadFunction(ids) {
  const results = await db.findAll(ids);

  // Return the results in the same order as the keys
  return ids.map((key) => results.find((result) => result.id === key));
}

// Create a new DataLoader instance
const dataLoader = new DataLoader(batchLoadFunction);

// Use the DataLoader to load data
(async () => {
  // 1. Sync calls
  const p1 = dataLoader.load(1);
  const p2 = dataLoader.load(2);
  Promise.all([p1, p2]).then((results) => {
    console.log(results);
  });

  // 2. Next tick calls
  process.nextTick(() => {
    console.log("next tick");
    const p3 = dataLoader.load(3);
    const p4 = dataLoader.load(4);
    Promise.all([p3, p4]).then((results) => {
      console.log(results);
    });
  });

  // 3. Promise calls
  Promise.resolve().then(() => {
    console.log("promise");
    const p5 = dataLoader.load(5);
    const p6 = dataLoader.load(6);
    Promise.all([p5, p6]).then((results) => {
      console.log(results);
    });

    // 4. Next tick inside promise
    process.nextTick(() => {
      console.log("next tick inside promise resolve handle");
      const p7 = dataLoader.load(7);
      const p8 = dataLoader.load(8);
      Promise.all([p7, p8]).then((results) => {
        console.log(results);
      });
    });

    // 5. Promise inside promise
    Promise.resolve().then(() => {
      console.log("promise inside promise resolve handle");
      const p9 = dataLoader.load(9);
      const p10 = dataLoader.load(10);
      Promise.all([p9, p10]).then((results) => {
        console.log(results);
      });
    });
  });
})();

Result:

next tick
promise
promise inside promise resolve handle
Querying ids: 1,2,3,4,5,6,9,10
next tick inside promise resolve handle
[ { id: 1, name: 'John', age: 25 }, { id: 2, name: 'Jane', age: 30 } ]
[ { id: 3, name: 'Bob', age: 35 }, { id: 4, name: 'Alice', age: 28 } ]
[ { id: 5, name: 'Mike', age: 32 }, { id: 6, name: 'Sarah', age: 27 } ]
[ { id: 9, name: 'Michael', age: 31 }, { id: 10, name: 'Sophia', age: 26 } ]
Querying ids: 7,8
[ { id: 7, name: 'David', age: 33 }, { id: 8, name: 'Emily', age: 29 } ]

Clearly, the next tick inside the promise ran only after the BatchLoadFn had already been invoked for the first batch. Consequently, the IDs from that nextTick (7 and 8) went into the second invocation of BatchLoadFn.

How can we use it?

Soooo, if we call DataLoader's load method inside Promises, the keys will still be batched together as anticipated. Now, we have a clear understanding of the reasons behind it. 😊🎉


I hope this post has shed some light on the subject. If you're keen on a more in-depth understanding, I encourage you to take a look at the source code yourself. At the very least, you now have a solid foundation of understanding.

Bonus

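How does this pattern operate in other languages, like Java? Unfortunately, the JVM has no built-in single-threaded event loop with a nextTick-style queue, so the library cannot tell on its own when all load calls for the current step have been made. As a result, the solution is not as elegant and necessitates manual dispatching.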

dataloader.load("A");
dataloader.load("B");
dataloader.load("A");

dataloader.dispatch(); // in Java you have to manually invoke `dispatch` function
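For comparison, the same manual-dispatch idea can be sketched in JavaScript. This is a hypothetical illustration rather than the Java library's implementation: the `ManualLoader` class and its de-duplication of repeated keys are assumptions. Nothing is fetched until `dispatch` is called explicitly.

```javascript
// Hypothetical loader that queues keys until dispatch() is called by hand.
class ManualLoader {
  constructor(batchLoadFn) {
    this.batchLoadFn = batchLoadFn;
    this.queue = [];
  }

  load(key) {
    return new Promise((resolve, reject) => {
      this.queue.push({ key, resolve, reject });
    });
  }

  dispatch() {
    const pending = this.queue;
    this.queue = [];
    if (pending.length === 0) return;
    // Assumption: repeated keys are requested only once per batch.
    const keys = [...new Set(pending.map((entry) => entry.key))];
    new Promise((resolve) => resolve(this.batchLoadFn(keys))).then(
      (values) => {
        const byKey = new Map(keys.map((key, i) => [key, values[i]]));
        pending.forEach((entry) => entry.resolve(byKey.get(entry.key)));
      },
      (error) => pending.forEach((entry) => entry.reject(error))
    );
  }
}

const seenBatches = [];
const manualLoader = new ManualLoader(async (keys) => {
  seenBatches.push(keys);
  return keys.map((key) => key.toLowerCase());
});

const a1 = manualLoader.load("A");
const b = manualLoader.load("B");
const a2 = manualLoader.load("A");

manualLoader.dispatch(); // the batch function runs only because of this call

Promise.all([a1, b, a2]).then((values) => console.log(seenBatches, values));
// [ [ 'A', 'B' ] ] [ 'a', 'b', 'a' ]
```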