GraphQL doesn't solve under & overfetching, Relay & WunderGraph does

When you look at positive articles about GraphQL,
you often hear how much better it is than REST: it supposedly solves overfetching and underfetching and eliminates request waterfalls.
But is that really true?

Let's explain the terms, look at the problems, and evaluate whether GraphQL lives up to all the praise for the Query language.

You'll learn that GraphQL does not solve the problems of overfetching, underfetching and waterfalls.
It's not even close to solving these problems.
Read on to understand why this is the case and how we can get the full benefits of GraphQL.

What is overfetching?

Overfetching is the problem of fetching more data than you need.
GraphQL advocates usually attribute this problem to REST APIs.
REST APIs are often designed to be generic and reusable across different use cases,
so it's not uncommon that a REST API returns more data than you actually need.

It's widely known that GraphQL has a native solution to this problem.
Through the use of the Selection Set, you can specify exactly which fields you need.
So, in theory, it's possible to solve the problem of overfetching with GraphQL.
However, we'll see later that this often remains theory and doesn't work in practice.
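
For illustration, this is what using the Selection Set looks like; the schema is hypothetical, mirroring the REST example below:

query DragonList {
  dragons {
    id
    name
    description
  }
}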

Furthermore, the claim that REST APIs cannot solve this problem is also not true.
You can use sparse fieldsets to achieve similar functionality in REST APIs.

Here's an example:

GET /api/dragons?fields=id,name,description

The problem with this approach is that it's not standardized and not widely used compared to Selection Sets in GraphQL.

What is underfetching?

In addition to overfetching, GraphQL is also said to solve the problem of underfetching.
Underfetching is the problem of not fetching enough data in a single request.
This problem is often attributed to REST APIs as well.

In order to load all the data you need to present a page,
you have to make a lot of requests,
simply because a single request doesn't return all the data you need.
This problem comes from the fact that REST APIs are usually too granular, which they have to be in order to be reusable.

When listening to GraphQL advocates, it seems like GraphQL solves this problem as well.
You can simply specify all the data you need in a single Query and the GraphQL server will return all the required data in a single response.
Again, this works in theory, but does it work in practice?
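
To make the claim concrete, here's a sketch of such a query, assuming a schema where users have posts and posts have comments:

query UserPage($id: ID!) {
  user(id: $id) {
    name
    posts {
      title
      comments {
        content
      }
    }
  }
}

The equivalent REST flow would typically need one request for the user, another for their posts, and one more per post for the comments.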

What is the problem with waterfall requests?

If we look at the problem of underfetching, what actually is the consequence of this problem?
How does it manifest itself in a real-world application?

From the user's perspective, it's quite undesirable to wait a long time until the page has loaded.
It's even worse when the page loads incrementally over a period of 10-15 seconds.

Imagine a website with a lot of nested components,
each building on top of each other.
In order to render the page, you have to fetch data for each component.
This means that you have to wait for the data of the first component to be fetched before you can fetch the data for the second component.
This results in a long "waterfall" of requests: a child component can only render once the data for its parent is available,
which in turn can only be fetched once the grandparent's data has arrived and that component has rendered.

The waterfall usually looks something like this:

  1. Render root component
  2. Fetch data for root component
  3. Render child components
  4. Fetch data for child components
  5. Render grandchild components
  6. Fetch data for grandchild components
  7. ...

This is a very simplified example, but it illustrates the problem quite well.
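
Here's a minimal sketch of this fetch-on-render pattern in React, with a hypothetical useFetch hook and hypothetical API routes; note how a child's request can only start after its parent's response has arrived:

import { useEffect, useState } from 'react';

// Hypothetical hook: fetches a URL once and returns the parsed JSON (null while loading).
function useFetch<T>(url: string): T | null {
  const [data, setData] = useState<T | null>(null);
  useEffect(() => {
    fetch(url)
      .then((res) => res.json())
      .then(setData);
  }, [url]);
  return data;
}

// Request 1 starts when User mounts. Posts can't mount (and fire request 2)
// until request 1 has resolved; add a grandchild and you get request 3, and so on.
function User({ id }: { id: string }) {
  const user = useFetch<{ id: string; name: string }>(`/api/users/${id}`);
  if (!user) return <p>Loading user...</p>;
  return <Posts userId={user.id} />;
}

function Posts({ userId }: { userId: string }) {
  const posts = useFetch<{ id: string; title: string }[]>(`/api/users/${userId}/posts`);
  if (!posts) return <p>Loading posts...</p>;
  return (
    <ul>
      {posts.map((post) => (
        <li key={post.id}>{post.title}</li>
      ))}
    </ul>
  );
}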

Luckily, with GraphQL we don't have the problems of waterfall requests, right?
Let's look at a few examples.

Real-world examples of GraphQL Underfetching, Overfetching & Waterfall requests

To better understand if the claims we hear about GraphQL are true,
let's look at a few real-world examples.

If you want to follow along, visit each site,
open the browser's network tab and have a look at the traffic.

Example 1: Reddit - 42 GraphQL requests to load a subreddit

Reddit is the first example we'll look at: the GraphQL subreddit.
We can count 42 requests including CORS preflight requests,
so that's 21 actual GraphQL requests.

Reddit uses persisted operations and sends the operation ID in the POST request body.
The waterfall is huge.

Example 2: Twitter - 6 GraphQL requests to load the home page

Twitter is quite advanced, actually. If we look at the home page,
we can see a few things.

The website makes 6 requests and uses persisted operations.
Read requests use GET, with the operation ID and variables sent in the query string.
Quite good, but still a waterfall of 6 requests.

Example 3: twitch.tv - 16 GraphQL requests to load the home page

Looking at Twitch, we can see that their home page makes 16 requests.
Twitch uses persisted operations and batching, so those 16 HTTP requests actually carry a total of 32 GraphQL Operations.

Example 4: Glassdoor - 3 GraphQL requests to load the member home page

Next up, we have the Glassdoor member home page.
In the network tab, we can see 3 HTTP POST requests, which are GraphQL requests without persisted operations,
but they use batching, so the 3 requests carry a total of 7 GraphQL Operations.

Example 5: Nerdwallet - 16 GraphQL requests to load the page for best credit cards

Another example, the Nerdwallet best credit cards page,
makes 16 requests to an endpoint they call "supergraph", a term that hints at a federated architecture.
No persisted operations, no batching, all HTTP POST over HTTP/3.

Example 6: Peloton - 3 GraphQL requests to load the bike page

Peloton's bike page makes 3 requests.
They send regular GraphQL requests over HTTP POST without persisted operations or batching.
I noticed that they have 2 different GraphQL endpoints.

Example 7: Priceline - 8 GraphQL requests to load a search page

The search page of Priceline makes 8 requests.
They don't use persisted operations and send GraphQL requests over HTTP POST without batching.

Example 8: Facebook - 10 GraphQL requests to load the home page

Facebook's home page makes 10 requests.
I would have expected fewer, but have a look yourself.
They are sending persisted operations over HTTP POST.

Summary of my findings in real-world GraphQL applications

To summarize my findings: I was unable to find a single website that eliminates the problem of underfetching and waterfall requests.
Given the number of GraphQL requests I've seen, it's also quite likely that many of these pages suffer from overfetching as well;
I'll explain this in more detail later.

GraphQL does not solve the problem of overfetching, underfetching and waterfall requests

Another observation is that each company uses a different style of GraphQL.
Some use persisted operations, some use batching, some use both, some use neither.
I found different ways of structuring URLs, some use query strings, some use the request body with HTTP POST.
There is no real standard, and you might be wondering why.

Why does GraphQL fail to solve underfetching and overfetching in practice?

What's often forgotten is that GraphQL is just a Query language.
It doesn't specify how the data is fetched.
The GraphQL specification doesn't say much about the transport layer,
it's really just a specification for the Query language itself.

So, in theory GraphQL has the capability to solve the problem of underfetching and overfetching,
but it takes more than just the Query language to solve these problems.
We need a client-server architecture combined with a workflow that encourages us to reduce the number of requests per page.

Let's break down the problem into smaller pieces.

GraphQL clients need to encourage the use of Fragments

When using GraphQL, there are two ways to specify the data you need.
You can either specify the data with a Query at the top of the component tree,
or you can use Fragments to specify the data at the component level.

Clients like Relay encourage the use of Fragments,
allowing deeply nested components to specify what data they need.
These fragments are then hoisted to the top of the component tree and combined into a single query.
This pattern is very powerful and comes with other benefits,
but most importantly, it allows us to keep components decoupled,
while still being able to fetch data efficiently with the least amount of requests.

I'll get back to Fragments later, but for now, let's look at the next problem.

GraphQL servers need to implement relationships between types instead of returning IDs

Coming from REST APIs, we tend to return plain IDs from GraphQL resolvers.
E.g. the user resolver could return the IDs of the user's posts.
However, this would not allow us to fetch a tree of data (user -> posts -> comments) with a single request.
Instead, we would have to make a request for each post and then for each comment.

To solve this problem, GraphQL servers need to implement relationships between types.
So, instead of returning IDs, resolvers should return relationships so that the GraphQL client can fetch deeply nested data with a single request.
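
As a sketch of the difference, assuming graphql-js style resolver maps and a hypothetical db data layer (all names below are illustrative):

// Hypothetical data access layer and types, for illustration only.
interface Post { id: string; title: string; }
interface Comment { id: string; content: string; }
declare const db: {
  postIdsByUser(userId: string): Promise<string[]>;
  postsByUser(userId: string): Promise<Post[]>;
  commentsByPost(postId: string): Promise<Comment[]>;
};

// Anti-pattern: the resolver returns IDs, so the client needs a
// follow-up request per post (and per comment) to get the actual data.
const resolversReturningIds = {
  User: {
    postIds: (user: { id: string }) => db.postIdsByUser(user.id),
  },
};

// Better: resolvers return the related objects themselves, so one query
// can traverse user -> posts -> comments in a single request.
const resolversReturningRelations = {
  User: {
    posts: (user: { id: string }) => db.postsByUser(user.id),
  },
  Post: {
    comments: (post: { id: string }) => db.commentsByPost(post.id),
  },
};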

GraphQL clients and servers need to support persisted operations

Sending lots of GraphQL Operations over HTTP POST has two problems.
First, it's inefficient: every client sends the same query over and over again, and the server has to parse it every single time.
That's not just costly in terms of performance, wasted bandwidth and CPU cycles,
but also has security implications.

Opening up a GraphQL endpoint to the public means that anyone can send any query to the server.
The server needs to parse and validate requests before they can be executed.
As we cannot trust the client, we need to harden the server against any kind of attack,
which is more or less impossible, as it's a cat-and-mouse game.
You can protect against all known attacks,
but eventually someone will find a new attack vector.

It's much better to use persisted operations than to expose a public GraphQL endpoint.
With persisted operations, you're registering all known Operations on the server during build time.
This doesn't just allow you to validate Operations ahead of time,
but also optimize them, strip out unused or duplicate fields, and so on.

Each Operation gets a unique ID, so in production, the client will just send this ID and the variables to the server.
The server can then execute the Operation without having to parse and validate it.
The only thing the server needs to do is to look up the Operation by ID, validate the variables and execute the Operation.
That's a lot less work than parsing and validating the Operation.
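
The exact wire format differs between implementations, but the idea looks roughly like this (all names and hashes below are illustrative):

persisted-operations.json, generated at build time:

{
  "dc5f1a": "query UserPage($id: ID!) { user(id: $id) { name } }"
}

What the client sends at runtime, instead of the full query text:

{ "operationId": "dc5f1a", "variables": { "id": "1" } }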

This post is about overfetching and underfetching,
so you might be wondering why I spent so much time talking about persisted operations.
Performance and security concerns aside, persisted operations help us achieve two things.

First, we're sending a lot less data per request, because we're not sending the content of the Operation.
Working with larger companies, we've seen Operations as large as 17 KB.
Imagine every client sending 17 KB of data with every request.

Second, persisted operations allow us to "optimize" the Operation at build time.
Let's get to that final point.

How Relay & WunderGraph solve the problem of underfetching, overfetching and waterfall requests

I want to emphasize that there's no silver bullet,
and there's definitely a learning curve when using Relay & WunderGraph for the first time,
but the benefits are huge.

Combining Relay with WunderGraph doesn't just solve the problem of inefficient data fetching;
it also makes our applications a lot more secure, faster, and easier to maintain.
Let's have a look at how the two open source projects can work together.

Let's take a look at some (simplified) example code to better illustrate the benefits.

// in pages/index.tsx
export default function Home() {
  const data = useQuery(
    graphql`
      query Home_posts {
        posts {
          id
          ...BlogPost_post
        }
      }
    `,
    null
  );

  return (
    <div>
      {data.posts.map((post) => (
        <BlogPost key={post.id} post={post} />
      ))}
    </div>
  );
}

// in components/BlogPost.tsx
export default function BlogPost({ post }: { post: Post }) {
  const data = useFragment(
    graphql`
      fragment BlogPost_post on Post {
        title
        content
        comments {
          id
          ...Comment_comment
        }
      }
    `,
    post
  );

  return (
    <div>
      <h1>{data.title}</h1>
      <p>{data.content}</p>
      <div>
        {data.comments.map((comment) => (
          <Comment key={comment.id} comment={comment} />
        ))}
      </div>
    </div>
  );
}

// in components/Comment.tsx
export default function Comment({ comment }: { comment: Comment }) {
  const data = useFragment(
    graphql`
      fragment Comment_comment on Comment {
        title
        content
      }
    `,
    comment
  );

  return (
    <div>
      <h2>{data.title}</h2>
      <p>{data.content}</p>
    </div>
  );
}

Through the use of Fragments, we can specify exactly the data we need at the component level.
This allows us to keep components decoupled, while still being able to fetch data efficiently with the least amount of requests.
Furthermore, colocating the data requirements with the component ensures that we really only fetch the data we need.

The use of useFragment has an additional benefit, as it acts as a data mask.
Instead of simply using the data from the parent component,
the Fragment ensures that we only use data that is specified in the Fragment.

Not using this approach ultimately leads to using data that was requested by other components,
adding unnecessary coupling between individual components.
If it's not clear where the data comes from,
developers will be afraid of removing fields from Queries,
because they don't know if they are used somewhere else.
This will ultimately lead to overfetching.
With Fragments and Relay on the other hand, we can be sure that we only use the data we need.
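
If you use Relay's generated TypeScript types, this guarantee even holds at compile time: the parent only sees an opaque fragment reference. A sketch, assuming the $key types the Relay compiler generates for the example above:

// Inside Home(): each element of data.posts is an opaque fragment
// reference (BlogPost_post$key) plus the explicitly selected id.
const first = data.posts[0];
first.id; // OK: 'id' was selected directly in the Home_posts query
// first.title; // TypeScript error: 'title' is masked behind BlogPost_post
// Only useFragment inside <BlogPost /> can unmask the fragment's data.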

The Relay compiler will hoist all Fragments to the top of the component tree and combine them into a single query for us.
Once this process is done, it will generate a hash for each Operation and store it in the wundergraph/operations/relay/persisted.json file.
WunderGraph will pick up this file and expand it into GraphQL Operations files,
with the hash as the name, and the Operation as the content.
Doing so will automatically register the Operation as a JSON RPC Endpoint on the WunderGraph Gateway.
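
The persisted.json file is essentially a map from operation hash to query text; illustratively (hash shortened, whitespace collapsed):

{
  "sha123": "query Home_posts { posts { id ...BlogPost_post } } fragment BlogPost_post on Post { title content comments { id ...Comment_comment } } fragment Comment_comment on Comment { title content }"
}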

E.g. if the hash of the Operation is sha123, WunderGraph will create a file called sha123.graphql with the following content:

query Home_posts {
  posts {
    id
    ...BlogPost_post
  }
}

fragment BlogPost_post on Post {
  title
  content
  comments {
    id
    ...Comment_comment
  }
}

fragment Comment_comment on Comment {
  title
  content
}

This Operation will be registered as a JSON RPC Endpoint on the following URL: http://localhost:9991/operations/relay/sha123.
We could use curl to execute this Operation with a simple HTTP GET request:

curl http://localhost:9991/operations/relay/sha123

As you can see, there's no need to send the content of the Operation;
we're just sending the ID of the Operation and the variables (in this case there are none).
As we're registering the Operation on the server during the build process,
we can be sure that the Operation is valid, normalized and optimized at build time.

This doesn't just limit the amount of data we're sending over the wire,
but also reduces the attack surface of our application to the bare minimum.

Now you might ask yourself, how does the Relay runtime know how to execute the Operation using the WunderGraph JSON RPC Protocol?
That's where code generation and the WunderGraph Relay Integration come into play.

First, Relay doesn't just persist the Operations, it also generates TypeScript types for us.
For each GraphQL Operation, it generates a TypeScript file that contains the Operation Hash alongside types for the Fragment.
The WunderGraph Relay Integration will pick up the Operation Hash, the type of the Operation (Query, Mutation, Subscription) and the variables
to make a JSON RPC Request to the WunderGraph Gateway.

All of this happens automatically by wrapping your application in the WunderGraphRelayProvider component:

// in pages/_app.tsx
import { WunderGraphRelayProvider } from '@/lib/wundergraph';
import '@/styles/globals.css';
import type { AppProps } from 'next/app';

export default function App({ Component, pageProps }: AppProps) {
    return (
        <WunderGraphRelayProvider initialRecords={pageProps.initialRecords}>
            <Component {...pageProps} />
        </WunderGraphRelayProvider>
    );
}

There's one more thing to note for Operations that use Variables.
During development, WunderGraph will automatically parse each Operation and extract the variable definitions.
We will then generate a JSON Schema for each Operation.
This JSON Schema will be used at runtime to validate the variables before executing the Operation.
This adds a layer of security and can even be customized,
so you could e.g. add custom validation rules like a minimum length for a password or a regex for an email address input field.
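
As a sketch, for an Operation with a single hypothetical variable $email: String!, the generated JSON Schema would look roughly like this, and a custom rule could then tighten the "email" property with a "pattern" or "format" constraint:

{
  "type": "object",
  "properties": {
    "email": { "type": "string" }
  },
  "required": ["email"]
}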

Conclusion

As you can see, Relay and WunderGraph are a perfect match.
The Relay compiler generates efficient GraphQL Operations and helps us to keep our components decoupled.
Together with WunderGraph, we can make sure to fetch the data as efficiently as possible while keeping our application secure.

As you can see from the examples above, I think there's a lot of potential to "get GraphQL right".
Our goal is to provide a standardized way to use GraphQL in your application without all the pitfalls.
We believe that Relay is underappreciated and that it needs to be easier to use and more accessible,
which is what we're trying to achieve with WunderGraph.

I'm really looking forward to your feedback and ideas on how we can improve the developer experience even further.

If you're interested in trying out this approach, you might want to look at some of these examples.

One more thing.
This is really just the beginning of our journey to make the power of GraphQL and Relay available to everyone.
Stay in touch on Twitter or join our Discord Community to stay up to date,
as we're soon going to launch something really exciting that will take this to the next level.
