ASafaeirad

The Myth of GraphQL

It's often said that GraphQL fixes the problems of under-fetching and over-fetching. But is that really the case? In theory, it sounds promising. In practice, however, you might be trading problems for a pile of new ones that even the most sophisticated frameworks struggle to solve.

The Temptation to Put Everything in a Single Request

Imagine you're at an all-you-can-eat buffet, and someone advises you,

Just load up your plate with everything you might possibly want in one go; it'll save you trips!

Sounds efficient, right? That's akin to what GraphQL suggests:

Pack as much data as you need into a single request to avoid under-fetching, and I'll let you specify exactly what you want to prevent over-fetching.

Let's follow the advice!
To achieve this in a React application, we might hoist our data-fetching logic up to the highest level and pass down this huge data object to our presentational components.

Here is an example of the data schema we get for a query using Hasura and GraphQL-Codegen:

export type ProjectQuery = {
  __typename?: 'query_root',
  project_by_pk?: {
    __typename?: 'project',
    id: string,
    name: string,
    description?: string | null,
    status: SchemaTypes.ProjectStatusEnum,
    start_date?: string | null,
    due_date?: string | null,
    created_at: string,
    updated_at: string,
    households: Array<{
      __typename?: 'household_project',
      household: {
        __typename?: 'household',
        id: string,
        name: string,
        status: SchemaTypes.HouseholdStatusEnum,
        severity: SchemaTypes.HouseholdSeverityEnum,
        code?: string | null,
        created_at: string,
        updated_at: string,
        members_count?: number | null
    }}>
  } | null
};

Now, instead of neat, modular components with their own data queries, we have a massive, monolithic object—with an ad-hoc schema full of random nulls.

Step 1: We've just sacrificed co-location! But on the bright side, we have presentational components instead 🎉

The Quest for a Meaningful Schema

You're right to ask: "Why is the schema ad-hoc? It's a skill issue. Can't we make some meaningful entities?"
One approach is to create fragments like ProjectStatusFragment, HouseholdIdentityFragment, and HouseholdMembersFragment, and enforce their usage across the team.

But wait—do we need all the data behind these fragments every time? Fragments are meant to be reusable, but reusability can lead to over-fetching, which contradicts GraphQL's main promise:

Query exactly what you need on the client—no more, no less.

In the real world, use cases are infinite. To create meaningful fragments without over-fetching, we'd need an infinite number of fragments. That's neither practical nor efficient. So we default back to flexible schemas, letting each use case decide what data it needs.

This leads us back to square one with a lesson:

Every layer of abstraction or reuse introduces over-fetching, contradicting GraphQL's core promise.
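
To make that lesson concrete, here is a minimal sketch that models a fragment as a plain list of field names instead of a real GraphQL document. The fragment name comes from the section above; the field list and the two consumers (a status badge and a household card) are hypothetical.

```typescript
// Fields selected by a shared fragment (the field list is hypothetical).
const householdIdentityFragment: string[] = ["id", "name", "code", "status", "severity"];

// Fields each consumer actually reads (both consumers are hypothetical).
const statusBadgeFields: string[] = ["id", "status"];
const householdCardFields: string[] = ["id", "name", "code", "status", "severity"];

// Fields the fragment fetches that a given consumer never reads.
function overFetched(fragmentFields: string[], usedFields: string[]): string[] {
  return fragmentFields.filter((field) => usedFields.indexOf(field) === -1);
}

const badgeWaste = overFetched(householdIdentityFragment, statusBadgeFields);
const cardWaste = overFetched(householdIdentityFragment, householdCardFields);

console.log(badgeWaste); // fields fetched for the badge but never used
console.log(cardWaste); // fields fetched for the card but never used
```

The fragment fits the card exactly, but the badge pays for three fields it never touches. Trimming the fragment to suit the badge would break the card, which is the tension described above.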

The Problem of Nulls

Why are there so many random nulls in our data? The answer lies in GraphQL's design decisions regarding nullability:

TL;DR

In GraphQL, every field and every type is nullable by default. ... By defaulting every field to nullable, any field failure may result in just that field returning "null" rather than having a complete failure for the request.

This means our schemas are riddled with optional fields, leading to a data structure filled with nulls. It's not necessarily a bad design choice, but it's the reality when we work with GraphQL.
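
Here is a small sketch of what that means on the client. The types mirror the nullable fields of the generated ProjectQuery type shown earlier; the helper `describeDueDate` and its fallback text are assumptions made for illustration.

```typescript
// Shape mirroring some nullable fields of the generated ProjectQuery type.
type ProjectRow = {
  id: string;
  name: string;
  description?: string | null;
  due_date?: string | null;
};

type ProjectResult = { project_by_pk?: ProjectRow | null };

// Hypothetical helper: every optional level needs a fallback before rendering.
function describeDueDate(result: ProjectResult): string {
  const dueDate = result.project_by_pk?.due_date;
  // `== null` covers both null and undefined.
  return dueDate == null ? "No due date" : `Due ${dueDate}`;
}
```

A missing project, a null project, and a project without a due date all collapse into the same fallback branch, so the component can no longer tell them apart unless we add even more checks.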

Returning to the Core Issue

Now, skill issues aside, we're stuck with a massive data object that has an ad-hoc, partial schema. We need to pass this data to our presentational components, but how?

Option 1: Prop Drilling

One option is prop drilling. But is it practical to pass such a data schema without losing our sanity? Not really.

Consider the purpose of presentational components: they are free of side effects, loosely coupled, and therefore reusable and easy to test. By passing down this enormous, loosely typed object, we're tightly coupling our components to a specific query structure.

type Props = {
  households: Array<{
    __typename?: 'household_project',
    household: {
      __typename?: 'household',
      id: string,
      name: string,
      status: SchemaTypes.HouseholdStatusEnum,
      severity: SchemaTypes.HouseholdSeverityEnum,
      code?: string | null,
      created_at: string,
      updated_at: string,
      members_count?: number | null
    }
  }>
}

const HouseholdList = ({ households }: Props) => {}

Tight dependency isn't just about what a component uses or imports. In software development, dependency means "What information is this part of the code aware of?" When a piece of code is aware of specific information, it becomes responsible for reacting whenever that information changes. This means our HouseholdList component isn't just using the data; it is coupled to the exact structure of our query results. As a result, any change in the query triggers a change in our component's high-level API.

Is it tightly coupled? Absolutely.
Is it easy to test? Not at all.

Presentational components aren't free. They depend on their parent components to handle responsibilities and side effects like data fetching. By shifting this responsibility away from the components themselves, we introduce duplication. Every time we reuse these components in different contexts, we have to replicate the same data-fetching logic in their parent components.

In this scenario, we get the worst of both worlds: we don't reap the benefits of presentational components, but we still pay the costs.

And let's not forget, our data is littered with nulls. The bigger question is: should our components accept nullable values just because our I/O isn't reliable?

Here's the next lesson:

Passing a raw query result as props couples our components to the unpredictability of I/O.

Searching for Meaningful Interfaces

To untangle this mess, we might try to create meaningful, decoupled interfaces. We'll map our unwieldy data to what each component needs, embracing abstraction.
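
As a rough sketch of that mapping step, assume a list component only needs an id, a label, and a member count. The `HouseholdItem` type, the `toHouseholdItems` function, and the choice to default a missing count to 0 are all hypothetical; the row shape mirrors the generated type from earlier.

```typescript
// Row shape mirroring one entry of `households` in the generated query type.
type HouseholdRow = {
  household: {
    id: string;
    name: string;
    code?: string | null;
    members_count?: number | null;
  };
};

// Narrow, null-free view model (hypothetical) that a list component could accept.
type HouseholdItem = {
  id: string;
  label: string;
  membersCount: number;
};

// Resolve nulls once at the boundary so components never see them.
function toHouseholdItems(rows: HouseholdRow[]): HouseholdItem[] {
  return rows.map(({ household }) => ({
    id: household.id,
    label: household.code ? `${household.name} (${household.code})` : household.name,
    // Assumption: a missing count is displayed as 0.
    membersCount: household.members_count ?? 0,
  }));
}
```

The component now depends on `HouseholdItem` instead of the query shape, and the mapper is the only place that knows about both.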

But here's the kicker: good abstraction clashes with the "just query what you need" approach.

Why?

Let's attempt to create a Project entity and a mapper function:

type Project = {
  id: string;
  name: string;
  dueDate?: Date;
}

function toProject(data: X): Project { /* ... */ }

But what is X? If we assume it's the generated Project type from our GraphQL schema, we're in trouble. Consider this query:

const { data } = useQuery(gql`{ projects { id, dueDate } }`);

This data lacks the fields needed to map to our Project entity. We can't reliably map partial data to a full entity without risking runtime errors or inconsistent state.
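
A minimal sketch of the problem, assuming X has to accept whatever a partial selection returns. The `PartialProjectData` type and the decision to return null for unmappable input are assumptions, not something a codegen tool produces for us.

```typescript
// Domain entity from the text above.
type Project = {
  id: string;
  name: string;
  dueDate?: Date;
};

// What a partial selection such as `{ projects { id, dueDate } }` can yield:
// any field may be absent because the query did not select it.
type PartialProjectData = {
  id?: string;
  name?: string;
  dueDate?: string | null;
};

// Returns null instead of a half-built Project when required fields are missing.
function toProject(data: PartialProjectData): Project | null {
  if (data.id === undefined || data.name === undefined) {
    return null;
  }
  const project: Project = { id: data.id, name: data.name };
  if (data.dueDate != null) {
    project.dueDate = new Date(data.dueDate);
  }
  return project;
}
```

The mapper can only be safe by refusing partial input, which means every caller has to select `name` whether it needs it or not. That's over-fetching again, introduced by the abstraction itself.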

Option 2: Using Context

Okay, maybe crafting meaningful interfaces is off the table, but we can prevent our component interfaces from getting polluted by avoiding prop drilling altogether. "Aha! We'll use React's Context API!" We set up a provider and pass our data through context!

const Page = () => {
  const query = usePageQuery();

  return (
    <MyProvider value={query}>
      <MyChildren />
    </MyProvider>
  );
}

const MyChildren = () => {
  const { data, loading, error } = use(MyProvider);
  // Component logic
}

But hold on: aren't we just coupling MyChildren to usePageQuery via context? The coupling is less visible because it goes through dependency injection, but we'll get to that in a second. We have a bigger problem: since Apollo Client already provides a cache through ApolloProvider, we're adding a redundant layer here.

Simplifying our code, we might write:

const Page = () => {
  usePageQuery();
  return <MyChildren />;
};

const MyChildren = () => {
  const { data } = usePageQuery({ fetchPolicy: "cache-only" });
  // Component logic
};

Now you see me! Context doesn't solve our fundamental problem; it just obscures it.

The Challenge of Render-As-You-Fetch

In many cases, we don't need all the data upfront to start rendering. When we combine everything into one huge request, we make it harder to render parts of our application as soon as their data arrives.

Yes, we can use directives like @defer, but implementing them adds layers of complexity to both the client and server.

Additionally, we sometimes need different strategies for different data. For instance, we might want to render part of the data on the server and the rest on the client. In this case, we need to break our query into at least two separate queries. (Did I just miss dynamic and static data? 🤔)

const Page = () => {
  const serverQuery = useServerPageQuery();
  const clientQuery = useClientPageQuery({ ssr: false });
  /* ... */
}

Cache Invalidation: The Hidden Beast

When we mutate data, we need to update our cache. Sometimes, optimistic updates and manual cache manipulation aren't feasible. The safest route is often to refetch.

But with our all-in-one query, refetching means fetching the entire dataset again—a heavy, inefficient operation.

Is there a solution? Perhaps, but it would require sophisticated infrastructure that goes beyond what most app developers should implement. We're talking about systems that can intelligently manage partial cache invalidation.
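
To put a rough number on the refetch cost, here is a toy model rather than a real cache: each query simply lists the entity types it reads, and a mutation forces a refetch of every query that reads the mutated type. All query and entity names are hypothetical.

```typescript
// Each query lists the entity types it reads (all names are hypothetical).
type QueryDef = { name: string; reads: string[] };

const monolithic: QueryDef[] = [
  { name: "PageQuery", reads: ["project", "household", "member"] },
];

const split: QueryDef[] = [
  { name: "ProjectQuery", reads: ["project"] },
  { name: "HouseholdsQuery", reads: ["household"] },
  { name: "MembersQuery", reads: ["member"] },
];

// Queries that must be refetched after a mutation touches `entity`.
function queriesToRefetch(queries: QueryDef[], entity: string): QueryDef[] {
  return queries.filter((query) => query.reads.indexOf(entity) !== -1);
}

// Number of entity types re-downloaded by those refetches.
function refetchCost(queries: QueryDef[], entity: string): number {
  return queriesToRefetch(queries, entity).reduce(
    (sum, query) => sum + query.reads.length,
    0,
  );
}

console.log(refetchCost(monolithic, "household")); // whole page is refetched
console.log(refetchCost(split, "household")); // only the household query is refetched
```

In this model, updating one household drags the project and member data along with it in the monolithic case, while the split queries only reload what changed.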

The Cost of Chasing Zero Over-Fetching and Under-Fetching

Let's tally up the costs of striving for zero over-fetching and under-fetching:

  • Coupled Presentational Components
  • No Co-location
  • Low Signal-to-Noise Ratio: Massive generated types and null handling clutter our codebase.
  • Complex Render Strategies
  • Cache Management Nightmares

[Image: Thanos meme]

Is it worth it?

A Reality Check

In practice, many teams abandon the ideal of crafting minimal, all-encompassing queries. Instead, they opt for smaller, reusable data-fetching hooks like useUser, useComments, and useWhatever. They also leverage fragments to promote reusability and define cohesive entities within their GraphQL schemas.

But wasn't GraphQL's main selling point that it's a query language for the client—allowing us to request data in exactly the shape we need? Yet, in practice, we're using it more like a simple SDK, making straightforward data requests. Aren't we just replicating what could be achieved with RPC or REST calls, but with added complexity?

And yes, I recognize that GraphQL isn't inherently bad; it does solve certain problems more effectively than other solutions. It offers flexibility, strong typing, and a unified interface for data fetching. However, I believe it's time for us, as app developers, to rethink what we truly gain from GraphQL before adopting it.

If you're a tech giant like Facebook, equipped to build and maintain the sophisticated frameworks required to harness GraphQL's full potential, then by all means, leverage it.

However, for most small to medium-sized enterprises, adopting GraphQL without the necessary resources leads to complexity and frustration. Based on my experience, it often results in a tangled mess rather than streamlined data management.

Top comments (7)

Steven Brown • Edited

I'd like to start by noting that I absolutely love GraphQL, but it has its place and that place is not everywhere. I work as a staff engineer, team lead, and subgraph owner at one of the largest federated GraphQL implementations to date, at Walmart Global Tech.

Most of what you stated is pretty accurate from a front end perspective, as you primarily focused on under/over-fetching, but there are definitely ways to mitigate some of the headaches. Your type generation tooling is probably one of your best friends when consuming a GraphQL API, but that relies on another important factor: schema design.

Schema design can make or break a frontend engineer. The schemas are the gatekeeper to your sanity.

  • when or when not to use nullable fields
  • how you return errors - application versus GraphQL errors.
  • how you organize your graphs (namespacing for larger graphs)
  • consistency in naming conventions
  • when to use ENUMs and when not to

Federation:

  • how and when to use reference resolvers.
  • Creating better reference resolvers for near-perfect error communication

And the list really goes on.

One of the things mentioned was nullable fields. That is a very important part of schema design.

Take an array definition in GraphQL (I'm doing this on my phone so please excuse any minor syntax issues if I miss an autocorrect)

type arrayExample {
 propName: [String]
}

This is a horrible array design here. It's a nightmare for the front end.
propName could have the following outputs:

- null
- [null]
- []
- ["some value"]

A better practice would be to stick to the convention of:

type arrayExample {
 propName: [String!]!
}

Adding the ! after String and after the array means that propName must not be null (must return an array) and may be either an empty array or an array containing 1 or more strings.

That reduces us to the following potential outputs:

- []
- ["some value"]

This is just one example but it makes a night and day difference to a front end developer.

ASafaeirad

Hi @krd8ssb,

Thank you for taking the time to read my article and for sharing your valuable insights! I really appreciate hearing from someone with extensive experience in GraphQL, especially at the scale you're working with.

You're absolutely right—this article is heavily client-focused, and I agree that schema design is pivotal in GraphQL and can make or break the developer experience on the front end. However, some of the points I mentioned have roots in GraphQL's inherent design decisions and cannot be entirely solved through better schema design alone.

In my article, I aimed to highlight the reality of how many companies use GraphQL in practice, even when following best practices. To better illustrate this, I used some industry standard tools (like Hasura and GraphQL Code Generator) and avoided random GraphQL schemas and practices.

Also, poor schema design can indeed lead to issues like unnecessary nulls. However, even with a perfectly designed GraphQL schema, we still end up with nulls because of the inherent nature of dealing with I/O, as GraphQL's own best practices suggest.

I've tried my best to demonstrate that these problems are inherent in GraphQL's design and not merely the result of skill issues.

However, I still agree there can be way more benefits if we increase the scope beyond the client side only.

Anmol Baranwal

Interesting read. I think both have their own merits (depending on the scale of what we're working on), but we often end up complicating things more than we realize. Still, companies like Medium use GraphQL for obvious reasons.

Mahdi Sheibak

Thank you, very useful article!

Saurabh Rai

Here, GraphQL. It does this:

[Image]

Chua Kang Ming • Edited

I have to say, the issues this article raises are quite valid, but I'm not seeing any solution. Is it trying to say that REST has no such issues? These are not costs of moving from REST to GraphQL, because all of these issues also occur with REST, so I think this is quite a bad article.

GraphQL at least gives frontend developers a reliable data contract that we can use for codegen with TypeScript. That single reason is powerful enough on its own for every system to use GraphQL.

ASafaeirad

Thanks for your comment! The article isn't about promoting REST or denying its issues, but rather about highlighting what we need to sacrifice when adopting GraphQL, particularly on the client side. While GraphQL offers benefits like introspection and reliable type generation, these aren't exclusive to GraphQL. Technologies such as OpenAPI with Orval for REST, or tRPC, can provide comparable type safety and can be used in appropriate scenarios.