Denis Gonchar

Next.js: How Do <Suspense /> and Component Streaming Work?

There is a distinct difference between ‘suspense’ and ‘surprise’, and yet many pictures continually confuse the two. I’ll explain what I mean. We are now having a very innocent little chat. Let’s suppose that there is a bomb underneath this table between us. Nothing happens, and then all of a sudden, ‘Boom!’ There is an explosion. The public is surprised, but prior to this surprise, it has seen an absolutely ordinary scene, of no special consequence.

Alfred Hitchcock

In this article, we'll look at how the <Suspense /> tag works in Next.js with Server-Side Rendering (SSR). We'll go down to the HTTP protocol level to see what happens when you wrap your components with the tag. Let's begin.

Streaming, what is it?

Before we dive into "Components Streaming" it's essential to understand what HTTP streaming is in and of itself. When your User Agent (for example, a browser or a curl command) sends an HTTP request to a server, the server replies with something like:

HTTP/1.1 200 OK␍␊
Date: Mon, 27 Jul 2009 12:28:53 GMT␍␊
Content-Length: 12␍␊
Content-Type: text/plain␍␊
␍␊
Hello World!

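I've added the ␍␊ markers (carriage return + line feed, or CRLF) to the HTTP response texts because this sequence has a special meaning in HTTP: it terminates the status line and every header line, and an empty line consisting only of ␍␊ separates the headers from the body.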

The first line, HTTP/1.1 200 OK, tells us that everything is fine and the server has responded with a 200 OK code. Following this, we have three lines that are known as headers. In our example, these three headers are Date, Content-Length, and Content-Type. We can think of them as key-value pairs, where the keys and values are delimited by a : sign.

Following the headers, there's an empty line, serving as a delimiter between the header and the body sections. After this line, we encounter the content itself. Given the prior information from the headers, our browser understands two things:

  1. It needs to download 12 bytes of content (the string Hello World! comprises just 12 characters).
  2. Once downloaded, it can display this content or provide it to the callback of a fetch request.

In other words, the response body ends once we've read 12 bytes after the empty line.
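
To make the framing rules concrete, here is a minimal sketch of how a client could split such a response into headers and body. The function name parseFixedLengthResponse is made up for this illustration, and the code assumes an ASCII body (one character per byte) that has fully arrived; real HTTP clients work on raw bytes and handle many more cases.

```typescript
// Illustrative only: a tiny parser for a complete HTTP/1.1 response that
// carries a Content-Length header. Assumes an ASCII body, so one character
// equals one byte.
interface ParsedResponse {
  statusLine: string;
  headers: Map<string, string>;
  body: string;
}

function parseFixedLengthResponse(raw: string): ParsedResponse {
  // The empty line (CRLF CRLF) separates the headers from the body.
  const separator = raw.indexOf("\r\n\r\n");
  if (separator === -1) throw new Error("headers are not complete yet");

  const [statusLine, ...headerLines] = raw.slice(0, separator).split("\r\n");
  const headers = new Map<string, string>();
  for (const line of headerLines) {
    // Split on the first colon only: header values may contain colons too.
    const colon = line.indexOf(":");
    // Header names are case-insensitive, so normalize them to lower case.
    headers.set(line.slice(0, colon).trim().toLowerCase(), line.slice(colon + 1).trim());
  }

  const length = Number(headers.get("content-length"));
  if (!Number.isInteger(length)) throw new Error("no usable Content-Length");

  // The body is exactly `length` bytes, counted from just after the empty line.
  const bodyStart = separator + 4;
  const body = raw.slice(bodyStart, bodyStart + length);
  if (body.length < length) throw new Error("body is still arriving");
  return { statusLine, headers, body };
}

const raw =
  "HTTP/1.1 200 OK\r\n" +
  "Date: Mon, 27 Jul 2009 12:28:53 GMT\r\n" +
  "Content-Length: 12\r\n" +
  "Content-Type: text/plain\r\n" +
  "\r\n" +
  "Hello World!";

console.log(parseFixedLengthResponse(raw).body); // prints: Hello World!
```

The important part is the last step: with Content-Length, the receiver knows in advance exactly where the body ends.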

Now, what happens if we omit the Content-Length header from our server response? When the Content-Length header is absent, many HTTP servers will implicitly add a Transfer-Encoding: chunked header. This type of response can be interpreted as, "Hi, I'm the server, and I'm not sure how much content there will be, so I'll send the data in chunks." So a response will look like:

HTTP/1.1 200 OK␍␊
Date: Mon, 27 Jul 2009 12:28:53 GMT␍␊
Transfer-Encoding: chunked␍␊
Content-Type: text/plain␍␊
␍␊
5␍␊
Hello␍␊

At this point, we haven't received the entire message, only the first 5 bytes. Notice how the format of the body differs: first, the size of the chunk is sent (as a hexadecimal number of bytes), followed by the content of the chunk itself. At the end of each chunk, the server adds a ␍␊ sequence.

Now, let's consider receiving the second chunk. How might that appear?

HTTP/1.1 200 OK␍␊
Date: Mon, 27 Jul 2009 12:28:53 GMT␍␊
Transfer-Encoding: chunked␍␊
Content-Type: text/plain␍␊
␍␊
5␍␊
Hello␍␊
7␍␊
 World!␍␊

We've received an additional 7 bytes of the response. But what transpired between Hello␍␊ and 7␍␊? How was the response processed in that interval? Imagine that before sending the 7, the server took 10 seconds pondering the next word. If you were to inspect the Network tab of your browser's Developer Tools during this pause, you'd see the response from the server had started and remained "in progress" throughout these 10 seconds. This is because the server hadn't indicated the end of the response.

So, how does the browser determine when the response should be treated as "completed"? There's a convention for that. The server must send a 0␍␊␍␊ sequence. In layman's terms, it's saying, "I'm sending you a chunk that has zero length, signifying there's nothing more to come." In the Network tab, this sequence will mark the moment the request has concluded.

HTTP/1.1 200 OK␍␊
Date: Mon, 27 Jul 2009 12:28:53 GMT␍␊
Transfer-Encoding: chunked␍␊
Content-Type: text/plain␍␊
␍␊
5␍␊
Hello␍␊
7␍␊
 World!␍␊
0␍␊
␍␊
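
The chunk format above is simple enough to sketch in a few lines. The helpers encodeChunked and decodeChunked below are hypothetical names for this illustration; they assume ASCII text (one character per byte) and ignore chunk extensions and trailers, which a real implementation would have to handle.

```typescript
// Illustrative only: encode and decode a chunked body. Assumes ASCII text
// (one character per byte) and ignores chunk extensions and trailers.
function encodeChunked(parts: string[]): string {
  const chunks = parts
    .filter((part) => part.length > 0) // a zero-length chunk would end the body early
    .map((part) => `${part.length.toString(16)}\r\n${part}\r\n`);
  // The zero-length chunk plus a final CRLF terminates the body.
  return chunks.join("") + "0\r\n\r\n";
}

interface DecodeResult {
  chunks: string[];
  complete: boolean; // true once the terminating sequence has been seen
}

function decodeChunked(body: string): DecodeResult {
  const chunks: string[] = [];
  let position = 0;
  while (position < body.length) {
    const lineEnd = body.indexOf("\r\n", position);
    if (lineEnd === -1) break; // the size line has not fully arrived
    const size = parseInt(body.slice(position, lineEnd), 16); // sizes are hexadecimal
    if (Number.isNaN(size)) throw new Error("malformed chunk size");
    if (size === 0) {
      // Complete only if the CRLF after the zero-length chunk is present too.
      return { chunks, complete: body.length >= lineEnd + 4 };
    }
    const dataStart = lineEnd + 2;
    const dataEnd = dataStart + size;
    if (body.length < dataEnd + 2) break; // the chunk data has not fully arrived
    chunks.push(body.slice(dataStart, dataEnd));
    position = dataEnd + 2; // skip the CRLF that follows the chunk data
  }
  return { chunks, complete: false };
}

const wire = encodeChunked(["Hello", " World!"]);
console.log(JSON.stringify(wire)); // "5\r\nHello\r\n7\r\n World!\r\n0\r\n\r\n"
console.log(decodeChunked(wire).complete); // true
console.log(decodeChunked("5\r\nHello\r\n").complete); // false
```

The last line mirrors the "in progress" state from the Network tab: one chunk has arrived, but without the terminating sequence the receiver cannot treat the response as finished.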

The Nuances of HTTP Transmission

In the realm of HTTP headers, it's important to understand the distinction between Content-Length: <number> and Transfer-Encoding: chunked. At first glance, Content-Length: <number> might suggest that data isn't streamed, but this isn't entirely accurate. While this header does indicate the total length of the data to be received, it doesn't imply that the data is transmitted as a single massive piece. Underneath the HTTP layer, TCP/IP dictates the actual transmission mechanics, which inherently involve breaking data down into smaller packets.

So, while the Content-Length header tells the receiver that the body is complete once the specified number of bytes has accumulated, the actual data transfer happens incrementally at a lower level. Browsers take advantage of this and can start rendering before the entire body has been received, which is particularly beneficial for formats that lend themselves to progressive rendering, such as HTML.

The Transfer-Encoding: chunked header, on the other hand, offers explicit framing at the HTTP level: each chunk is marked with its size as it's sent. This provides more flexibility, especially for dynamically generated content or when the full content length is unknown at the outset.

<Suspense />

Alright, now we've grasped one foundational concept that's pivotal for Component Streaming in Next.js. Before delving into <Suspense />, let's first articulate the problem it addresses. Sometimes, seeing is more instructive than a lengthy explanation. So, let's craft a helper function for illustration:

export function wait<T>(ms: number, data: T) {
  return new Promise<T>((resolve) => {
    setTimeout(() => resolve(data), ms);
  });
}

This function will assist us in simulating exceedingly prolonged, fake requests.

To start, initialize a Next.js app using npx create-next-app@latest. Clear out any unnecessary elements, and paste the following code into app/page.tsx:

import { wait } from "@/helpers/wait";

const MyComponent = async () => {
  const data = await wait(10000, { name: "Denis" });
  return <p>{data.name}</p>;
};

export const dynamic = "force-dynamic";

export default async function Home() {
  return (
    <>
      <p>Some text</p>
      <MyComponent />
    </>
  );
}

This structure provides a simple page layout: a text block containing “Some text” and a component that waits for 10 seconds before outputting the data.

Now, run npm run build && npm run start, then request the page with curl localhost:3000 (or open it in a browser). What do we observe?

We experience a delay of 10 seconds before receiving the entire page content, including both “Some text” and “Denis”. For users, this means they won't be able to view the “Some text” content while <MyComponent /> is fetching its data. This is far from ideal; the browser tab's spinner would keep spinning for a solid 10 seconds before displaying any content to the user.

However, if we wrap our component in the <Suspense /> tag and try again, the response starts instantly. Let's look at how this works. We wrap our component in <Suspense> and pass a fallback prop with the value "Waiting for MyComponent...".

import { Suspense } from "react";

export default async function Home() {
  return (
    <>
      <p>Some text</p>
      <Suspense fallback={"Waiting for MyComponent..."}>
        <MyComponent />
      </Suspense>
    </>
  );
}

Now let us open it in a browser.
When you inspect the Network tab in DevTools, you'll observe that the server's response is still ongoing or "hasn't yet completed." Examining the "Response Headers" section of the request, you'll find the "Transfer-Encoding: chunked" entry.

Now, we observe that the string provided as the fallback prop for <Suspense /> temporarily stands in for the <MyComponent />. After the 10-second wait, we're then presented with the actual content. Let's scrutinize the HTML response we've received.

<!DOCTYPE html>
<html lang="en">
<head>
    <!-- Omitted -->
</head>
<body class="__className_20951f">
    <p>Some text</p><!--$?-->
    <template id="B:0"></template>
    Waiting for MyComponent...<!--/$-->
    <script src="/_next/static/chunks/webpack-f0069ae2f14f3de1.js" async=""></script>
    <script>(self.__next_f = self.__next_f || []).push([0])</script>
    <script>self.__next_f.push(/* Omitted */)</script>
    <script>self.__next_f.push(/* Omitted */)</script>
    <script>self.__next_f.push(/* Omitted */)</script>
    <script>self.__next_f.push(/* We haven't received a chunk that closes this tag...

While we haven't yet received the complete page, we can already view its content in the browser. But why is that possible? Browsers parse and render HTML incrementally as it arrives, and they are tolerant of errors such as unclosed tags. Consider a scenario where you visit a website, but because a developer forgot to close a tag, the site doesn't display correctly. Although browser developers could enforce strict error-free HTML, such a decision would degrade the user experience. As users, we expect web pages to load and display their content, regardless of minor errors in the underlying code. To ensure this, browsers implement numerous mechanisms under the hood to compensate for such issues. For instance, if there's an opened <body> tag that hasn't been closed, the browser will automatically "close" it. This is done in an effort to deliver the best possible viewing experience, even when faced with imperfect HTML.

Next.js capitalizes on this inherent browser behavior when implementing Component Streaming. By pushing chunks of content as they become available and leveraging browsers' ability to interpret and render partial or even slightly malformed content, Next.js ensures faster perceived load times and enhances user experience.

The strength of this approach lies in its alignment with the realities of web browsing. Users generally prefer immediate feedback, even if it's incremental, over waiting for an entire page to load. By sending parts of a page as soon as they're ready, Next.js optimally meets this preference.

Now, observe this segment:

<!--$?-->
  <template id="B:0"></template>
  Waiting for MyComponent...
<!--/$-->

We can spot our placeholder text adjacent to an empty <template> tag bearing the B:0 id. Further, we can discern that the response from localhost:3000 is still underway. The trailing script tag remains unclosed. Next.js uses a placeholder template to make room for forthcoming HTML that will be populated with the next chunk.

After the next chunk has arrived, we have the following markup (I've added some newlines to make it more readable). Don't attempt to unminify the $RC function it contains in your head: this is the completeBoundary function, and a commented version can be found in the React source code.

<p>Some text</p>

<!--$?-->
<template id="B:0"></template>
Waiting for MyComponent...
<!--/$-->

<!-- <script> tags omitted -->

<div hidden id="S:0">
  <p>Denis</p>
</div>

<script>
  $RC = function (b, c, e) {
    c = document.getElementById(c);
    c.parentNode.removeChild(c);
    var a = document.getElementById(b);
    if (a) {
      b = a.previousSibling;
      if (e)
        b.data = "$!",
          a.setAttribute("data-dgst", e);
      else {
        e = b.parentNode;
        a = b.nextSibling;
        var f = 0;
        do {
          if (a && 8 === a.nodeType) {
            var d = a.data;
            if ("/$" === d)
              if (0 === f)
                break;
              else
                f--;
            else
              "$" !== d && "$?" !== d && "$!" !== d || f++
          }
          d = a.nextSibling;
          e.removeChild(a);
          a = d
        } while (a);
        for (; c.firstChild;)
          e.insertBefore(c.firstChild, a);
        b.data = "$"
      }
      b._reactRetry && b._reactRetry()
    }
  }
  ;
  $RC("B:0", "S:0")
</script>

We receive a hidden <div> with the id="S:0". This contains the markup for <MyComponent />. Alongside this, we are presented with an intriguing script that defines a global variable, $RC. This variable references a function that performs some operations with getElementById and insertBefore.

The concluding statement in the script, $RC("B:0", "S:0"), invokes the aforementioned function and supplies "B:0" and "S:0" as arguments. B:0 is the ID of the template that sits next to our fallback and marks the boundary, while S:0 is the ID of the newly received <div>. In short, the $RC function says: "Take the markup from the S:0 div and put it in place of the fallback marked by the B:0 template." Along the way, it removes the template and the hidden div and changes the opening comment from $? to $.
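
For readers who prefer code to prose, here is a simplified, DOM-free model of that swap. It is not React's implementation: the ModelNode shapes and the completeBoundaryModel function are hypothetical, and the error branch of $RC (the one that sets $!) is left out. It only mirrors the walk from the template to the matching <!--/$--> comment and the replacement of everything in between.

```typescript
// A simplified, DOM-free model of what $RC("B:0", "S:0") does. This is not
// React's code: the node shapes and the function name are hypothetical.
type ModelNode =
  | { kind: "comment"; data: string }
  | { kind: "template"; id: string }
  | { kind: "text"; value: string }
  | { kind: "element"; html: string };

function completeBoundaryModel(
  siblings: ModelNode[],
  boundaryId: string,
  segment: ModelNode[],
): ModelNode[] {
  const templateIndex = siblings.findIndex(
    (node) => node.kind === "template" && node.id === boundaryId,
  );
  // The template must exist and be preceded by the opening comment.
  if (templateIndex < 1) return siblings;

  // Walk forward to the matching <!--/$-->, skipping nested boundaries.
  let depth = 0;
  let end = templateIndex;
  for (; end < siblings.length; end++) {
    const node = siblings[end];
    if (node.kind !== "comment") continue;
    if (node.data === "/$") {
      if (depth === 0) break;
      depth--;
    } else if (node.data === "$" || node.data === "$?" || node.data === "$!") {
      depth++;
    }
  }

  return [
    ...siblings.slice(0, templateIndex - 1),
    { kind: "comment", data: "$" }, // "$?" (pending) becomes "$" (completed)
    ...segment, // the real content replaces the template and the fallback
    ...siblings.slice(end), // keeps the closing <!--/$--> and everything after it
  ];
}

const before: ModelNode[] = [
  { kind: "element", html: "<p>Some text</p>" },
  { kind: "comment", data: "$?" },
  { kind: "template", id: "B:0" },
  { kind: "text", value: "Waiting for MyComponent..." },
  { kind: "comment", data: "/$" },
];
const after = completeBoundaryModel(before, "B:0", [
  { kind: "element", html: "<p>Denis</p>" },
]);
console.log(after.map((node) => node.kind).join(" ")); // element comment element comment
```

The depth counter is there for the same reason as the f variable in the minified code: a fallback may itself contain nested boundaries, and their closing comments must not be mistaken for ours.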

Let's refine that for clarity:

  1. Initiating the Chunked Transfer: Next.js begins by sending the Transfer-Encoding: chunked header, signaling the browser that the response length is undetermined at this stage.
  2. Executing Home Page: As the Home page executes, it encounters no await operations. This means no data fetching is blocking the response from being sent immediately.
  3. Handling the Suspense: Upon reaching the <Suspense /> tag, it uses the fallback value for immediate rendering, while also inserting a placeholder <template /> tag. This will be used later to insert the actual HTML once it's ready.
  4. Initial Response to the Browser: What's been rendered so far is sent to the browser. Yet, the "0␍␊␍␊" sequence hasn't been sent, indicating the browser should expect more data to come.
  5. Component Data Request: The server communicates with MyComponent, requesting its data and essentially saying, "We need your content, let us know when you're ready."
  6. Component Rendering: After MyComponent fetches its data, it renders and produces the corresponding HTML.
  7. Sending the Component's HTML: This HTML is then sent to the browser as a new chunk.
  8. JavaScript Attachment: A small inline script sent along with the chunk then moves this HTML into the place marked by the <template /> tag from step #3, replacing the fallback.
  9. Termination Sequence: Finally, the server sends the termination sequence, signaling the end of the response.
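
The sequence above can be imitated with an async generator, where every yield stands for a piece of the response pushed to the browser. This is a toy model under stated assumptions, not Next.js internals: streamPage and main are illustrative names, wait mirrors the helper from earlier in the article, and the markup strings are shortened versions of what we saw in the response.

```typescript
// A toy model of the streaming sequence, not Next.js internals. The names
// streamPage and main are illustrative; wait mirrors the article's helper.
function wait<T>(ms: number, data: T): Promise<T> {
  return new Promise<T>((resolve) => setTimeout(() => resolve(data), ms));
}

async function* streamPage(delayMs: number): AsyncGenerator<string> {
  // Start the slow work right away, but do not await it yet.
  const pending = wait(delayMs, { name: "Denis" });

  // Steps 2-4: the shell goes out immediately with the fallback and a
  // <template> placeholder marking the boundary.
  yield '<p>Some text</p><!--$?--><template id="B:0"></template>Waiting for MyComponent...<!--/$-->';

  // Steps 5-7: once the data resolves, the rendered HTML is sent as a new
  // piece of the response inside a hidden container.
  const data = await pending;
  yield `<div hidden id="S:0"><p>${data.name}</p></div>`;

  // Step 8: an inline script asks the browser to swap the content in.
  yield '<script>$RC("B:0", "S:0")</script>';

  // Step 9: returning ends the stream, which corresponds to the final
  // zero-length chunk on the wire.
}

async function main(): Promise<string[]> {
  const received: string[] = [];
  for await (const chunk of streamPage(50)) {
    received.push(chunk);
    console.log(`piece ${received.length}: ${chunk.length} characters`);
  }
  return received;
}

const done = main();
```

The first piece is logged immediately, while the other two only show up after the simulated delay. In the real response the hidden <div> and the script may well travel in the same chunk; they are separated here only to keep the steps visible.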

Diving into Multiple <Suspense />

Handling a singular <Suspense /> tag is straightforward, but what if a page has multiple of them? How does Next.js cope with this situation? Interestingly, the core approach doesn't deviate much. Here's what changes when managing multiple <Suspense /> tags:

Fallbacks at the Forefront: Each <Suspense /> tag comes equipped with its own fallback. During the rendering phase, all these fallback values are leveraged simultaneously, ensuring that every suspended component offers a provisional visual cue to the user. This is an extension of the third point from our previous list.

Unified Request for Content: Just as with a single <Suspense />, Next.js sends out a unified call to all components wrapped within the <Suspense /> tags. It's essentially broadcasting, "Provide your content as soon as you're ready."

Waiting for All Components: The termination sequence is of utmost importance, signaling the end of a response. However, in cases with multiple <Suspense /> tags, the termination sequence is held back until every single component has sent its content. This ensures that the browser knows to expect, and subsequently render, the content from all components, providing a holistic page-view to the end user.
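
Here is a rough sketch of these three points, again as a toy model rather than actual Next.js or React code. The Boundary interface, streamBoundaries, and collect are hypothetical names, and reusing one number for both the B: and S: ids is a simplification. The sketch assumes that completed boundaries are flushed in the order they finish, which may differ from their order in the document.

```typescript
// A toy model of several <Suspense /> boundaries on one page. All names are
// hypothetical; this is not how Next.js or React implement streaming.
function wait<T>(ms: number, data: T): Promise<T> {
  return new Promise<T>((resolve) => setTimeout(() => resolve(data), ms));
}

interface Boundary {
  id: number;
  fallback: string;
  load: () => Promise<string>; // resolves to the component's HTML
}

async function* streamBoundaries(boundaries: Boundary[]): AsyncGenerator<string> {
  // Start every load at once and tag each promise with its boundary id.
  const pending = new Map<number, Promise<{ id: number; html: string }>>();
  for (const b of boundaries) {
    pending.set(b.id, b.load().then((html) => ({ id: b.id, html })));
  }

  // All fallbacks are part of the first piece of the response.
  yield boundaries
    .map((b) => `<!--$?--><template id="B:${b.id}"></template>${b.fallback}<!--/$-->`)
    .join("");

  // Emit completions in the order they finish, not in document order.
  while (pending.size > 0) {
    const { id, html } = await Promise.race(pending.values());
    pending.delete(id);
    yield `<div hidden id="S:${id}">${html}</div><script>$RC("B:${id}", "S:${id}")</script>`;
  }
  // Only now does the generator return, i.e. the response terminates.
}

async function collect(boundaries: Boundary[]): Promise<string[]> {
  const chunks: string[] = [];
  for await (const chunk of streamBoundaries(boundaries)) chunks.push(chunk);
  return chunks;
}

const demo: Boundary[] = [
  { id: 0, fallback: "Loading A...", load: () => wait(40, "<p>A</p>") },
  { id: 1, fallback: "Loading B...", load: () => wait(10, "<p>B</p>") },
];

collect(demo).then((chunks) => console.log(chunks.length)); // prints 3
```

In this run the second boundary finishes first, so its content is delivered before the first one's, and the stream ends only after both have been sent.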

The advent of features like <Suspense /> in Next.js underscores the framework's dedication to enhancing user experience. By tapping into the innate behavior of browsers and optimizing content delivery, Next.js ensures users encounter minimal wait times and see content as swiftly as possible. This deep dive into the inner workings of component streaming and chunked transfer encoding reveals the intricate dance of protocols, rendering, and real-time adjustments that takes place behind the scenes. As web developers, understanding these nuances not only makes us better at our craft but also equips us to deliver seamless and responsive digital experiences for our users. Embrace the future of web development with Next.js, where efficiency meets elegance.
