
Christian Nwamba

Can Next.js Handle 5000 Pages?

I wanted to share an experiment that pushes Next.js 13's SSR and SSG features to their limits. I built a website with 5000 server-rendered pages to see how Next.js would perform locally and in production. I work on the AWS Amplify service team, and I wanted to use our hosting service to build and deploy these pages and confirm it can handle a high page and image count. How much difference does it make to use static params vs dynamic params? Which one is optimal if you need a fast build time? Which one is optimal if you don't care about build time but want a blazing fast website? Let's find out.

By the end of this article, you'll see how I generated 5000 test images and records, uploaded them to Amazon S3 and DynamoDB using AWS Amplify, built a Next.js app to fetch and render the data and images, deployed it to Amplify Hosting, and recorded the performance impact and differences.

Sourcing and Uploading 5000 Pieces of Data

TL;DR: If you prefer not to follow the process outlined in this section, you can follow these links to access my 5000 records and 5000 images.

How do you find 5k Images?
The first challenge I faced was finding 5000 images, one for each page. Instead of relying on datasets from platforms like Kaggle, I took a more direct approach: I downloaded ten images from Unsplash and kept them in a local folder called 5k_src.

If you've used Unsplash, you'll recall that these raw images are typically high-resolution and large, often exceeding 1MB, so I optimized each image to reduce its file size. Make sure you have your own set of 10 optimized images in a folder of your choice.

Next, I wrote a script to duplicate these ten images until I had a total of 5000. It wasn't important to me that each image be unique, so duplicating ten distinct images was good enough. The script needs a source folder for the 10 images and a destination folder for the 5000 copies, so create a folder called demo and add two folders to it: 5k_dest and 5k_src.

Switch into the demo folder and run this CLI script in your terminal to duplicate the images:

dest="5K_dest"
src="5K_src"
for i in {1..500}; do cp "$src/b.jpg" "$dest/b$i.jpg"; done

The script sets a destination folder named 5k_dest to store the copied images and a source folder, 5k_src, containing the original image. It then runs a loop for 500 iterations.

Each iteration duplicates the image named b.jpg from the source folder and saves the duplicate in the destination folder with a unique name. The new names include a number (the value of i) that increases with each iteration, resulting in images named "b1.jpg", "b2.jpg", and so on up to "b500.jpg".

I used this approach to give each image a unique filename. Running the script once for each of the ten images in your 5k_src folder yields the full set of 5000, or you can use the Node sketch below to do all ten in one pass.
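
If you'd rather not edit and re-run the shell loop for every source image, the whole job can be done in one pass with Node. Below is a minimal sketch; the file name duplicate.mjs is my own, and it assumes your ten optimized images sit in 5k_src:

// duplicate.mjs: copies every image in 5k_src 500 times (10 x 500 = 5000)
import { copyFileSync, mkdirSync, readdirSync } from "fs";
import path from "path";

const src = "5k_src";
const dest = "5k_dest";
mkdirSync(dest, { recursive: true });

for (const file of readdirSync(src)) {
  const { name, ext } = path.parse(file); // "b" and ".jpg" for "b.jpg"
  for (let i = 1; i <= 500; i++) {
    // b.jpg -> b1.jpg ... b500.jpg, so every copy gets a unique filename
    copyFileSync(path.join(src, file), path.join(dest, `${name}${i}${ext}`));
  }
}

Run it with node duplicate.mjs from inside the demo folder.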

How do you find 5k Records?

Next, I needed to generate a list of random records. I used a tool called Mockaroo, which allows a free account to generate at most 1000 rows of records at a time.

After opening Mockaroo, I cleared all of its default fields and added new ones, as shown in the image below.

After adding the new fields, I set the Rows field to 1000, the maximum Mockaroo can generate on a free account. Once I had that set up, I clicked Generate Data 5 times to produce 5 CSV files, each containing 1000 records, for 5000 in total.

Now, to merge these 5 CSV files, I turned to Google Sheets. If you are following along, you can do the same with the following steps:

  1. Open Google Sheets.
  2. Create a new file.
  3. Click on "File" at the top left.
  4. Click "Import", select "Upload", and choose the first CSV file.
  5. Once the first file has uploaded, repeat the same steps for the remaining files, choosing the option to append to the current sheet when the import dialog appears.

Once the import was done, I used Autofill to add the image filenames to each record. This was straightforward since the filenames are sequential.

How do you upload 5k Images to the Cloud?

The next thing I needed to do was upload the 5000 images to the cloud, specifically Amazon Simple Storage Service (Amazon S3). Before I could do that, I had to create an AWS Amplify project. To do the same, follow these steps:

  • Navigate to your AWS console and search for AWS Amplify.
  • Select AWS Amplify to open the Amplify Console.
  • In the upper right-hand corner, select New app and choose Build an app from the dropdown menu.

Give the app a name (I called mine 5kpages) and click Confirm Deployment to deploy it.

Once the deployment is complete, click the Launch Studio button to open Amplify Studio.

The next thing I needed to do was create a storage instance to store the images. However, before I could do that, I needed to set up authentication.

To proceed with the authentication setup, click the Set up button. You can leave all the default selections as we won't be using authentication for this app; it is only required for using storage. Go ahead and click the Deploy button, acknowledge the warning, and select Confirm Deployment.

The authentication deployment process should take a minute or two. Once completed, you will see a confirmation message stating that authentication has been successfully deployed.

After setting up authentication, I set up storage and created a new S3 bucket. To do this, select the Storage option in the setup menu on the screen's left side.

In the authorization settings, ensure that signed-in users have permission to upload, view, and delete files, while guest users can only view and delete files. Finally, click the Create bucket button.

To view the bucket, navigate back to your AWS console, search for S3, and select it.

I called my bucket 5kpages. Select yours to open it.

To upload the images to the bucket, click the Upload button.

To upload the 5k images to the S3 bucket, I dragged and dropped the 5k_dest folder onto the page and clicked the Upload button, as shown below.

It took some time to upload all the images.

Once the upload is complete, return to Amplify Studio and select the File browser option in the side menu. In the public folder, you will find the 5k_dest folder containing all 5,000 images. You can browse through the pages to view them.

How do you upload 5k Records to the Cloud?
After I uploaded the images, the next step was to upload the 5,000 records to Amazon DynamoDB. To do this, follow these steps:

  1. Go back to your Amplify console.
  2. Select Data from the side menu.
  3. Click on the "+ Add model" button.
  4. Fill in the fields as shown in the image below.
  5. After filling in the fields, click the Save and Deploy button.

After creating the model, go to your AWS console and search for DynamoDB. Click Tables in the side menu; you will find a Product table with an item count of 0. Copy the table's name and head over to your terminal to create a Node app.

Before creating the Node app, I needed to download the 5kproducts.csv records file from Google Sheets.
To download the file, follow these steps:

  1. Open the Google Sheets containing the 5000 records.
  2. Click on "File" at the top left.
  3. Select "Download" and then choose "Comma Separated Values (.csv)".

After downloading the file, create a new folder and name it 5kdyno or any other preferred name. Place the downloaded CSV file inside this folder. Next, create a package.json file inside the folder and add the following to it:

{
  "name": "5kdyno",
  "version": "1.0.0",
  "description": "",
  "main": "index.js",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "keywords": [],
  "author": "",
  "license": "ISC",
  "dependencies": {
    "@aws-sdk/client-dynamodb": "^3.423.0",
    "@aws-sdk/lib-dynamodb": "^3.423.0",
    "aws-sdk": "^2.1468.0",
    "csv-parse": "^5.5.0",
    "csv-reader": "^1.0.12",
    "uuid": "^9.0.1"
  },
  "type": "module"
}

Run the following command in your terminal to install those dependencies:

npm install

Create an index.js file inside the folder and add the following to it:

import { DynamoDBClient } from "@aws-sdk/client-dynamodb";
import { PutCommand, DynamoDBDocumentClient } from "@aws-sdk/lib-dynamodb";
import { v4 } from "uuid";

import { parse } from "csv-parse";
import { createReadStream } from "fs";

// Credentials are resolved from your local AWS configuration (see the note below).
const client = new DynamoDBClient({});
const docClient = DynamoDBDocumentClient.from(client);

export const main = async () => {
  // Stream the CSV so all 5000 rows are never held in memory at once.
  const parser = createReadStream("5kproducts.csv").pipe(parse());
  for await (const row of parser) {
    console.log(row);
    const command = new PutCommand({
      // Replace with the table name you copied from DynamoDB.
      TableName: "Product-f6gl4tj2gzffyfynoogu-staging",
      Item: {
        id: v4(),
        createdAt: new Date().toJSON(),
        // Spelled "desription" to match the field name in the data model.
        desription: row[1],
        img: row[6],
        name: row[0],
        price: row[4],
        quantity: row[2],
        size: row[3],
        updatedAt: new Date().toJSON(),
        __typename: "Product",
      },
    });
    const response = await docClient.send(command);
    console.log(response);
  }
};
main();

The first thing you might notice is that I am not passing any credentials to the DynamoDB client. You can check out this AWS doc for guidance on setting up credentials on your machine.

The next thing I did was to create an instance of the DynamoDBClient and use this instance to create a DynamoDBDocumentClient. This client provides a higher-level interface for working with DynamoDB.

Next, I declared an asynchronous function called main and set up a stream that reads 5kproducts.csv row by row, piping it into a CSV parser. I then loop through each row as it is parsed and log it to the console.

The PutCommand creates a write operation against the DynamoDB table using the rows from the CSV. Finally, I sent the write operation to DynamoDB using the document client.

Don’t forget to replace the TableName with the name of the table you created.

Now, if you open your terminal and run the command node index.js, you will see that it is writing to the specified table.

This should take a few minutes to complete, but once it is done, go back to DynamoDB and refresh the page. Click on "Explore table items," and you should be able to see the items in the database.
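
As an aside, writing one item per request is the slow part here. If you want to speed it up, DynamoDB accepts up to 25 writes per request via BatchWriteCommand from @aws-sdk/lib-dynamodb. Here is a rough sketch reusing the docClient from above; a production version should also retry anything returned in UnprocessedItems:

import { BatchWriteCommand } from "@aws-sdk/lib-dynamodb";

// Writes items in chunks of 25, the BatchWriteItem limit.
async function batchWrite(items, tableName) {
  for (let i = 0; i < items.length; i += 25) {
    const chunk = items.slice(i, i + 25);
    const response = await docClient.send(
      new BatchWriteCommand({
        RequestItems: {
          [tableName]: chunk.map((Item) => ({ PutRequest: { Item } })),
        },
      })
    );
    console.log(response.UnprocessedItems); // retry these in a real run
  }
}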

Now, if you go back to Amplify and click the "Content" section from the side menu, you will be able to see the incoming content.

Rendering Data on the Browser

After uploading the images and records to the cloud, the next thing I needed to do was display the list of products on a website. To accomplish this, I created a Next.js app. Instead of starting from scratch, I created a starter project that includes Tailwind configuration and other setup. You can clone it by running the following degit command in your terminal:

npx degit christiannwamba/5kpages#starter

The starter project contains three components plus a ui/button component. The ProductList component receives items, loops through them, and renders them; within it, the S3Image component fetches images from S3. The project also includes a Pagination component that manages pagination.

After that, run the following command to install the dependencies:

npm install

The next thing I did was configure Amplify so the app can access the data and images we uploaded. To do that, go back to Amplify Studio and copy the pull command displayed.

From your project's directory, paste the copied command into the terminal and run it.

When you run the command, you will be redirected to your web browser to grant the CLI access. Once there, click 'Yes' to authenticate with Amplify Studio.

After that, return to the CLI. Here, you will be asked a series of questions to gather essential details about your project's configuration. Accept the default values highlighted in the image below:

Next, I configured Next.js so that it recognizes the source domain for the images. To do that, add the following to your next.config.js file:

/** @type {import('next').NextConfig} */

const nextConfig = {
  images: {
    domains: [
      "5kproducts-storage-dd7c40fc142146-staging.s3.us-east-1.amazonaws.com",
    ],
  },
};

module.exports = nextConfig;

Don't forget to replace the domain with the one you copied from your S3 bucket.
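
If you are on Next.js 12.3 or later, images.remotePatterns is a stricter alternative to domains; it lets you pin the protocol (and optionally a path) rather than allowlisting the whole host. An equivalent config for this demo would look something like this:

/** @type {import('next').NextConfig} */
const nextConfig = {
  images: {
    remotePatterns: [
      {
        protocol: "https",
        // Same S3 bucket host as above; replace with yours.
        hostname:
          "5kproducts-storage-dd7c40fc142146-staging.s3.us-east-1.amazonaws.com",
      },
    ],
  },
};

module.exports = nextConfig;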

Once the Amplify project was set up, I needed to generate the code for the GraphQL operations we can use to interact with our data.

To generate these operations, run the following command at the root of your project:

amplify add codegen

Accept the default values highlighted in the image below:

This command generates the GraphQL operations and saves them in the src/graphql directory. The generated operations are ready to be imported into your components for seamless interaction with your API.

To fetch and render the list of products, add the following to your app/page.js file:

import { API } from "aws-amplify";

import Pagination from "@/components/Pagination";
import ProductList from "@/components/ProductList";
import * as queries from "../src/graphql/queries";

async function fetchData(nextToken, prevToken, action) {
  const variables = {
    limit: 10,
  };
  if (action == "next" && nextToken) variables.nextToken = nextToken;
  if (action == "prev" && prevToken) variables.nextToken = prevToken;

  const allProducts = await API.graphql({
    query: queries.listProducts,
    variables,
  });

  return allProducts.data.listProducts;
}

async function Home({ searchParams }) {
  const nextToken = searchParams.nextToken;
  const prevToken = nextToken;
  const action = searchParams.action;

  const products = await fetchData(nextToken, prevToken, action);

  return (
    <div className="w-[800px] mx-auto py-24">
      <h1 className="text-2xl text-center pb-8">Products</h1>
      <ProductList items={products.items} />
      <Pagination nextToken={products.nextToken} prevToken={prevToken} />
    </div>
  );
}

export default Home;

In this file, I import several things. First, API from the AWS Amplify library, which allows me to interact with the GraphQL API. I also import the Pagination and ProductList components, the React components used to display the list of products and provide pagination functionality, respectively. Additionally, I import the queries from the ../src/graphql/queries file.

Next, I define an asynchronous function called fetchData that fetches product data based on provided parameters. In the function, I define an object called variables with a limit of 10, indicating that I want to fetch 10 products at a time.

I then modify variables for pagination based on the provided action and token. If the action is next and nextToken is provided, variables.nextToken is set to the value of nextToken, which fetches the next set of products. If the action is prev and prevToken is provided, it uses the prevToken to fetch the previous set of products.

The function then makes a GraphQL call using the queries.listProducts query and the variables to fetch the products data. Finally, it returns the fetched products list.

Next, I define the Home component, which serves as the main React component for displaying the product list on the homepage. This component is asynchronous and retrieves nextToken, prevToken, and action from the searchParams.
To fetch the required products list, the component calls the fetchData function with these parameters.

The returned JSX includes the page title, the list of fetched products passed to the ProductList component, and the Pagination component for handling pagination controls.
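
The Pagination component itself ships with the starter, so you don't need to write it. Still, to make the token flow concrete, here is a hypothetical sketch of how such a component could build the Next and Previous links; the starter's actual implementation may differ:

// components/Pagination.js (hypothetical sketch)
import Link from "next/link";

function Pagination({ nextToken, prevToken }) {
  return (
    <div className="flex justify-between pt-8">
      {/* Going back replays the token we captured for the current page */}
      <Link href={prevToken ? `/?nextToken=${prevToken}&action=prev` : "/"}>
        Previous
      </Link>
      {/* Going forward passes the nextToken that listProducts returned */}
      {nextToken && (
        <Link href={`/?nextToken=${nextToken}&action=next`}>Next</Link>
      )}
    </div>
  );
}

export default Pagination;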

If you go to your browser now, you will see the products displayed. Clicking the Next button shows the next set of products, and the Previous button shows the previous set.

Render Dynamic Page with Dynamic Params
I wanted to explore both dynamic parameters and statically generated pages and also compare the performance difference.

For dynamic parameters, we need to receive the id as a parameter. With this, we can start building the page to render individual product items.

Create a new folder called [id] in your app folder. Inside the [id] folder, create a page.js file and add the following code to it:

import { API } from "aws-amplify";
import * as queries from "@/src/graphql/queries";
import S3Image from "@/components/S3Image";
import Link from "next/link";
import { Button } from "@/components/ui/button";

async function Product({ params }) {
  console.log(params);
  const variables = {
    id: params.id,
  };
  const res = await API.graphql({
    query: queries.getProduct,
    variables,
  });

  const product = res.data.getProduct;
  return (
    <div className=" p-4 w-1/2">
      <div className="flex items-center mb-4">
        <div className="h-96 w-80 bg-slate-400 relative">
          <S3Image imageName={product.img} />
        </div>
        <div className="ml-4">
          <p className="text-lg font-semibold pb-2 text-slate-800">
            ${product.price}
          </p>
          <p className="text-xs">{product.quantity} left</p>
          <h3 className="text-lg">{product.name}</h3>
          <div>{product.desription}</div>

          <Link href={`/`} className="mt-8 block">
            <Button className="w-full">Go back</Button>
          </Link>
        </div>
      </div>
    </div>
  );
}

export default Product;
export const revalidate = 3000; // 50 minutes; Next.js requires a statically analyzable value, so expressions like 60 * 50 are not supported

The file begins with several imports. The API import is from the AWS Amplify library and is used to interact with the GraphQL API. The queries import is used to fetch details of a product. The S3Image import is used to display images stored in Amazon S3. The Link import from next/link is used for client-side transitions between routes. Lastly, the Button component is imported for UI purposes.

Next, I define an asynchronous component named Product. It begins by defining a variables object using the id from params; this id is used to query a specific product.

Next, the function makes a GraphQL call using the queries.getProduct query and the variables. The response contains the product data, which is extracted and stored in the product constant.

Finally, the returned JSX includes, for each product: an image rendered with the S3Image component, the product's price, the quantity left in stock, the product's name, the product's description, and a button that navigates the user back to the home page.

I also defined a constant named revalidate that tells Next.js how often (every 50 minutes) to re-check a page for new data. Setting a revalidation interval matters because it keeps responses from being cached for too long; if your content changes frequently, indefinite caching makes it hard to keep that content fresh.

In this demo, the main caching issue comes from using Amazon S3 for image storage. The images are not served from stable public URLs; instead, each URL is signed with a token that expires. Once the token expires, you can no longer access the image at the same URL and must request a new one.

This means that if Next.js caches your response and holds on to the URL, it will not know the image has become invalid. When it tries to render the image, nothing will display, unless you ask Next.js to refresh its cache and make a fresh request to storage for a new URL.

To address this, I set the expiry time for the S3 image URLs to one hour and configured Next.js to revalidate after 50 minutes. This way, Next.js makes a fresh fetch before the S3 URLs expire, invalidates its cache, and shows the updated page.
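
For context, this is roughly what the S3Image component in the starter does; the one-hour expiry corresponds to the expires option (in seconds) on Amplify's Storage.get. This is a sketch assuming Amplify v5, and the starter's actual component may differ:

// components/S3Image.js (hypothetical sketch)
import { Storage } from "aws-amplify";
import Image from "next/image";

async function S3Image({ imageName }) {
  // Returns a presigned URL that stays valid for one hour (3600 seconds).
  const src = await Storage.get(`5k_dest/${imageName}`, { expires: 3600 });
  return <Image src={src} alt={imageName} fill className="object-cover" />;
}

export default S3Image;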

Render Dynamic Page with Static Params
So far, we’ve seen how to achieve dynamic rendering with dynamic parameters. Now, let’s explore how to achieve dynamic rendering with static parameters.

To do this, go to your code editor and commit the changes you've made to the main branch. Then create a new branch by running the following command in your terminal:

git checkout -b static

Open your app/[id]/page.js and add the following code to the bottom of the file:

export async function generateStaticParams() {
  const variables = {
    limit: 5000,
  };

  const allProducts = await API.graphql({
    query: queries.listProducts,
    variables,
  });
  const items = allProducts.data.listProducts.items;

  return items.map((item) => ({
    id: item.id,
  }));
}

This function fetches the full list of products (all 5000) using the queries.listProducts GraphQL query, then maps over the list and extracts the IDs. Next.js uses the returned array of IDs for static generation at build time, creating a pre-rendered page for each product ID.
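
One caveat to hedge on: AppSync list queries paginate, so a single call with limit: 5000 can still come back with fewer items plus a nextToken. If you notice missing pages at build time, a loop like this (built on the same listProducts query) collects every ID:

export async function generateStaticParams() {
  let items = [];
  let nextToken = null;

  // Keep fetching until listProducts stops returning a nextToken.
  do {
    const res = await API.graphql({
      query: queries.listProducts,
      variables: { limit: 1000, nextToken },
    });
    items = items.concat(res.data.listProducts.items);
    nextToken = res.data.listProducts.nextToken;
  } while (nextToken);

  return items.map((item) => ({ id: item.id }));
}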

If you visit your browser, refresh the page, and click on items at random, you will notice a significant improvement in loading speed, a result of static generation.

Building Locally

Now let's take stock of what we've done so far and compare the build times for dynamic parameters versus static parameters.

To switch back to the main branch, run the following command in your terminal:

git checkout main

Run the following command in your terminal to see the build time for dynamic parameters:

npm run build

As shown in the image below, the build process generated only 5 static pages and was completed in 12 seconds.

Let’s do the same for static parameters. Run the following command in your terminal:

git checkout static
npm run build

You should notice that it iterates through and builds each of the pages. In the image below, you can see that the build took 3 minutes, or 180 seconds.

Building in Production

The next step is to deploy to production and test. Return to your code editor and publish the static branch. After that is done, publish the main branch as well.

Go to your AWS console, search for AWS Amplify, and select it from the list of services. Next, select the 5kpages app.

Select Hosting environments.

Select GitHub or your Git provider on the next page and click the Connect branch button.

Select the repository you intend to host, select the main branch for dynamic parameters, and click the Next button to proceed.

In the build settings page, select an environment or Create a new environment. Select an existing service role if you have one, or click the Create new role button to create a new role that allows Amplify Hosting to access your resources. Once you have made your selections, click the Next button to continue.

Review your repository details and app settings, then click Save and deploy.

Dev and User Experience Insights

While I was building and deploying the static and dynamic versions of the app, I kept track of a few numbers. I wanted to see whether any useful insights emerge when I consider the following:

  1. How fast is the build time?
  2. What is the perceived load speed for a visitor?

Dev Experience

How fast is the build time?

Measuring the build time was easy: iTerm (local) and AWS Amplify (production) print timestamps during the build, and all I needed to do was subtract them. The following table indicates that dynamic parameters are faster than statically generated pages at build time.

|                | Build Time (seconds) | Deploy Time (seconds) |
| -------------- | -------------------- | --------------------- |
| Dynamic Params | 12                   | 58                    |
| Static Params  | 180                  | 240                   |

(The deploy time accounts only for building the Next.js app; it does not include provisioning and backend build time.)

The reason for the difference is clear: Next.js has to build each of the 5000 pages when they are statically generated.

Develop in Dynamic Mode, Release in Static Mode

A strategy I stumbled upon that no one seems to talk about: don't call generateStaticParams when you are testing things out in a non-production environment. Waiting for 5000 pages to build on a staging server is a painful developer experience.

My recommendation is to only enable generateStaticParams in production, since production builds happen far less often. One way to wire that up is sketched below.
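
Gating the function on an environment variable lets non-production builds skip the 5000-page loop; returning an empty array leaves every page to render on demand. A sketch, where the ENABLE_SSG variable name is my own:

export async function generateStaticParams() {
  // Hypothetical flag: set ENABLE_SSG=true only in the production build environment.
  if (process.env.ENABLE_SSG !== "true") {
    return []; // nothing pre-rendered; pages fall back to dynamic rendering
  }

  const allProducts = await API.graphql({
    query: queries.listProducts,
    variables: { limit: 5000 },
  });

  return allProducts.data.listProducts.items.map((item) => ({ id: item.id }));
}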

User Experience

What is the perceived load speed for a visitor?

To get insights into what difference a statically generated site makes versus a dynamic one, I analyzed the website with PageSpeed Insights. Here's my log of the server and cache Speed Index for the home page, the dynamic params page, and the statically generated pages.

For context, I analyzed 6 websites from the Next.js Showcase and the average Speed Index was 2.6 seconds.

|                      | Home (seconds) | Dynamic Params (seconds) | Static Params (seconds) |
| -------------------- | -------------- | ------------------------ | ----------------------- |
| Speed Index (Server) | 3.7            | 2.9                      | 2.4                     |
| Speed Index (Cache)  | 3.1            | 2.1                      | 1.6                     |

(The server Speed Index was taken immediately after deployment succeeded. The cache Speed Index was taken by analyzing the same page again.)

My first observation was that the home page was relatively slower than the individual product pages because it has more images to render.

As expected, statically generated pages load about 1.2x faster than pages using dynamic parameters. In a real-world scenario, a page will have more text, fonts, and images than my demo, so take the 1.2x with a pinch of salt; the difference will be more significant in such cases.

Other User Experience Metrics

Speed Index gives you a holistic view, but it is not the only metric to consider when measuring your site's performance. What is speed? is a great article on why, and on what else to look at. With that in mind, I decided to leave a dump of all the images from PageSpeed Insights in case you want to dig deeper.

Home page from server:

Dynamic params page from server:

Statically generated page from server:

Home page from cache:

Dynamic params page from cache:

Static params page from cache:

Clean Up

To ensure that you don't have any unused resources in your AWS account, run the following command to delete all the resources created in this project if you don't intend to keep them:

amplify delete

My Opinion

This was an insightful experiment for me and it led to some interesting conclusions that will guide me in the next few months when using Next.js.

For my customers' experience: I will always default to statically generated pages in production, except when the page content changes frequently. Additionally, Amplify Hosting met my expectations by building all of the pages with no errors and no additional delays.

For my dev experience: I am going to ignore static page generation while working on dynamic pages. It's fine for it to be an afterthought, since Next.js designed generateStaticParams to be pluggable. If you'd like to learn more about Amplify Hosting, here is a guide on how to get started with it.

Top comments (7)

Konadu Akwasi Akuoko

Wow 🤩😍
I'm just blown away by the amount of work you've put into this. This article is so awesome 😎 If we can get a video I'll be grateful 😁

I'm blown away by the page speed insights for such a big site, it's insane how everything works out of the box most of the time.

If I want to clarify, please this means you did not use Vercel for hosting, but rather AWS amplify right?

Necmettin Begiter

Correct me if I'm wrong please.

The homepage lists 10 products with one image each, and it takes 3.1 seconds to load entirely even when the page is loading from the browser cache?

And a dynamic page lists a single product with a single image, and it takes 2.1 seconds to load entirely?

If these numbers are as I described, then this is too slow. I have PHP code that creates an entire page in 3 seconds; from zero to completed with a multi-level panel menu, search form with multiple selectboxes and datepickers, and 100 records with related fields included and multiple buttons for each record (that are created on the server based on multiple fields of that specific record), and at least 20 images.

Is this because of NextJS?

Raí B. Toffoletto

Amazing article ! Thank you for the effort and for sharing it. It would also be interesting to do a stress test with multiple connections to see how it behaves.

Martin

Thanks for this post!

> To address this, I have set the expiry time for the S3 images to one hour and configured Next.js to revalidate the image after 50 minutes. This way, Next.js makes a fetch request before the S3 images expire and invalidates the cache and show the updated page.

I think this would not work in production since revalidation only happens when a page is hit with a user's request. Next itself does not schedule revalidation compared to a cron job, the revalidation duration is just the minimum cache time and can be compared to stale-while-revalidate caching.

Nicolás Danelón

You really worked hard on this one, hmm?
I wonder what makes you doubt about this in first place?

Fauzul

Good luck on your next experiment 👍

Samir Alibabic

Amazing how much work you put into this. Thanks for sharing!