Dave Gray

Originally published at davegray.codes

Next.js: How to Build Sitemap and Robots.txt files

While setting up my Next.js blog, I've documented settings for metadata, favicons and canonical links which all reside in the <head> element. Today's topic does not reside inside the <head> element but is just as crucial for any website or blog: a sitemap.

And because one requires the other, we will also be creating a robots.txt file.

If you are not familiar with the importance of sitemaps and robots.txt files, you can learn about sitemaps on the Google Developers site.

Two Ways to Build a Sitemap

I have found two ways to build a sitemap for a Next.js website. I tried both, and I will guide you through them. In the end, I'll share which one I chose for my blog and why. Of course, your needs may be different from mine, so learning about both will help you decide, too.

1. Generate a sitemap with next-sitemap

The npm package next-sitemap was how I first learned to build sitemaps in Next.js. It is a very useful dependency and is still maintained as of this writing (November, 2023).

You can install next-sitemap like this:

npm i next-sitemap

Configure next-sitemap

After installing next-sitemap, you need to create a next-sitemap.config.js file in the root directory of your project. This is the same place your package.json file is.

Here's what my next-sitemap.config.js file looks like:

/** @type {import('next-sitemap').IConfig} */
module.exports = {
    siteUrl: 'https://www.davegray.codes/',
    exclude: ['/icon.svg', '/apple-icon.png', '/manifest.webmanifest', '/tags/*'],
    generateRobotsTxt: true,
    generateIndexSitemap: false,
    robotsTxtOptions: {
        policies: [
            {
                userAgent: '*',
                allow: '/',
            }
        ]
    }
}

You can review what each setting above does on the next-sitemap npm page, but I think you can quickly see I'm excluding some files and the /tags/* path from the generated sitemap. I'm also telling it to generate a robots.txt file.

One key benefit of generating a Next.js sitemap with the next-sitemap package is that it supports a sitemap index. This can be beneficial for large websites and blogs. We're talking sites with thousands of pages.
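
To make the idea of a sitemap index concrete, here is a minimal TypeScript sketch. It is not next-sitemap's actual implementation: it only splits a list of URLs into fixed-size chunks and builds an index file that points at one sitemap per chunk. The function names, the sitemap-N.xml naming scheme, the example.com URLs and the chunk size of 5,000 are all assumptions for illustration.

```typescript
// Split a list into chunks of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
    if (size < 1) throw new Error('size must be at least 1')
    const chunks: T[][] = []
    for (let i = 0; i < items.length; i += size) {
        chunks.push(items.slice(i, i + size))
    }
    return chunks
}

// Build a sitemap index that references one sitemap file per chunk.
// The sitemap-N.xml naming scheme is hypothetical.
function buildSitemapIndex(siteUrl: string, chunkCount: number): string {
    const base = siteUrl.replace(/\/+$/, '') // drop trailing slashes
    const entries = Array.from(
        { length: chunkCount },
        (_, i) => `    <sitemap><loc>${base}/sitemap-${i}.xml</loc></sitemap>`
    )
    return [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
        ...entries,
        '</sitemapindex>',
    ].join('\n')
}

// Example: 12,000 URLs at 5,000 per file need three sitemap files.
const urls = Array.from({ length: 12000 }, (_, i) => `https://example.com/posts/${i}`)
const files = chunk(urls, 5000)
const indexXml = buildSitemapIndex('https://example.com/', files.length)
```

Search engines fetch the index first and then follow each loc entry to the individual sitemap files, which is why this matters mostly for very large sites.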

Add to your post build

Your Next.js site needs to generate all static pages first, so creating the sitemap is not part of your build process. Instead, it gets generated in a postbuild process. We trigger that by adding the postbuild script to package.json:

"scripts": {
    "dev": "next dev",
    "build": "next build",
    "start": "next start",
    "lint": "next lint",
    "postbuild": "next-sitemap"
},

And now you can type npm run build in your terminal to build your site and generate a sitemap.xml file and a robots.txt file that you will find in your /public directory.

My sitemap.xml generated by next-sitemap:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:news="http://www.google.com/schemas/sitemap-news/0.9" xmlns:xhtml="http://www.w3.org/1999/xhtml" xmlns:mobile="http://www.google.com/schemas/sitemap-mobile/1.0" xmlns:image="http://www.google.com/schemas/sitemap-image/1.1" xmlns:video="http://www.google.com/schemas/sitemap-video/1.1">
    <url><loc>https://www.davegray.codes</loc><lastmod>2023-11-14T19:24:05.794Z</lastmod><changefreq>daily</changefreq><priority>0.7</priority></url>
    <url><loc>https://www.davegray.codes/posts/does-my-nextjs-blog-need-canonical-links</loc><lastmod>2023-11-14T19:24:05.794Z</lastmod><changefreq>daily</changefreq><priority>0.7</priority></url>
    <url><loc>https://www.davegray.codes/posts/nextjs-favicon-svg-icon-apple-chrome</loc><lastmod>2023-11-14T19:24:05.794Z</lastmod><changefreq>daily</changefreq><priority>0.7</priority></url>
    <url><loc>https://www.davegray.codes/posts/nextjs-ordering-merging-metadata</loc><lastmod>2023-11-14T19:24:05.794Z</lastmod><changefreq>daily</changefreq><priority>0.7</priority></url>
    <url><loc>https://www.davegray.codes/posts/ssg-ssr</loc><lastmod>2023-11-14T19:24:05.794Z</lastmod><changefreq>daily</changefreq><priority>0.7</priority></url>
</urlset>

My robots.txt file generated by next-sitemap:

# *
User-agent: *
Allow: /

# Host
Host: https://www.davegray.codes/

# Sitemaps
Sitemap: https://www.davegray.codes/sitemap.xml

2. Generate a sitemap with Next.js

Next.js now offers built-in sitemap generation. This was first introduced in version 13.3 and updated in v13.5.

sitemap.ts

The sitemap.ts example in the Next.js docs illustrates the return type of the sitemap function by returning hard-coded data. However, when we build the function ourselves, we can add the logic we need to generate that return value instead of typing out the URLs one by one.

Here's what my sitemap.ts file looks like:

import { MetadataRoute } from 'next'
import { getPostsMeta } from '@/lib/posts'

export default async function sitemap(): Promise<MetadataRoute.Sitemap> {
    const allPosts = await getPostsMeta()

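    const home = {
        url: 'https://www.davegray.codes/',
        // ISO 8601 string; toString() is not a valid sitemap date format
        lastModified: new Date().toISOString(),
    }

    // Guard against both missing data and an empty list of posts
    if (!allPosts || allPosts.length === 0) return [home]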

    const posts = allPosts.map(post => ({
        url: `https://www.davegray.codes/posts/${post.id}`,
        lastModified: post.modified,
    }))

    // Date of most recent post
    home.lastModified = allPosts[0].date

    return [home, ...posts]
}

Above, I'm calling my getPostsMeta function to get the front matter data from my MDX files. After confirming I received the data, I map over it and create a sitemap entry for each blog post on my site. Finally, I assign the date of the most recent blog post to the lastModified field for the home page because the home page will display a link to the latest blog post.

The sitemap.ts file should be saved in the root of your app directory. This is the same location you will find your globals.css file in.
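
If you want to test this logic outside of Next.js, it can be pulled into a pure function. The following is a sketch under assumptions, not the code running on my blog: the PostMeta shape (id, date, modified) and the SitemapEntry type are hypothetical stand-ins for the front matter data and for an item of MetadataRoute.Sitemap. As a variation, it picks the most recent post date by sorting instead of relying on the posts arriving newest first.

```typescript
// Hypothetical shape of the front matter returned by getPostsMeta
type PostMeta = { id: string; date: string; modified: string }

// Local stand-in for one item of MetadataRoute.Sitemap
type SitemapEntry = { url: string; lastModified?: string | Date }

const SITE_URL = 'https://www.davegray.codes'

// Build the sitemap entries: the home page first, then one entry per post.
function buildSitemapEntries(posts: PostMeta[] = [], now: Date = new Date()): SitemapEntry[] {
    const entries: SitemapEntry[] = posts.map(post => ({
        url: `${SITE_URL}/posts/${post.id}`,
        lastModified: post.modified,
    }))

    // Dates in YYYY-MM-DD form sort correctly as plain strings,
    // so the last item after sorting is the most recent post date.
    const latest = posts.map(post => post.date).sort().pop()

    const home: SitemapEntry = {
        url: `${SITE_URL}/`,
        // With no posts at all, fall back to the current time
        lastModified: latest ?? now.toISOString(),
    }

    return [home, ...entries]
}

// Example with posts deliberately out of order
const sample = buildSitemapEntries([
    { id: 'nextjs-ordering-merging-metadata', date: '2023-11-10', modified: '2023-11-12' },
    { id: 'ssg-ssr', date: '2023-11-14', modified: '2023-11-14' },
])
```

Inside sitemap.ts, the default export could then simply await getPostsMeta() and return the result of this function.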

Generate a robots.txt file

Next.js also offers built-in robots.txt generation. This was first introduced in version 13.3.

Here is what my robots.ts file contains:

import { MetadataRoute } from 'next'

export default function robots(): MetadataRoute.Robots {
    return {
        rules: {
            userAgent: '*',
            allow: '/',
        },
        sitemap: 'https://www.davegray.codes/sitemap.xml',
    }
}

Your robots.ts file should also be saved in the root of your app directory.
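
To see how the object returned by the robots function maps to plain text, here is a rough sketch of a renderer. It is not Next.js's internal code: the RobotsConfig type is a simplified, hypothetical subset of MetadataRoute.Robots, and renderRobots is just an illustrative name.

```typescript
// Simplified, hypothetical subset of MetadataRoute.Robots
type RobotsRule = { userAgent: string; allow?: string; disallow?: string }
type RobotsConfig = { rules: RobotsRule | RobotsRule[]; sitemap?: string }

// Render a robots config as robots.txt text: one block per rule,
// followed by an optional Sitemap line.
function renderRobots(config: RobotsConfig): string {
    // Accept either a single rule or a list of rules
    const rules = Array.isArray(config.rules) ? config.rules : [config.rules]

    const blocks = rules.map(rule => {
        const lines = [`User-Agent: ${rule.userAgent}`]
        if (rule.allow !== undefined) lines.push(`Allow: ${rule.allow}`)
        if (rule.disallow !== undefined) lines.push(`Disallow: ${rule.disallow}`)
        return lines.join('\n')
    })

    if (config.sitemap) blocks.push(`Sitemap: ${config.sitemap}`)

    // Blank line between blocks, newline at the end of the file
    return blocks.join('\n\n') + '\n'
}

const robotsTxt = renderRobots({
    rules: { userAgent: '*', allow: '/' },
    sitemap: 'https://www.davegray.codes/sitemap.xml',
})
```

For the config from my robots.ts file, this produces a User-Agent line, an Allow line, a blank line and then the Sitemap line.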

The Output

After creating these files, you can check their output in dev mode by typing npm run dev in your terminal.

Here is the output for my sitemap found at localhost:3000/sitemap.xml:

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
    <url>
        <loc>https://www.davegray.codes/</loc>
        <lastmod>2023-11-14</lastmod>
    </url>
    <url>
        <loc>https://www.davegray.codes/posts/does-my-nextjs-blog-need-canonical-links</loc>
        <lastmod>2023-11-14</lastmod>
    </url>
    <url>
        <loc>https://www.davegray.codes/posts/nextjs-favicon-svg-icon-apple-chrome</loc>
        <lastmod>2023-11-13</lastmod>
    </url>
    <url>
        <loc>https://www.davegray.codes/posts/nextjs-ordering-merging-metadata</loc>
        <lastmod>2023-11-12</lastmod>
    </url>
</urlset>

Here is the output for my robots.txt file found at localhost:3000/robots.txt:

User-Agent: *
Allow: /

Sitemap: https://www.davegray.codes/sitemap.xml 

Note that this process differs from generating those files with next-sitemap. It is not a postbuild step, so you can check the output while running in dev mode. It also does not create static files in your public directory: you won't see these files on your computer, but you can view them at their URLs while your project is running.

Which Sitemap Generation Method Did I Choose for My Blog?

Both approaches have their benefits and there is no one specific correct answer here.

In the end, I chose to go with the first-class support that is now built-in with Next.js.

Notable Differences: next-sitemap vs. Next.js sitemap generation

In comparison, you can see the output above from the Next.js generation method is more minimalistic than that of next-sitemap. The next-sitemap output declares several XML namespaces for sitemap extensions (news, mobile, image, video) that I don't need. However, depending on your content, you might need one or more of those sitemap extensions.

The next-sitemap default config that I used also provides the <changefreq> and <priority> values. Next.js also supports these values (as of v13.4.5), but Google ignores changefreq and priority, so I don't feel like I need them.
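
If you do want those values with the built-in approach, sitemap entries accept optional changeFrequency and priority fields. Here is a minimal sketch that uses a local type as a stand-in for the entry shape in MetadataRoute.Sitemap so that it runs on its own. The withDefaults helper is hypothetical, and its default values are assumptions chosen to mirror the next-sitemap output shown earlier.

```typescript
type ChangeFrequency = 'always' | 'hourly' | 'daily' | 'weekly' | 'monthly' | 'yearly' | 'never'

// Local stand-in for one item of MetadataRoute.Sitemap
type SitemapEntry = {
    url: string
    lastModified?: string | Date
    changeFrequency?: ChangeFrequency
    priority?: number
}

// Fill in changeFrequency and priority only where an entry does not set them.
function withDefaults(
    entries: SitemapEntry[],
    changeFrequency: ChangeFrequency = 'daily',
    priority = 0.7
): SitemapEntry[] {
    return entries.map(entry => ({
        ...entry,
        changeFrequency: entry.changeFrequency ?? changeFrequency,
        priority: entry.priority ?? priority,
    }))
}

const decorated = withDefaults([
    // An explicit priority on the home page is kept as-is
    { url: 'https://www.davegray.codes/', lastModified: '2023-11-14', priority: 1 },
    { url: 'https://www.davegray.codes/posts/ssg-ssr', lastModified: '2023-11-14' },
])
```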

Google does use the <lastmod> value "if it's consistently and verifiably accurate". With the config I used, next-sitemap gave every URL the same new Date() value, which is the build time rather than the date the page actually changed. With Next.js generation, I could keep an accurate modified date in the front matter of my MDX files and extract that value to provide the "consistently and verifiably accurate" data that Google wants here.
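
One way to keep that value trustworthy is to take it only from front matter and never fall back to the build time. Below is a small sketch, assuming hypothetical date and optional modified front matter fields written as YYYY-MM-DD. The resolveLastModified helper is illustrative and not part of Next.js.

```typescript
// Hypothetical front matter fields
type FrontMatterDates = { date: string; modified?: string }

// Matches calendar dates such as 2023-11-14
const ISO_DATE = /^\d{4}-\d{2}-\d{2}$/

// Prefer the explicit modified date, fall back to the publish date,
// and return undefined rather than inventing a value.
function resolveLastModified(meta: FrontMatterDates): string | undefined {
    const candidate = meta.modified ?? meta.date
    if (!ISO_DATE.test(candidate)) return undefined

    const parsed = new Date(`${candidate}T00:00:00Z`)
    if (Number.isNaN(parsed.getTime())) return undefined

    // Reject impossible dates (for example 2023-02-30) that a parser
    // might otherwise roll over into the next month
    return parsed.toISOString().slice(0, 10) === candidate ? candidate : undefined
}
```

An entry whose date cannot be resolved can simply omit lastModified, which seems safer than publishing a date that is not accurate.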

In the robots.txt file, next-sitemap provides a Host value. I found that Google does not expect or require a Host value in your robots.txt file, and Host is not part of the original robots.txt specification. I didn't feel like I needed it in my robots.txt file after discovering this, and Next.js robots.txt generation does not include it.

If I had a large site (5000+ pages), I would have chosen next-sitemap. It currently makes it easy to generate a sitemap index, and as of this writing (November, 2023), Next.js does not support this. The Next.js docs do mention in a Good to know section that they do plan to support sitemap indexes in the future.


Let's Connect!

Hi, I'm Dave. I work as a full-time developer, instructor and creator.

If you enjoyed this article, you might enjoy my other content, too.

My Stuff: Courses, Cheat Sheets, Roadmaps

My Blog: davegray.codes

YouTube: @davegrayteachescode

X: @yesdavidgray

GitHub: gitdagray

LinkedIn: /in/davidagray

Buy Me A Coffee: You will have my sincere gratitude

Thank you for joining me on this journey.

Dave
