Auto-Generate sitemap.xml in Next.js

Martin Beierling-Mutz ・3 min read

Hurray! You created all the components and styling for your beautiful and performant Next.js website. What now?

There are some key files you'll want to serve from the root of your exported package, but out of the box Next.js only copies files from the /static folder. So how do you add, for example, a sitemap.xml, and have it generated automatically so it's always up to date?

Let me show you how you could set this up for a 'yarn export'ed Next.js project.

Basic sitemap.xml structure

First, we'll need to take a look at the info a basic sitemap needs to have.

A list of...

  • URLs to each available page,
  • and an accompanying date, to let the search engine bot know where to find a page and when it was last changed.

That's it! If you want more info, you can check out Google's "Build and submit a sitemap" site.
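
As a rough illustration of that structure, the sketch below builds a single `<url>` entry from a page URL and a date. The helper names `escapeXml` and `buildUrlEntry` and the example.com address are made up for this example and are not part of the setup described in this post. The escaping step is there because the sitemap protocol expects characters such as `&` in URLs to be entity-escaped.

```javascript
// Hypothetical helpers for illustration only.

// Escape the five characters that must be entity-escaped in sitemap XML.
function escapeXml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Build one <url> entry: where the page lives and when it last changed.
function buildUrlEntry(loc, lastmod) {
  return [
    "<url>",
    `  <loc>${escapeXml(loc)}</loc>`,
    `  <lastmod>${escapeXml(lastmod)}</lastmod>`,
    "</url>"
  ].join("\n");
}

console.log(buildUrlEntry("https://example.com/blog?tag=a&b", "2018-10-03"));
```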

Gathering needed info

Before we can write the file into our exported /out folder, we have to gather the info we need: page URLs and last-modified dates.

To do this, I've built this function, which returns all files' paths inside the /pages folder:

const fs = require("fs");

module.exports = () => {
  const fileObj = {};

  const walkSync = dir => {
    // Get all files of the current directory & iterate over them
    const files = fs.readdirSync(dir);
    files.forEach(file => {
      // Construct whole file-path & retrieve file's stats
      const filePath = `${dir}${file}`;
      const fileStat = fs.statSync(filePath);

      if (fileStat.isDirectory()) {
        // Recurse one folder deeper
        walkSync(`${filePath}/`);
      } else {
        // Construct this file's pathname excluding the "pages" folder & its extension
        const cleanFileName = filePath
          .substr(0, filePath.lastIndexOf("."))
          .replace("pages/", "");

        // Add this file to `fileObj`
        fileObj[`/${cleanFileName}`] = {
          page: `/${cleanFileName}`,
          lastModified: fileStat.mtime
        };
      }
    });
  };

  // Start recursion to fill `fileObj`
  walkSync("pages/");

  return fileObj;
};

This will return an object, which looks like this for my website at the moment of writing:

{
  "/blog/auto-generate-sitemap-in-next-js": {
    "page": "/blog/auto-generate-sitemap-in-next-js",
    "lastModified": "2018-10-03T00:25:30.806Z"
  },
  "/blog/website-and-blog-with-next-js": {
    "page": "/blog/website-and-blog-with-next-js",
    "lastModified": "2018-10-01T17:04:52.150Z"
  },
  "/blog": {
    "page": "/blog",
    "lastModified": "2018-10-03T00:26:02.134Z"
  },
  "/index": {
    "page": "/index",
    "lastModified": "2018-10-01T17:04:52.153Z"
  }
}

As you can see, we have all the info we need to build our sitemap!
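
One detail in this output: the home page is listed as /index, while a statically exported site typically serves it at /. Whether that matters depends on your host, so treat the following as an optional sketch. `normalizePagePath` is a hypothetical helper, not part of the function above; it could be applied to each key before the URL is written into the sitemap.

```javascript
// Hypothetical helper: turn "/index" into "/" and "/blog/index" into "/blog".
// Paths that do not end in "/index" are returned unchanged.
function normalizePagePath(pagePath) {
  const withoutIndex = pagePath.replace(/\/index$/, "");
  return withoutIndex === "" ? "/" : withoutIndex;
}

console.log(normalizePagePath("/index"));
console.log(normalizePagePath("/blog/index"));
console.log(normalizePagePath("/blog/website-and-blog-with-next-js"));
```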

Creating the file when exporting

In Next.js, when you create your package of static files, you'll typically run yarn build && yarn export. We want to hook in after the export to create the sitemap.xml file in the /out folder.

To hook into any script defined in the package.json, we can add another script with the same name, but prefixed with "post".

The new package.json scripts section will look like this:

...
"scripts": {
  "dev": "next",
  "build": "next build",
  "start": "next start",
  "export": "next export",
  "postexport": "node scripts/postExport.js"
},
...

I chose to create a new folder "scripts" and create the "postExport.js" file in there. This script will now run after every "yarn export" call.

Generate the sitemap.xml contents

This scripts/postExport.js file will utilize the function we created previously to get all needed info:

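const fs = require("fs");
// Assumed file names: point these at wherever you saved the two helper modules
const getPathsObject = require("./getPathsObject");
const formatDate = require("./formatDate");

const pathsObj = getPathsObject();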

Then, we'll create the sitemap.xml content & file:

const sitemapXml = `<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"> 
  ${Object.keys(pathsObj).map(
    path => `<url>
    <loc>https://embiem.me${path}</loc>
    <lastmod>${
      formatDate(new Date(pathsObj[path].lastModified))
    }</lastmod>
  </url>`
  ).join("")}
</urlset>`;

fs.writeFileSync("out/sitemap.xml", sitemapXml);

That's it! Well, almost. I used a formatDate function to get the desired string format for our date.

You could call .toISOString() on pathsObj[path].lastModified and take the first ten characters, as it is a Date object (note that this gives the UTC date), or use a library like date-fns. I decided to copy a working solution from the web:

module.exports = function formatDate(date) {
  var d = new Date(date),
    month = "" + (d.getMonth() + 1),
    day = "" + d.getDate(),
    year = d.getFullYear();

  if (month.length < 2) month = "0" + month;
  if (day.length < 2) day = "0" + day;

  return [year, month, day].join("-");
};
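
For comparison, the shortcut of slicing the ISO date string, mentioned earlier as an alternative, could look roughly like this. `formatDateUtc` is a hypothetical name. Unlike formatDate above, which uses the machine's local time zone, this version always reports the UTC day, so the two can differ around midnight.

```javascript
// Hypothetical alternative to formatDate: take YYYY-MM-DD from the ISO string.
// toISOString() always reports UTC, so the day may differ from local time.
function formatDateUtc(date) {
  return new Date(date).toISOString().slice(0, 10);
}

console.log(formatDateUtc("2018-10-03T00:25:30.806Z")); // 2018-10-03
```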

Now run yarn export and a file at out/sitemap.xml appears!

Challenge! Create robots.txt

Based on this, it should be easy for you to create a robots.txt with your desired contents now.
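
As a possible starting point, here is a minimal sketch that allows all crawlers and points them at the sitemap. The helper names `buildRobotsTxt` and `writeRobotsTxt` are assumptions made for this example, and the rules are just one common choice; adjust them to what you actually want crawlers to see.

```javascript
const fs = require("fs");
const path = require("path");

// Hypothetical helper: allow all crawlers and reference the sitemap.
function buildRobotsTxt(siteUrl) {
  return [
    "User-agent: *",
    "Allow: /",
    `Sitemap: ${siteUrl}/sitemap.xml`,
    ""
  ].join("\n");
}

// Write the file next to sitemap.xml; `outDir` is assumed to exist already.
function writeRobotsTxt(outDir, siteUrl) {
  fs.writeFileSync(path.join(outDir, "robots.txt"), buildRobotsTxt(siteUrl));
}
```

In the post-export script this could be called as writeRobotsTxt("out", "https://embiem.me"), right after the sitemap has been written.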

If you want to know how I did it, check out the scripts folder in my website's repo.

Let me know whether you'd solve this differently.

Afterword

This was my first post in this community 👋. I plan to post much more in the future. You can find the original post on my Blog.

Martin Beierling-Mutz

@embiem

I develop modern web applications, utilizing industry standard frameworks such as React, Next.js & NestJS.

Discussion

 

Another example for a Next.js site. I've adjusted it slightly to also handle sub-folders:

const fs = require('fs');
const path = require('path');

const removeFileExt = (dirname) => {
  const parsedPath = path.parse(dirname);
  if (parsedPath.dir) {
    return `${parsedPath.dir}/${parsedPath.name}`;
  }
  return parsedPath.name;
};

module.exports = () => {
  const fileObj = {};

  const walkSync = (dir) => {
    const files = fs.readdirSync(dir);
    files.forEach((file) => {

      // specific for Next, skip files starting with '_' or between brackets '[...]'
      const partialNextFile = /^_\w*/;
      const dynamicNextFile = /^\[(.*?)\]/;
      if (partialNextFile.test(file) || dynamicNextFile.test(file)) {
        return;
      }

      const fullPath = `${dir}${file}`;
      const fileStat = fs.statSync(fullPath);

      if (fileStat.isDirectory()) {
        walkSync(`${fullPath}/`);
      } else {
        const parsedPath = path.parse(fullPath);
        if (!parsedPath.ext) {
          // skip system files
          return;
        }

        const filePathAry = path.format(parsedPath).split(path.sep);
        filePathAry.splice(0, 3);

        const fileDir = filePathAry.join('/');
        const cleanPath = removeFileExt(fileDir);

        fileObj[`/${cleanPath}`] = {
          page: `/${cleanPath}`,
          lastModified: fileStat.mtime,
        };
      }
    });
  };

  walkSync('./src/pages/');
  walkSync('./src/posts/');

  return fileObj;
};
 

Very nice! I've made some changes in my project; perhaps they are of use to someone.

In the walkSync method I used the 'dir' parameter to replace pages/. This makes the method reusable for any other paths such as posts.

// ... not shown for brevity
const cleanFileName = filePath.substr(0, filePath.lastIndexOf('.')).replace(dir, '');
// ... not shown for brevity

Also, by using an early return, I filtered out any files that do not compile to an actual page:

// ... not shown for brevity
files.forEach((file) => {
      const partialNextFile = /^_\w*/; // any filenames starting with _*
      const dynamicNextFile = /^\[(.*?)\]/; // any filenames between brackets [*]
      if (partialNextFile.test(file) || dynamicNextFile.test(file)) {
        // skip
        return;
      }
// ... not shown for brevity
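
To see what the two patterns catch, here is a small sketch that applies them to a few example file names. The names in `examples` and the `shouldSkip` wrapper are made up for illustration.

```javascript
// The two patterns from the snippet above, applied to example file names.
const partialNextFile = /^_\w*/; // names starting with "_", e.g. _app.js
const dynamicNextFile = /^\[(.*?)\]/; // names starting with a [bracketed] part

// Hypothetical wrapper combining both checks.
const shouldSkip = file =>
  partialNextFile.test(file) || dynamicNextFile.test(file);

const examples = ["_app.js", "_document.js", "[slug].js", "index.js", "blog"];
examples.forEach(file => {
  console.log(`${file}: ${shouldSkip(file) ? "skipped" : "kept"}`);
});
```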
 

Thanks! I think there's a bug in the code:

${Object.keys(pathsObj).map(
  path => `<url>
    <loc>https://embiem.me${path}</loc>
    <lastmod>${
      formatDate(new Date(pathsObj[path].lastModified))
    }</lastmod>
  </url>`
)}

Should be:

${Object.keys(pathsObj).map(
  path => `<url>
    <loc>https://embiem.me${path}</loc>
    <lastmod>${
      formatDate(new Date(pathsObj[path].lastModified))
    }</lastmod>
  </url>`
).join("")}

The original version is outputting commas between each XML element.
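
The commas come from how JavaScript converts an array to a string inside a template literal. A small sketch with made-up entries shows the difference:

```javascript
// Interpolating an array into a template literal calls its toString(),
// which joins the items with commas.
const entries = ["<url>a</url>", "<url>b</url>"];

const withCommas = `${entries}`;
const withoutCommas = `${entries.join("")}`;

console.log(withCommas); // <url>a</url>,<url>b</url>
console.log(withoutCommas); // <url>a</url><url>b</url>
```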

 

Hello Martin,

I am trying to add the sitemap and robots.txt files to my Next.js web app. I followed your post and generated the two files without any problems, but what is the next step to make the files accessible via URL in production?

The two files are in the out directory, but they are not accessible at their URLs online.

Sorry for my bad english, I hope you will understand my problem.

Charlotte

 

Hey. This really depends on your deployment setup. As long as all files in the out folder are pushed to your static file host, you should be good.