Bojan Jagetic

Posted on • Updated on • Originally published at bojanjagetic.com

Generate sitemap.xml and robots.txt in Nextjs

Introduction

When you build a site, you want it to rank as high as possible in Google search results; in other words, you need to improve its Search Engine Optimization (SEO).
Google ranks websites based on many different factors, but one of the most important is that it knows your site and knows what to expect on it. That is why we
need sitemap.xml and robots.txt.

robots.txt tells the Google crawler which files it can request from your website and which it cannot.

Sitemap

Let's begin with what a sitemap represents and how it works.

A sitemap is a file where you provide information about the pages, videos, and other files on your site, and the relationships between them. Search engines like Google read this file to crawl your site more efficiently. A sitemap tells Google which pages and files you think are important in your site, and also provides valuable information about these files.

What sitemap.xml does is basically define the relationships between the pages on a website. Search engines use this file to index your site more accurately. You can also add additional
information, such as when a page was last updated, how frequently it changes, its priority, etc.
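For example, a single <url> entry with these optional fields might look like the following (the values are purely illustrative):

<url>
  <loc>https://yourapp.com/blog</loc>
  <lastmod>2022-12-03</lastmod>
  <changefreq>weekly</changefreq>
  <priority>0.8</priority>
</url>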


Static Sitemap

When you have a static website, a static sitemap will do the job. In other words, when your website does not change frequently, you can write a simple .xml file
that tells the Google crawler which content you have.

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
      <loc>https://yourapp.com</loc>
  </url>
  <url>
      <loc>https://yourapp.com/blog</loc>
  </url>
  <url>
      <loc>https://yourapp.com/libary</loc>
  </url>
  <url>
      <loc>https://yourapp.com/contact</loc>
  </url>
</urlset>

Dynamic Sitemap

On the other hand, if your site changes frequently, you need a dynamic sitemap. You could do this manually by generating the .xml file after fetching all your pages, but in this post we will cover an easier way.
There is a great npm module called next-sitemap that does all the dirty work for you.

First, install it using the following command:

yarn add next-sitemap
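Or, if you use npm instead of Yarn:

npm install next-sitemap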

Create a sitemap configuration file for next-sitemap to use (typically next-sitemap.config.js in the project root). There are many properties available, but we will use these three:

  • siteUrl: used for setting the base URL of your website
  • generateRobotsTxt: generate a robots.txt file that lists the generated sitemaps. Default is false.
  • sitemapSize: split a large sitemap into multiple files by specifying a maximum number of URLs per file. Once that limit is exceeded, additional files are created, so you end up with sitemap-0.xml, sitemap-1.xml, etc. Default is 5000.

module.exports = {
  siteUrl: process.env.SITE_URL || 'https://yourapp.com',
  generateRobotsTxt: true, // (optional)
  sitemapSize: 7000
}
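next-sitemap exposes more options than the three above. A slightly fuller next-sitemap.config.js, assuming the changefreq, priority, and exclude options as documented by the package at the time of writing, could look like this:

module.exports = {
  siteUrl: process.env.SITE_URL || 'https://yourapp.com',
  generateRobotsTxt: true,
  sitemapSize: 7000,
  changefreq: 'daily',         // default <changefreq> written for every URL
  priority: 0.7,               // default <priority> written for every URL
  exclude: ['/admin', '/404'], // routes that should not appear in the sitemap
}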

In your package.json, add a postbuild script, which is automatically triggered after a successful build, and use it to run the next-sitemap command.

{
  "scripts": {
    "build": "next build",
    "postbuild": "next-sitemap"
  }
}
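With this in place, generating the sitemap becomes part of the normal build. Note that the automatic postbuild hook is an npm (and Yarn 1.x) lifecycle feature; Yarn 2+ no longer runs arbitrary pre/post scripts, so there you would chain the two commands yourself.

npm run build   # runs "next build", then "next-sitemap" via the postbuild hook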

Output

After the build is done, you will have a generated sitemap.xml and sitemap-0.xml in the public folder. Since public is Next.js's static directory, everything inside it is exposed at the root domain level, so the URL of our sitemap.xml and robots.txt will not be /public/sitemap.xml but simply /sitemap.xml. If you set generateRobotsTxt to true, you will get a robots.txt file as well.
If you open sitemap.xml, you should see:

<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<sitemap><loc>https://bojanjagetic.com/sitemap-0.xml</loc></sitemap>
</sitemapindex>

As you can see, there is only one location, referencing our sitemap-0.xml. Let's open it and check its content:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:news="http://www.google.com/schemas/sitemap-news/0.9" xmlns:xhtml="http://www.w3.org/1999/xhtml" xmlns:mobile="http://www.google.com/schemas/sitemap-mobile/1.0" xmlns:image="http://www.google.com/schemas/sitemap-image/1.1" xmlns:video="http://www.google.com/schemas/sitemap-video/1.1">
<url><loc>https://bojanjagetic.com</loc><lastmod>2022-12-03T20:01:39.202Z</lastmod><changefreq>daily</changefreq><priority>0.7</priority></url>
<url><loc>https://bojanjagetic.com/routes/aboutme</loc><lastmod>2022-12-03T20:01:39.203Z</lastmod><changefreq>daily</changefreq><priority>0.7</priority></url>
<url><loc>https://bojanjagetic.com/routes/blog</loc><lastmod>2022-12-03T20:01:39.203Z</lastmod><changefreq>daily</changefreq><priority>0.7</priority></url>
<url><loc>https://bojanjagetic.com/post/npm-vs-yarn</loc><lastmod>2022-12-03T20:01:39.203Z</lastmod><changefreq>daily</changefreq><priority>0.7</priority></url>
<url><loc>https://bojanjagetic.com/post/programming-concepts</loc><lastmod>2022-12-03T20:01:39.203Z</lastmod><changefreq>daily</changefreq><priority>0.7</priority></url>
<url><loc>https://bojanjagetic.com/libary/crypto-scrapper</loc><lastmod>2022-12-03T20:01:39.203Z</lastmod><changefreq>daily</changefreq><priority>0.7</priority></url>
<url><loc>https://bojanjagetic.com/libary/github-card-npm-component</loc><lastmod>2022-12-03T20:01:39.203Z</lastmod><changefreq>daily</changefreq><priority>0.7</priority></url>
... 

As you can see, it generated all the routes I have, so the Google crawler knows which resources are available.

Robots

As we mentioned already, robots.txt tells the Google crawler which files and resources can be requested, as well as the location of the sitemap. The content of the generated robots.txt looks something like the following:

# *
User-agent: *
Allow: /

# Host
Host: https://bojanjagetic.com

# Sitemaps
Sitemap: https://bojanjagetic.com/sitemap.xml
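If you need different rules, the same config file can shape the generated robots.txt. A minimal sketch, assuming next-sitemap's robotsTxtOptions as documented at the time of writing (the /admin path is just an illustrative example):

module.exports = {
  siteUrl: process.env.SITE_URL || 'https://yourapp.com',
  generateRobotsTxt: true,
  robotsTxtOptions: {
    policies: [
      { userAgent: '*', allow: '/' },         // allow everything by default
      { userAgent: '*', disallow: '/admin' }, // but keep crawlers out of /admin
    ],
  },
}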

Validation

Once everything is done and deployed, we can validate and double-check our work.
To validate the sitemap XML, you can use XML-sitemap.

To validate robots.txt, you can use the Google Search robots.txt tester.
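You can also do a quick sanity check from the command line once the site is deployed (replace the domain with your own):

curl -I https://yourapp.com/sitemap.xml   # should return 200 with an XML content type
curl https://yourapp.com/robots.txt       # should print the rules shown above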

Conclusion

Now that we have sitemap.xml and robots.txt, our site can get better visibility in Google search and be ranked higher, which means more visitors.
