Introduction
When you build a site, you want it to appear near the top of Google search results; in other words, you need to improve its Search Engine Optimization (SEO).
Google ranks websites by many different criteria, but it first has to know that your pages exist and what to expect on them. That is why we need sitemap.xml and robots.txt.
robots.txt tells the Google crawler which files it may request from your website and which it may not.
Sitemap
Let's begin with what a sitemap represents and how it works.
A sitemap is a file where you provide information about the pages, videos, and other files on your site, and the relationships between them. Search engines like Google read this file to crawl your site more efficiently. A sitemap tells Google which pages and files you think are important in your site, and also provides valuable information about these files.
In short, sitemap.xml lists the pages of your website and describes how they relate to each other, and search engines use this file to index your site more accurately. You can also add optional details for each page, such as when it was last updated, how frequently it changes, and its priority relative to other pages.
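To make the optional fields concrete, here is a minimal sketch that builds a single sitemap entry. The helper name buildUrlEntry and its option names are made up for illustration; only the XML element names (loc, lastmod, changefreq, priority) come from the sitemap format itself.

```javascript
// Hypothetical helper: build one <url> entry, adding optional fields only when given.
function buildUrlEntry({ loc, lastmod, changefreq, priority }) {
  const parts = [`<loc>${loc}</loc>`];
  if (lastmod) parts.push(`<lastmod>${lastmod}</lastmod>`);
  if (changefreq) parts.push(`<changefreq>${changefreq}</changefreq>`);
  if (priority !== undefined) parts.push(`<priority>${priority.toFixed(1)}</priority>`);
  return `<url>${parts.join("")}</url>`;
}

const blogEntry = buildUrlEntry({
  loc: "https://yourapp.com/blog",
  lastmod: "2022-12-03",
  changefreq: "daily",
  priority: 0.7,
});
console.log(blogEntry);
```

Only loc is required; an entry without the other fields is still valid.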
Static Sitemap
When you have a static website, a static sitemap will do the job. In other words, if your website does not change frequently, you can write a simple .xml file by hand that tells the Google crawler which content you have.
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://yourapp.com</loc>
  </url>
  <url>
    <loc>https://yourapp.com/blog</loc>
  </url>
  <url>
    <loc>https://yourapp.com/libary</loc>
  </url>
  <url>
    <loc>https://yourapp.com/contact</loc>
  </url>
</urlset>
Dynamic Sitemap
On the other hand, if your site changes frequently, you need a dynamic sitemap. You can build one yourself by fetching all your routes and generating the .xml file, but in this post we will cover an easier way.
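For comparison, the manual approach might look roughly like the sketch below. The posts array, the /post/ route prefix, and the renderSitemap helper are assumptions for illustration; in a real project the records would be fetched from your CMS, database, or file system, and the file would be written to public/sitemap.xml rather than a temp directory.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical data: stands in for records fetched from a CMS or database.
const posts = [
  { slug: "npm-vs-yarn", updatedAt: "2022-12-03" },
  { slug: "programming-concepts", updatedAt: "2022-11-20" },
];

// Turn the fetched records into a <urlset> document.
// Assumes slugs contain no characters that need XML escaping.
function renderSitemap(siteUrl, records) {
  const urls = records.map(
    (r) => `  <url><loc>${siteUrl}/post/${r.slug}</loc><lastmod>${r.updatedAt}</lastmod></url>`
  );
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...urls,
    "</urlset>",
    "",
  ].join("\n");
}

// Written to a temp directory so the sketch is safe to run anywhere;
// in a Next.js project the target would be public/sitemap.xml.
const outDir = fs.mkdtempSync(path.join(os.tmpdir(), "sitemap-"));
const outFile = path.join(outDir, "sitemap.xml");
fs.writeFileSync(outFile, renderSitemap("https://yourapp.com", posts));
console.log(`Wrote ${outFile}`);
```

You would have to rerun a script like this on every content change, which is exactly the chore next-sitemap takes over.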
There is a great npm module called next-sitemap which does all the dirty work for you.
First, install it with the following command:
yarn add next-sitemap
Create a configuration file for next-sitemap to use (typically next-sitemap.config.js in the project root). There are many properties available, but we will use these three:
- siteUrl: sets the base URL of your website.
- generateRobotsTxt: generates a robots.txt file that lists the generated sitemaps. Default is false.
- sitemapSize: splits a large sitemap into multiple files. When the number of URLs exceeds this size, next-sitemap creates additional files, so you will have sitemap-0.xml, sitemap-1.xml, etc. Default is 5000.
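module.exports = {
  // Base URL of the site; falls back to a placeholder when SITE_URL is not set
  siteUrl: process.env.SITE_URL || 'https://yourapp.com',
  generateRobotsTxt: true, // (optional) also generate robots.txt
  sitemapSize: 7000, // start a new sitemap file after 7000 URLs
}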
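The effect of sitemapSize can be pictured with a small chunking sketch. This is not next-sitemap's actual implementation, only an illustration of the resulting file layout; splitIntoSitemaps and the generated URLs are hypothetical.

```javascript
// Hypothetical helper: split a list of URLs into numbered sitemap files
// holding at most sitemapSize URLs each.
function splitIntoSitemaps(urls, sitemapSize) {
  const files = [];
  for (let i = 0; i < urls.length; i += sitemapSize) {
    files.push({
      name: `sitemap-${files.length}.xml`,
      urls: urls.slice(i, i + sitemapSize),
    });
  }
  return files;
}

// 16000 made-up URLs with sitemapSize 7000 give three files: 7000, 7000 and 2000 URLs.
const manyUrls = Array.from({ length: 16000 }, (_, i) => `https://yourapp.com/post/${i}`);
const sitemapFiles = splitIntoSitemaps(manyUrls, 7000);
console.log(sitemapFiles.map((f) => `${f.name}: ${f.urls.length}`));
```

With fewer URLs than sitemapSize, the result is a single sitemap-0.xml, which matches the output shown later in this post.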
In your package.json, add a postbuild script. It is triggered automatically after a successful build and runs the next-sitemap command.
{
  "scripts": {
    "build": "next build",
    "postbuild": "next-sitemap"
  }
}
Output
After the build is done, you will find the generated sitemap.xml and sitemap-0.xml in the public folder. public is a static directory, so everything inside it is served from the root of your domain. This means the URLs of sitemap.xml and robots.txt will not be /public/sitemap.xml and /public/robots.txt, but simply /sitemap.xml and /robots.txt. If you set generateRobotsTxt to true, you will get a robots.txt file as well.
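The mapping from a file in public to its URL can be written down as a tiny sketch. publicFileToUrlPath is a hypothetical helper, not part of Next.js; it only mirrors the rule described above.

```javascript
// Hypothetical helper: map a file inside public/ to the URL path it is served from.
function publicFileToUrlPath(filePath) {
  const prefix = "public/";
  if (!filePath.startsWith(prefix)) {
    throw new Error(`${filePath} is not inside public/`);
  }
  // Drop the "public" segment; the rest is the path under the domain root.
  return "/" + filePath.slice(prefix.length);
}

console.log(publicFileToUrlPath("public/sitemap.xml"));
console.log(publicFileToUrlPath("public/robots.txt"));
```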
If you check sitemap.xml, you should see:
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<sitemap><loc>https://bojanjagetic.com/sitemap-0.xml</loc></sitemap>
</sitemapindex>
As you can see, there is only one location, which references our sitemap-0.xml. Let's open it and check its content:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:news="http://www.google.com/schemas/sitemap-news/0.9" xmlns:xhtml="http://www.w3.org/1999/xhtml" xmlns:mobile="http://www.google.com/schemas/sitemap-mobile/1.0" xmlns:image="http://www.google.com/schemas/sitemap-image/1.1" xmlns:video="http://www.google.com/schemas/sitemap-video/1.1">
<url><loc>https://bojanjagetic.com</loc><lastmod>2022-12-03T20:01:39.202Z</lastmod><changefreq>daily</changefreq><priority>0.7</priority></url>
<url><loc>https://bojanjagetic.com/routes/aboutme</loc><lastmod>2022-12-03T20:01:39.203Z</lastmod><changefreq>daily</changefreq><priority>0.7</priority></url>
<url><loc>https://bojanjagetic.com/routes/blog</loc><lastmod>2022-12-03T20:01:39.203Z</lastmod><changefreq>daily</changefreq><priority>0.7</priority></url>
<url><loc>https://bojanjagetic.com/post/npm-vs-yarn</loc><lastmod>2022-12-03T20:01:39.203Z</lastmod><changefreq>daily</changefreq><priority>0.7</priority></url>
<url><loc>https://bojanjagetic.com/post/programming-concepts</loc><lastmod>2022-12-03T20:01:39.203Z</lastmod><changefreq>daily</changefreq><priority>0.7</priority></url>
<url><loc>https://bojanjagetic.com/libary/crypto-scrapper</loc><lastmod>2022-12-03T20:01:39.203Z</lastmod><changefreq>daily</changefreq><priority>0.7</priority></url>
<url><loc>https://bojanjagetic.com/libary/github-card-npm-component</loc><lastmod>2022-12-03T20:01:39.203Z</lastmod><changefreq>daily</changefreq><priority>0.7</priority></url>
...
As you can see, it generated entries for all the routes I have, so the Google crawler knows which resources are available.
Robots
As we mentioned already, robots.txt tells the Google crawler which files and resources it can request, as well as the location of the sitemap. The content of the generated robots.txt looks something like this:
# *
User-agent: *
Allow: /
# Host
Host: https://bojanjagetic.com
# Sitemaps
Sitemap: https://bojanjagetic.com/sitemap.xml
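To see how a crawler might read this file, here is a simplified parsing sketch. parseRobotsTxt is a hypothetical helper that only collects Allow/Disallow rules per user agent and the Sitemap lines; it ignores wildcards and the other details of real crawler matching, so treat it as an illustration only.

```javascript
// Hypothetical, simplified robots.txt reader.
function parseRobotsTxt(text) {
  const result = { groups: {}, sitemaps: [] };
  let agent = null;
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.split("#")[0].trim(); // strip comments
    if (!line) continue;
    const idx = line.indexOf(":");
    if (idx === -1) continue;
    const field = line.slice(0, idx).trim().toLowerCase();
    const value = line.slice(idx + 1).trim();
    if (field === "user-agent") {
      agent = value;
      result.groups[agent] = result.groups[agent] || { allow: [], disallow: [] };
    } else if (field === "sitemap") {
      result.sitemaps.push(value);
    } else if ((field === "allow" || field === "disallow") && agent !== null) {
      result.groups[agent][field].push(value);
    }
    // Other fields (such as Host) are ignored in this sketch.
  }
  return result;
}

const robotsTxt = [
  "# *",
  "User-agent: *",
  "Allow: /",
  "# Host",
  "Host: https://bojanjagetic.com",
  "# Sitemaps",
  "Sitemap: https://bojanjagetic.com/sitemap.xml",
].join("\n");

const parsedRobots = parseRobotsTxt(robotsTxt);
console.log(JSON.stringify(parsedRobots, null, 2));
```

For the file above, the result contains one group for all user agents that allows everything, plus the sitemap location.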
Validation
Once everything is done and deployed, we can validate and double-check our work.
To validate the sitemap XML, you can use XML-sitemap.
To validate robots.txt, you can use the Google Search robots.txt tester.
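Before reaching for an online tool, a quick local sanity check can catch obvious mistakes. The sketch below is an assumption-laden stand-in for a real validator: extractLocs and checkSitemap are hypothetical helpers, the parsing uses a regular expression rather than an XML parser, and the prefix check is deliberately crude. The 50,000 URL limit per file comes from the sitemap protocol.

```javascript
// Extract every <loc> value from a sitemap string with a simple regular expression.
function extractLocs(xml) {
  return Array.from(xml.matchAll(/<loc>([^<]+)<\/loc>/g), (m) => m[1].trim());
}

// Return a list of human-readable problems; an empty list means the checks passed.
function checkSitemap(xml, siteUrl) {
  const problems = [];
  const locs = extractLocs(xml);
  if (locs.length === 0) problems.push("no <loc> entries found");
  if (locs.length > 50000) problems.push("more than 50,000 URLs in one file");
  for (const loc of locs) {
    if (!loc.startsWith(siteUrl)) problems.push(`${loc} is outside ${siteUrl}`);
  }
  return problems;
}

// Made-up sample with one entry that points at a different site.
const sampleSitemap = `<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url><loc>https://yourapp.com</loc></url>
<url><loc>https://yourapp.com/blog</loc></url>
<url><loc>https://other.example/page</loc></url>
</urlset>`;

const sitemapProblems = checkSitemap(sampleSitemap, "https://yourapp.com");
console.log(sitemapProblems);
```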
Conclusion
Now that we have sitemap.xml and robots.txt, Google can discover and crawl our pages more easily, which can lead to better visibility in Google search and, in turn, more visitors.