Docusaurus, the popular static site generator, simplifies the process of building documentation websites. Among its powerful features is the sitemap generator, a vital component for enhancing a site's search engine optimization (SEO).
The isNoIndexMetaRoute function determines whether a specific route should be excluded from the sitemap. It checks the HTML head of a page for a meta tag with the name "robots" and the content "noindex". If such a tag is found, the route is excluded, so pages marked as 'noindex' never appear in the generated sitemap.
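The check described above can be sketched as follows. This is a minimal illustration, not the Docusaurus source: the MetaTag shape and the names hasNoIndexMeta and isNoIndexRoute are hypothetical, and the head data is modeled as plain objects instead of whatever structure Docusaurus collects at build time. Matching on a substring, so that values such as "noindex, nofollow" also count, is a choice made for this sketch.

```typescript
// Simplified stand-in for the meta tags collected for one page.
// (Assumption: plain objects; the real head data has a different shape.)
interface MetaTag {
  name?: string;
  content?: string;
}

// True when any tag looks like <meta name="robots" content="...noindex...">.
function hasNoIndexMeta(metaTags: MetaTag[]): boolean {
  return metaTags.some(
    (tag) =>
      tag.name?.toLowerCase() === "robots" &&
      (tag.content ?? "").toLowerCase().includes("noindex"),
  );
}

// Hypothetical per-route lookup, mirroring the idea of a `head` entry per route.
// Routes with no recorded head data are treated as indexable.
function isNoIndexRoute(
  head: Record<string, MetaTag[]>,
  route: string,
): boolean {
  const tags = head[route];
  return tags !== undefined && hasNoIndexMeta(tags);
}
```

A route whose head contains only, say, a description tag is kept, even if the word "noindex" appears in that tag's content, because the tag name must be "robots".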
Input Parameters: The main sitemap generator function takes several parameters: siteConfig (the Docusaurus configuration), routesPaths (an array of route paths), head (the HTML head content for each route), and options (the plugin options for the sitemap).
Exclusion Logic: The function filters out routes that should not be included in the sitemap. Routes ending with 404.html are excluded, as are routes matching patterns specified in ignorePatterns from the options. Additionally, routes with 'noindex' meta tags, identified by isNoIndexMetaRoute, are excluded.
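To make the three exclusion rules concrete, here is a rough sketch of the filtering step. Several things are assumptions of this example: the names globToRegExp and selectSitemapRoutes are hypothetical, the patterns are treated as simple globs handled by a tiny hand-rolled matcher, and the 'noindex' check is represented by a precomputed set of routes.

```typescript
// Hypothetical, minimal glob support for this sketch only:
// "**" matches across path segments, "*" matches within one segment.
function globToRegExp(glob: string): RegExp {
  const source = glob
    .split("**")
    .map((part) =>
      part
        .replace(/[.+^${}()|[\]\\?]/g, "\\$&") // escape regex metacharacters
        .replace(/\*/g, "[^/]*"), // single "*" stops at "/"
    )
    .join(".*"); // "**" may span "/"
  return new RegExp(`^${source}$`);
}

interface ExclusionInput {
  routesPaths: string[];
  ignorePatterns: string[];
  // Stand-in for the per-route 'noindex' meta check.
  noIndexRoutes: Set<string>;
}

// Keeps only the routes that pass all three exclusion rules.
function selectSitemapRoutes(input: ExclusionInput): string[] {
  const ignoreMatchers = input.ignorePatterns.map((p) => globToRegExp(p));
  return input.routesPaths.filter(
    (route) =>
      !route.endsWith("404.html") &&
      !ignoreMatchers.some((matcher) => matcher.test(route)) &&
      !input.noIndexRoutes.has(route),
  );
}
```

For example, given the routes "/", "/docs/intro", "/404.html", "/tags/a" and "/private", an ignore pattern of "/tags/**", and "/private" flagged as noindex, only "/" and "/docs/intro" remain.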
Sitemap Construction: This process uses a popular npm package by the obvious name of sitemap. The included routes are formatted with appropriate trailing slashes and base URLs. The function constructs a sitemap using the SitemapStream class and writes the formatted routes into the sitemap.
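The formatting step can be illustrated without the sitemap package at all. The sketch below only shows how a route path might be combined with a base URL and a trailing-slash preference before it is handed to the stream. The helper names applyTrailingSlash and toSitemapUrl are hypothetical, and the three-state trailingSlash setting (true, false, or unset) is an assumption about how the preference is expressed.

```typescript
// true forces a trailing slash, false removes it, undefined leaves the path as-is.
function applyTrailingSlash(
  path: string,
  trailingSlash: boolean | undefined,
): string {
  // The root path is always left alone so it never becomes an empty string.
  if (trailingSlash === undefined || path === "/") {
    return path;
  }
  const withoutSlash = path.replace(/\/+$/, "");
  return trailingSlash ? `${withoutSlash}/` : withoutSlash;
}

// Joins the site URL and the formatted route without doubling the slash.
function toSitemapUrl(
  siteUrl: string,
  routePath: string,
  trailingSlash?: boolean,
): string {
  const base = siteUrl.replace(/\/+$/, "");
  return base + applyTrailingSlash(routePath, trailingSlash);
}
```

With a site URL of "https://example.com/" and a route of "/docs/intro", this gives "https://example.com/docs/intro/" when trailingSlash is true and "https://example.com/docs/intro" when it is false or unset.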
This sitemap generator function is typically integrated into Docusaurus build processes or plugins. It generates a structured sitemap for the entire documentation website, improving its discoverability by search engines.
Initialization: The function begins by initializing necessary variables and checking if the site's URL is provided in the configuration. If not, an error is thrown to ensure the URL is properly set.
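A guard of this kind might look like the following. It is only a sketch: SiteConfigLike is a cut-down, hypothetical stand-in for the real configuration type, requireSiteUrl is an invented name, and the error message is illustrative.

```typescript
// Hypothetical, reduced view of the site configuration: only the URL matters here.
interface SiteConfigLike {
  url?: string;
}

// Returns the site URL, or throws if it is missing or empty.
function requireSiteUrl(siteConfig: SiteConfigLike): string {
  const hostname = siteConfig.url;
  if (!hostname) {
    throw new Error(
      "The site URL is not set in the site configuration, so no sitemap can be generated.",
    );
  }
  return hostname;
}
```

Failing early here is preferable to emitting a sitemap full of relative paths, since sitemap entries are expected to be absolute URLs.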
Exclusion Checks: The function checks each route against the exclusion criteria. If a route matches any of them (404 page, ignore patterns, or 'noindex' meta tag), it is left out of the sitemap.
Sitemap Generation: Valid routes are formatted, including trailing slashes and base URLs, and added to the sitemap stream. The sitemap stream is then converted to a string and returned as the final output.
Understanding the inner workings of the Docusaurus sitemap generator gives valuable insight into how SEO-friendly sitemaps are created for static websites. Examining the logic and flow of the code helped me understand how to integrate this logic into my own SSG application, which I will talk about in my next post!