Lilou Artz

Posted on • Originally published at pillser.com

Auto Indexing of URLs via Google API

Pillser aggregates information about 18,000+ supplements from multiple sources. One of the challenges is ensuring that up-to-date information is surfaced to users when they search for supplements on Google. In this post, I will show you how I use Google APIs to get content indexed immediately after it is updated.

I am not sure why this was so hard to figure out (I could not find any code examples), but the Google APIs Node.js Client provides a way to interact with the Indexing API, which allows web developers to notify Google about state changes in the URLs they own.

Here is the code that I used to submit URLs using the Indexing API:

import { config } from '#app/config.server';
import { google } from 'googleapis';

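// Notifies Google that the page at `url` was added or updated.
const submitToGoogleSearchConsole = async (url: string) => {
  // Authenticate as the service account, requesting only the Indexing API scope.
  const auth = new google.auth.JWT(
    config.GOOGLE_CLOUD_CLIENT_EMAIL,
    // keyFile: not needed because the key is passed inline
    undefined,
    config.GOOGLE_CLOUD_PRIVATE_KEY,
    ['https://www.googleapis.com/auth/indexing'],
    // subject: no user impersonation
    undefined,
  );

  await auth.authorize();

  const indexing = google.indexing({
    auth,
    version: 'v3',
  });

  // URL_UPDATED is used for both new and changed pages.
  await indexing.urlNotifications.publish({
    requestBody: {
      type: 'URL_UPDATED',
      url,
    },
  });
};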

I call this function whenever I update a supplement product or when someone asks a public question about supplements.
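
The function above always sends a URL_UPDATED notification. The Indexing API also defines a URL_DELETED type for pages that have been removed. As a rough sketch, the request body for either case could be built by a small helper like the one below. The name buildUrlNotification and the URL check are illustrative assumptions, not code from Pillser.

```typescript
// Notification types defined by the Indexing API v3.
type NotificationType = 'URL_UPDATED' | 'URL_DELETED';

type UrlNotification = {
  type: NotificationType;
  url: string;
};

// Hypothetical helper (an assumption for illustration): validates the URL
// and builds the object passed as `requestBody` to urlNotifications.publish().
const buildUrlNotification = (
  rawUrl: string,
  type: NotificationType = 'URL_UPDATED',
): UrlNotification => {
  // new URL() throws a TypeError on relative or malformed input.
  const parsed = new URL(rawUrl);

  if (parsed.protocol !== 'https:' && parsed.protocol !== 'http:') {
    throw new Error(`Unsupported protocol: ${parsed.protocol}`);
  }

  return { type, url: parsed.toString() };
};
```

The returned object can be passed as requestBody to indexing.urlNotifications.publish, with 'URL_DELETED' as the second argument when a page is taken down.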

Acquiring Credentials

The code above authenticates with the Google API using a service account, so you will need to create one and obtain its credentials (the client email and the private key). Google's documentation covers the steps.

Once you have the service account, add its email address to Google Search Console as an owner of the site.
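
If the private key is stored in an environment variable, its line breaks often end up as literal \n sequences, which can cause the key to be rejected when the token is signed. The sketch below assumes the two values come from environment variables named after the config fields used earlier. readServiceAccountEnv is a hypothetical helper, and how config is actually populated is not shown in this post.

```typescript
type ServiceAccountCredentials = {
  clientEmail: string;
  privateKey: string;
};

// Hypothetical helper (an assumption for illustration): reads the service
// account email and private key from an environment-like object.
const readServiceAccountEnv = (
  env: Record<string, string | undefined>,
): ServiceAccountCredentials => {
  const clientEmail = env['GOOGLE_CLOUD_CLIENT_EMAIL'];
  const rawKey = env['GOOGLE_CLOUD_PRIVATE_KEY'];

  if (!clientEmail || !rawKey) {
    throw new Error(
      'Missing GOOGLE_CLOUD_CLIENT_EMAIL or GOOGLE_CLOUD_PRIVATE_KEY',
    );
  }

  return {
    clientEmail,
    // Turn literal backslash-n sequences back into real line breaks.
    privateKey: rawKey.replace(/\\n/g, '\n'),
  };
};
```

In a Node.js process this could be called as readServiceAccountEnv(process.env), and the result passed to the JWT constructor in place of the config values.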

What can be indexed?

The Google Indexing API documentation has a note about what can be indexed:

The Indexing API allows any site owner to directly notify Google when pages are added or removed. This allows Google to schedule pages for a fresh crawl, which can lead to higher quality user traffic. Currently, the Indexing API can only be used to crawl pages with either JobPosting or BroadcastEvent embedded in a VideoObject.

I only saw this note after I had already implemented the integration. Despite it, the content I submitted through the Indexing API was indexed by Google almost immediately, so in my experience the API appears to work for other types of content as well.

Bonus: IndexNow

I also call the IndexNow API. Here is how it is described:

IndexNow is an easy way for websites owners to instantly inform search engines about latest content changes on their website. In its simplest form, IndexNow is a simple ping so that search engines know that a URL and its content has been added, updated, or deleted, allowing search engines to quickly reflect this change in their search results.

Submitting your URLs to IndexNow informs search engines such as Microsoft Bing, Naver, Seznam.cz, Yandex, and Yep about the latest content changes on your website. However, it does not inform Google Search, so you still need to use the Indexing API to notify Google about the changes.

import { httpClient } from '#app/services/httpClient.server';

// generate a key at https://www.bing.com/indexnow/getstarted
// A ${key}.txt file is also placed in the public/ folder.
const key = '...';

export const submitUrlsToIndexNow = async (urls: string[]) => {
  await httpClient('https://api.indexnow.org/IndexNow', {
    headers: {
      'Content-Type': 'application/json',
    },
    json: {
      host: 'pillser.com',
      key,
      keyLocation: `https://pillser.com/${key}.txt`,
      urlList: urls,
    },
    method: 'POST',
  });
};
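
The IndexNow documentation states that a single POST can carry up to 10,000 URLs and that every URL must belong to the submitted host. A minimal sketch of preparing the request bodies under those constraints follows. buildIndexNowPayloads and its batchSize parameter are hypothetical, and the key file is assumed to live at the site root as in the code above.

```typescript
type IndexNowPayload = {
  host: string;
  key: string;
  keyLocation: string;
  urlList: string[];
};

// Per the IndexNow documentation, one POST may carry up to 10,000 URLs.
const INDEXNOW_MAX_URLS_PER_REQUEST = 10000;

// Hypothetical helper (an assumption for illustration): deduplicates URLs,
// drops those on other hosts, and splits the rest into request bodies.
const buildIndexNowPayloads = (
  host: string,
  key: string,
  urls: string[],
  batchSize: number = INDEXNOW_MAX_URLS_PER_REQUEST,
): IndexNowPayload[] => {
  if (batchSize < 1) {
    throw new Error('batchSize must be at least 1');
  }

  const eligible = Array.from(new Set(urls)).filter((candidate) => {
    try {
      return new URL(candidate).host === host;
    } catch {
      // Not an absolute URL, so it cannot be submitted.
      return false;
    }
  });

  const payloads: IndexNowPayload[] = [];

  for (let start = 0; start < eligible.length; start += batchSize) {
    payloads.push({
      host,
      key,
      keyLocation: `https://${host}/${key}.txt`,
      urlList: eligible.slice(start, start + batchSize),
    });
  }

  return payloads;
};
```

Each returned object could then be sent as the json body of the POST request shown above, one request per batch.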

Tracking Indexing Progress

I use Google Search Console to monitor the progress of indexing.
