Vasyl Zubach

Posted on Apr 13, 2021 • Originally published at zubach.com

Automate Open Graph image creation

#javascript #puppeteer

Originally posted on my personal website https://www.zubach.com/blog/automate-open-graph-image-creation

If you are a developer, you've probably seen the Open Graph images (part of the Open Graph protocol) generated by popular dev-related websites like DEV.to, or by Vercel's Open Graph Image as a Service. Both render an image based on the content: a standard layout background, an image related to the content (Vercel's logo or the author's avatar), the headline or title of the article, and a description.

Here's what the Open Graph image looks like for my "10 Phone skins in CSS" article on DEV.to:

Both are very nice approaches, but they require a little preparation for every website to make those images unique and their own. I wanted to generalise the idea into a quick solution, or a first step, for when you need to add og:images quickly and at almost no cost in time.

The problem

While I consider this approach ideal and the way it should be done, there are a few things that could be improved:

it requires additional design and development work to make the images look as needed
it could autogenerate OG images for any kind of page, not only blog posts

How about solving these problems with a more generic approach that would suit all the needs?

Generic solution

One of my wife's favourite sayings is "Start where you are, use what you have, do what you can" by Arthur Ashe. So let's start with what we already have for every page that needs an Open Graph image: an already designed and implemented web page that we can load.

As we already have the page, let's just create an API that returns a screenshot of it at a specific size. Naturally, the most important information should be on that first viewable screen. Puppeteer is the go-to tool for that kind of work.

Puppeteer is a Node library which provides a high-level API to control Chrome or Chromium over the DevTools Protocol. Puppeteer runs headless by default, but can be configured to run full (non-headless) Chrome or Chromium.

So, we want to create an API that will:

grab whatever URL we provide;
load that URL via Puppeteer and return an image for it.

In the example below I used the approach that works for Vercel, but you should be able to use it with any kind of Node.js backend, or deploy it as a microservice on Vercel and proxy to it from your backend.

For simplicity (and because of the size limit on Serverless Functions in Vercel), let's create a separate project/microservice that will take care of OG image generation. It'll be just a subdomain that mirrors our main project in terms of URLs, but returns images instead of HTML. So, if our website URL is https://example.com/<URL>, the Open Graph image URL will be https://og-image.example.com/<URL> (same URL, but on the og-image subdomain).
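To make the URL mapping concrete, here is a minimal sketch of a helper that turns a page URL into its image URL. The function name `toOgImageUrl` and the default `og-image` subdomain argument are assumptions for illustration, and it assumes the main site lives on a bare domain like `example.com` (no `www.` prefix).

```javascript
// Hypothetical helper: maps a page URL on the main site to the matching
// URL on the OG image subdomain. Path and query string stay untouched.
function toOgImageUrl(pageUrl, subdomain = 'og-image') {
  const url = new URL(pageUrl);
  // Prefix the host with the subdomain: example.com -> og-image.example.com
  url.hostname = `${subdomain}.${url.hostname}`;
  return url.toString();
}

console.log(toOgImageUrl('https://example.com/blog/post?x=1'));
```

Everything after the host is preserved, so the microservice receives exactly the path it needs to load from the main website.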

Step 1

According to Vercel's Serverless Functions guide, let's create api/index.js:

const puppeteer = require('puppeteer-core');
// a chrome we need for Serverless Function API to use by puppeteer
const chrome = require('chrome-aws-lambda');
const {
  NODE_ENV = 'production', // needed to be able to run local chromium to test how everything works locally
  WEBSITE // This is your main website URL
} = process.env;
// helper function just in case to give a page some time to render things after loading
const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms));
module.exports = async (req, res) => {
  const websiteURL = req.url; // FYI, it starts with `/`
  const fullUrl = `${WEBSITE}${websiteURL}`;
  const config =
    NODE_ENV === 'production'
      ? {
          args: chrome.args,
          executablePath: await chrome.executablePath,
          headless: chrome.headless
        }
      : {
          executablePath: '/opt/homebrew/bin/chromium' // based on `which chromium` command, I installed mine via homebrew
        };
  const browser = await puppeteer.launch(config);
  let file;
  try {
    const page = await browser.newPage();
    await page.setViewport({
      width: 1000,
      height: 800
    });
    await page.goto(fullUrl, { waitUntil: 'networkidle0' });
    // A bit of delay to make sure page is fully settled
    await delay(50);
    file = await page.screenshot({
      type: 'jpeg',
      quality: 81,
      fullPage: false
    });
  } finally {
    // Always close the browser, even if loading or the screenshot fails
    await browser.close();
  }
  res.statusCode = 200;
  res.setHeader('Cache-Control', 's-maxage=300, stale-while-revalidate');
  res.setHeader('Content-Type', `image/jpeg`);
  res.end(file);
};

As this is a Vercel API, we need to route all requests of our microservice to it by providing vercel.json configuration:

{
  "version": 2,
  "routes": [{ "src": "/.*", "dest": "/api/index.js" }]
}

And that is it. This API will load the page, wait for all requests to finish (by { waitUntil: 'networkidle0' } instruction to puppeteer) and will pass along the screenshot of the size and quality we need as a response.
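The handler builds the page address by concatenating the WEBSITE variable with the request path. A slightly more defensive version of that step could look like the sketch below. The helper name `buildTargetUrl` is hypothetical, and it assumes WEBSITE points at the site root (any path prefix in it is ignored).

```javascript
// Hypothetical helper: joins the main website origin with the incoming
// request path and makes sure the result still points at the same origin.
function buildTargetUrl(website, requestPath) {
  const base = new URL(website); // throws if WEBSITE is missing or malformed
  const path = requestPath.startsWith('/') ? requestPath : `/${requestPath}`;
  // Using base.origin avoids a double slash when WEBSITE ends with `/`
  const target = new URL(`${base.origin}${path}`);
  if (target.origin !== base.origin) {
    throw new Error(`Refusing to load a page outside ${base.origin}`);
  }
  return target.toString();
}

console.log(buildTargetUrl('https://example.com/', '/blog/post?x=1'));
```

Failing early on a bad WEBSITE value gives a clearer error than letting Puppeteer try to open something like `undefined/blog/post`.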

After this one step, you can already use your automatically generated Open Graph image as:

<meta
  property="og:image"
  content="https://og-image.example.com/whatever/url/you-wanna?pass=here"
/>

Step 2 (optional)

There's a little more we can do to improve the performance of this API. We know which services on our pages make network calls that aren't important to the outcome:

some analytics
tracking pixels
social buttons
comments service
anything that is potentially loading at the very bottom of the page
❗️and the most important one: the self-reference to the Open Graph image of the current page (so we don't end up in an infinite loop)

So, theoretically, we could block those requests and make the loading a bit faster, as all we need is the image, not the fully functional website. Let's add some code right before the line that opens the page, await page.goto(fullUrl, { waitUntil: 'networkidle0' });, to intercept requests and give puppeteer some guidelines.

// to be able to intercept the requests:
await page.setRequestInterception(true);
page.on('request', (req) => {
  // 1. Ignore requests for resource types the screenshot doesn't need
  // (media, websockets, manifests, etc.).
  const resourceType = req.resourceType();
  const whitelist = [
    'document',
    'script',
    'xhr',
    'fetch',
    'image',
    'stylesheet',
    'font'
  ];
  if (!whitelist.includes(resourceType)) {
    return req.abort();
  }
  // 2. Don't load your analytics lib of choice, so pageviews aren't 2x.
  const reqUrl = req.url();
  const blacklist = [
    'www.google-analytics.com',
    '/gtag/js',
    'ga.js',
    'analytics.js',
    'disqus.com',
    `og-image${websiteURL}` // self-reference I mentioned above
    // add more domains to ignore here
  ];
  // Plain substring match, so characters like `?` in a URL aren't treated as regex syntax
  if (blacklist.some((pattern) => reqUrl.includes(pattern))) {
    return req.abort();
  }
  // 3. Pass through all other requests.
  req.continue();
});
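If you'd like to check the filtering rules without launching a browser, the decision can be pulled out into a pure function. This is only a sketch: the name `shouldBlockRequest` and its parameters are assumptions, and the lists are the same ones used in the interception code.

```javascript
// Resource types the screenshot actually needs (same list as above)
const ALLOWED_RESOURCE_TYPES = new Set([
  'document', 'script', 'xhr', 'fetch', 'image', 'stylesheet', 'font'
]);

// Hypothetical helper: returns true when a request should be aborted.
// `blockedSubstrings` holds plain URL fragments, not regular expressions.
function shouldBlockRequest(resourceType, requestUrl, blockedSubstrings) {
  if (!ALLOWED_RESOURCE_TYPES.has(resourceType)) {
    return true;
  }
  return blockedSubstrings.some((fragment) => requestUrl.includes(fragment));
}

const blocked = ['www.google-analytics.com', 'disqus.com', 'og-image/blog/url'];
console.log(shouldBlockRequest('media', 'https://example.com/intro.mp4', blocked));
console.log(shouldBlockRequest('image', 'https://example.com/og-image/blog/url', blocked));
console.log(shouldBlockRequest('image', 'https://example.com/avatar.png', blocked));
```

Inside the request handler this would be called with req.resourceType() and req.url(), followed by req.abort() or req.continue() depending on the result.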

Step 3 (optional)

In order to use the same domain for OG images, I used a config in vercel.json to route internal routes like https://example.com/og-image/<any-url> to my og-image microservice:

{
  "version": 2,
  "routes": [
    { "handle": "filesystem" },
    // This one
    {
      "src": "/og-image/(?<path>.*)",
      "dest": "https://og-image.example.com/$path"
    }
    // ... other routes config goes here
  ]
}

The { "handle": "filesystem" } entry is deliberately at the top, to handle the case where OG images are provided as files right away. If that's not your case, feel free to move the og-image route config to the top.

Ways to improve/expand it

There are definitely ways to improve and expand it. A few that come to mind:

Combination of generic, and dedicated OG images

For blog posts, the look of the OG image that DEV.to has is great. So, we could keep this generic approach in place and also create a simple page that we take the screenshot of for blog posts. Let's say we have blog post URLs like https://example.com/blog/url. The generic URL for making a screenshot of it via the microservice would be https://example.com/og-image/blog/url, but we could create specific tiny pages for articles, like https://example.com/preview/blog/url, that output exactly what we need to see on the OG image, as a little webpage.

This approach could be used for anything really: blog posts, review pages, about pages, etc. The main idea is to gradually keep making those smaller preview pages under a similar URL location (like https://example.com/preview/<any-url-here>), and then add the /og-image/ prefix to those URLs so that our microservice generates the images from those previews.
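As a rough sketch of how the two kinds of URLs could coexist, the helper below picks the preview-based image path when a dedicated preview page exists and falls back to the generic one otherwise. The name `ogImagePathFor` and the `PREVIEW_PREFIXES` list are assumptions; in a real project the list would reflect whichever preview pages you have built so far.

```javascript
// Hypothetical list of URL prefixes that already have a dedicated
// preview page under /preview/...
const PREVIEW_PREFIXES = ['/blog/'];

// Returns the path of the OG image for a given page path.
function ogImagePathFor(pagePath) {
  const hasPreview = PREVIEW_PREFIXES.some((prefix) => pagePath.startsWith(prefix));
  return hasPreview
    ? `/og-image/preview${pagePath}` // screenshot of the tiny preview page
    : `/og-image${pagePath}`; // screenshot of the page itself
}

console.log(ogImagePathFor('/blog/url'));
console.log(ogImagePathFor('/about'));
```

Adding a new kind of preview page then only means adding its prefix to the list.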

Accept configuration

A great improvement would be to accept some configuration right from the URL via GET params instead of env variables. For example: ?_w=1000&_h=800&_q=81&_t=jpeg (_w for width, _h for height, _q for quality, _t for type). These could overlap with actual GET parameters in the URL, so I used the _ prefix to make them more unique, and "private" in the JavaScript sense.

The reason this would be a great improvement is that there could be multiple OG image <meta /> tags on the page, with different sizes for different purposes, as the social networks that use them may need different sizes.
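A minimal sketch of reading those params could look like this. The function name `parseOgConfig`, the size limits, and the list of allowed types are assumptions; the defaults are the values hardcoded in Step 1.

```javascript
const DEFAULTS = { width: 1000, height: 800, quality: 81, type: 'jpeg' };
const ALLOWED_TYPES = ['jpeg', 'png']; // assumed set of supported formats
const PRIVATE_PARAMS = ['_w', '_h', '_q', '_t'];

// Parses an integer param, falling back when it is missing or out of range
function toInt(value, fallback, min, max) {
  const n = Number.parseInt(value, 10);
  return Number.isInteger(n) && n >= min && n <= max ? n : fallback;
}

// Hypothetical helper: splits a request path like `/blog/url?_w=1200&page=2`
// into the screenshot config and the path to load on the main website.
function parseOgConfig(requestUrl) {
  // req.url is only a path, so a dummy base is needed to parse it
  const url = new URL(requestUrl, 'http://localhost');
  const params = url.searchParams;
  const requestedType = params.get('_t');
  const config = {
    width: toInt(params.get('_w'), DEFAULTS.width, 1, 4096),
    height: toInt(params.get('_h'), DEFAULTS.height, 1, 4096),
    quality: toInt(params.get('_q'), DEFAULTS.quality, 1, 100),
    type: ALLOWED_TYPES.includes(requestedType) ? requestedType : DEFAULTS.type
  };
  // Strip the private params so the page itself never sees them
  PRIVATE_PARAMS.forEach((key) => params.delete(key));
  return { config, path: `${url.pathname}${url.search}` };
}

console.log(parseOgConfig('/blog/url?_w=1200&_h=630&page=2'));
```

The returned path is what gets appended to the main website URL, while config feeds page.setViewport and page.screenshot. As far as I know, Puppeteer's quality option only applies to JPEG, so it would need to be left out when the type is png.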
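Rendering several tags with different sizes could then be sketched as follows. The helper names and the 1200x630 and 600x315 sizes are example assumptions, not requirements of any particular network; og:image:width and og:image:height are the structured properties the Open Graph protocol defines for an image.

```javascript
// Escapes a value for safe use inside a double-quoted HTML attribute
function escapeAttr(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/"/g, '&quot;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Hypothetical helper: renders one group of og:image tags per requested size,
// passing the size to the image service via the `_w` and `_h` params.
function ogImageTags(baseImageUrl, sizes) {
  return sizes
    .map(({ width, height }) => {
      const url = new URL(baseImageUrl);
      url.searchParams.set('_w', String(width));
      url.searchParams.set('_h', String(height));
      return [
        `<meta property="og:image" content="${escapeAttr(url.toString())}" />`,
        `<meta property="og:image:width" content="${width}" />`,
        `<meta property="og:image:height" content="${height}" />`
      ].join('\n');
    })
    .join('\n');
}

console.log(
  ogImageTags('https://example.com/og-image/blog/url', [
    { width: 1200, height: 630 },
    { width: 600, height: 315 }
  ])
);
```

Note the escaping of & in the content attribute, which is easy to miss once the image URL carries more than one query parameter.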

Demo

Here's what this blog post's Open Graph image looks like on my website:

The fact that this image loads means that the self-reference fix we did in "Step 2" works.
