Paul Scanlon

Posted on Dec 6, 2021 • Edited on Dec 9, 2021

Building The Gatsby Changelog Prototype

#actionshackathon21 #github #opensource

Hi 👋

In this post i'll be documenting my journey as i develop a prototype Gatsby Changelog site for the GitHub Actions Hackathon.

Repo: gatsby-inc/changelog
Gatsby Changelog: changelog.gatsbyjs.io
Twitter Account: @GatsbyChangelog
Submission Blog post: The Gatsby Changelog Prototype

Day 8 - Wednesday 8th December

Twitter Feed is in! This is a bonus feature and i'm almost out of time but... i thought it might be nice to demonstrate Gatsby's Serverless Functions. Serverless Functions are written and live alongside your front end code but they run on "the server". What this means for Jamstack devs is that you can write a function as you normally would and Gatsby will run it on the server. This is particularly helpful when communicating with APIs that require secret API keys and will only accept server-side requests. The Twitter API is one instance where using Serverless Functions makes perfect sense!

I'm using the Twitter V2 API for Node.js to request Tweets from the new @GatsbyChangelog Twitter account and display them on a new page on the site, which can be seen here: https://changelog.gatsbyjs.io/twitter-feed.

To use Serverless Functions all you need to do is export a default function from somewhere in the src/api dir. Here's the one i'm using in the Changelog. It's called get-tweets.js

//src/api/get-tweets

const Twitter = require('twitter-v2');

const twitter = new Twitter({
  consumer_key: process.env.TWITTER_CONSUMER_KEY,
  consumer_secret: process.env.TWITTER_CONSUMER_KEY_SECRET,
  access_token_key: process.env.TWITTER_ACCESS_TOKEN,
  access_token_secret: process.env.TWITTER_ACCESS_TOKEN_SECRET
});

export default async function handler(req, res) {
  const { id } = JSON.parse(req.body);

  try {
    if (!id) {
      // return early so a second response isn't sent further down
      return res.status(400).json({ message: 'id not found' });
    }

    const { data } = await twitter.get(`users/${id}/tweets`);

    res.status(200).json({
      message: 'A ok!',
      tweets: data
    });
  } catch (error) {
    res.status(500).json({ error: 'Error' });
  }
}

... and i make the HTTP request from my React page component inside a useEffect

//src/pages/twitter-feed.js

  const [isLoading, setIsLoading] = useState(true);
  const [tweets, setTweets] = useState([]);

  useEffect(() => {
    const getTweets = async () => {
      const response = await fetch('/api/get-tweets', {
        method: 'POST',
        body: JSON.stringify({ id: '1456240477783695360' })
      });

      const data = await response.json();

      setTweets(data.tweets);
      setIsLoading(false);
    };

    getTweets();
  }, []);

When the response comes back i set the returned data into React's state using setTweets and then iterate over the response in Jsx.
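One thing to watch for when setting the response into state: if the function replies with an error object there is no tweets array to iterate over. Here's a minimal sketch of a guard, assuming two hypothetical helpers (`toTweetList` and `toListItems`) that aren't part of the Changelog repo.

```javascript
// Sketch only: `toTweetList` and `toListItems` are hypothetical helpers.

// Fall back to an empty array when the function responds with an error
// object instead of a `tweets` array.
const toTweetList = (data) =>
  data && Array.isArray(data.tweets) ? data.tweets : [];

// Tweets from the v2 API include `id` and `text` by default, which is
// enough for a React `key` and something to display.
const toListItems = (tweets) =>
  tweets.map(({ id, text }) => ({ key: id, text }));

console.log(toListItems(toTweetList({ tweets: [{ id: '1', text: 'Hello' }] })));
console.log(toListItems(toTweetList({ error: 'Error' })));
```

Something like `setTweets(toTweetList(data))` would then keep the Jsx safe to map over even when the request fails.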

You can see the function src code here: get-tweets.js,
and the React component src code here: twitter-feed.js

So there you have it. SSG for data that doesn't change a lot but is kept fresh using a GitHub Action, SSR for data that changes frequently and is statically baked into a page and Serverless Functions for real time data requests.

I'm pretty sure that no matter what you're trying to build you can achieve it using Gatsby and Gatsby Cloud.

Day 7 - Tuesday 7th December

Events are in! I wasn't sure if i could pull this off in time but decided i'd give it a go and see what happens. I wanted to demonstrate where using SSR rather than SSG is a good idea and Repository Events seemed like a good fit.

Where SSR differs from SSG is that data is requested each and every time a user visits a page. The data is still statically baked into the page so SEO doesn't suffer as much as it would if you were to make a typical client side request (CSR). However, like with CSR, SSR will exhibit a slight delay before the page loads.

In my opinion compared to SSG, SSR is the second best option.

That said i feel SSR is the right method for dealing with data that frequently changes, like GitHub Repository Events.

By contrast SSG and the Cron Job GitHub Action is the best way to deal with data that updates less frequently.
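For reference, a Gatsby page opts into SSR by exporting a getServerData function, and whatever it returns under props reaches the page component as serverData. Here's a rough sketch of that shape. `createGetServerData` and `fetchEvents` are hypothetical names used so the data source can be swapped out, and this isn't necessarily how the Events page is written.

```javascript
// Sketch only: `createGetServerData` and `fetchEvents` are hypothetical names.
// Gatsby calls a page's exported `getServerData` on every request and passes
// the returned `props` to the page component as `serverData`.
const createGetServerData = (fetchEvents) => async () => {
  try {
    const events = await fetchEvents();
    return { props: { events } };
  } catch (error) {
    // A status code lets the page respond with an error instead of crashing.
    return { status: 500, props: { events: [] } };
  }
};

// In a page file this would be exported, for example:
// export const getServerData = createGetServerData(fetchRepoEvents);
```

Passing the fetcher in keeps the request logic separate, which makes it easy to try out with stubbed data.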

To request "events" i've opted for using the GitHub REST API. I had a play with the GraphQL API for this but as with SSR vs SSG, GraphQL compared to REST also has its advantages and disadvantages. The REST API does a lovely job of returning a whole lump of data which is perfect for the Events page. If i'd tried to do the same with the GraphQL API i would have had to create quite a specific query to pull all the different event types: Pull Requests, Issues, Watch Events etc.

Here's a snippet of the REST request i'm using.

  const { data } = await octokit.request(
    `GET /repos/{owner}/{repo}/events`,
    {
      owner: 'gatsbyjs',
      repo: 'gatsby'
    }
  );
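Because the REST response is one flat list with a type on every item, sorting out the different kinds of event can happen after the request. A small sketch of that idea follows. The sample objects are made up and only carry the type field, and `countEventsByType` is a hypothetical helper.

```javascript
// Sketch: count events per `type`. The sample objects are made up and only
// include the `type` field, which every GitHub event carries.
const countEventsByType = (events) =>
  events.reduce((counts, event) => {
    counts[event.type] = (counts[event.type] || 0) + 1;
    return counts;
  }, {});

const sample = [
  { type: 'PullRequestEvent' },
  { type: 'IssuesEvent' },
  { type: 'WatchEvent' },
  { type: 'PullRequestEvent' }
];

console.log(countEventsByType(sample));
// { PullRequestEvent: 2, IssuesEvent: 1, WatchEvent: 1 }
```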

I've also added an intro section to the site which explains a little more about the project and links to this blog post. The "content" for the intro section is sourced from a local markdown file and i've created a new node in the Gatsby Data layer called "intro".

I think i'm pretty much there with this project now... and think i'm ready to submit my entry for the Hackathon -- Wish me luck!

Day 6 - Monday 6th December

I've had an absolute belter of a day.

I have now implemented my GitHub Action. It's effectively a Cron Job which runs every 12 hours. 🥳

... and here it is

name: Build Site

on:
  schedule:
    - cron: '0 */12 * * *'

jobs:
  Trigger-Build:
    runs-on: ubuntu-latest
    steps:
      - run: echo "set env using GATSBY_CLOUD_BUILD_WEBOOK"
        env:
          GATSBY_WEBHOOK: ${{secrets.GATSBY_CLOUD_BUILD_WEBOOK}}
      - name: Post to Gatsby Cloud Webhook
        id: rebuild
        uses: fjogeleit/http-request-action@master
        with:
          url: ${{secrets.GATSBY_CLOUD_BUILD_WEBOOK}}
          method: 'POST'

When the schedule kicks in i call a Gatsby Cloud webhook URL which is stored as a GitHub Secret. This webhook URL notifies Gatsby Cloud to kick off a new build. When a new build is started the data i'm querying using the GitHub GraphQL API is returned and pumped into Gatsby's data layer. If no new data is found Gatsby Cloud is smart enough to know it can use a cached build rather than start a fresh one so builds that contain existing data are super speedy.
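A quick note on reading the schedule: the first field of a cron expression is the minute and the second is the hour, so a job meant to run twice a day wants a fixed minute rather than a wildcard. The sketch below counts how often an expression would fire in a day. It's a toy matcher that only understands `*`, `*/n` and plain numbers in the first two fields, which is an assumption made to keep it short.

```javascript
// Sketch: a tiny matcher for the minute and hour fields of a cron expression.
// It only understands `*`, `*/n` and plain numbers; real cron syntax
// supports much more.
const fieldMatches = (field, value) => {
  if (field === '*') return true;
  if (field.startsWith('*/')) return value % Number(field.slice(2)) === 0;
  return Number(field) === value;
};

// Count how many times per day a schedule would fire.
const runsPerDay = (expression) => {
  const [minute, hour] = expression.split(' ');
  let runs = 0;
  for (let h = 0; h < 24; h++) {
    for (let m = 0; m < 60; m++) {
      if (fieldMatches(hour, h) && fieldMatches(minute, m)) runs++;
    }
  }
  return runs;
};

console.log(runsPerDay('0 */12 * * *')); // 2
console.log(runsPerDay('* */12 * * *')); // 120
```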

Now the Cron Job is running this cycle will repeat from now... until the end of time!

I think this is a great way to ensure that static websites can remain "fresh" and so long as the data you're requesting doesn't change too often, running a Cron Job every 12 hours should be sufficient.

Markdown update!! After trying and failing to work out how to deal with this pesky frontmatter issue i dropped a note into the Gatsby Engineering Slack Channel and within seconds Kyle Mathews came to my aid and suggested i try gray-matter... and as luck would have it, it worked a treat!

To recap, the problem i was having was that the .md files i'm sourcing from GitHub contain frontmatter. You can see one of them here: v4.3.

I wanted to do two things:

remove the frontmatter from the main "body" of the markdown and create a new html object on the node
transform and isolate just the frontmatter and create a new frontmatter object on the node

Here are the two functions i'm using to achieve this

Transform frontmatter

//gatsby-node.js
const matter = require('gray-matter');

const transformFrontmatter = (markdown) => {
  const grayMatter = matter(markdown);

  return grayMatter.data;
};

Convert markdown to HTML

// gatsby-node.js

const remark = require('remark');
const remarkParse = require('remark-parse');
const remarkRehype = require('remark-rehype');
const rehypeStringify = require('rehype-stringify');
const rehypeAutoLinkHeadings = require('rehype-autolink-headings');
const rehypeSlug = require('rehype-slug');
const matter = require('gray-matter');

const convertToHTML = async (markdown) => {
  const grayMatter = matter(markdown);

  const response = await remark()
    .use(remarkParse)
    .use(remarkRehype)
    .use(rehypeSlug)
    .use(rehypeAutoLinkHeadings)
    .use(rehypeStringify)
    .process(grayMatter.content);

  return String(response);
};

I can then use both of these functions when i create the new node

    createNode({
      id: object.id,
      index: index,
      name: name,
      frontmatter: await transformFrontmatter(object.text),
      html: await convertToHTML(object.text),
      internal: {
        type: 'changelog',
        contentDigest: createContentDigest(data)
      }
    });

It's a bit tricky to explain without showing the output but rest assured it does what i needed.
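To give a rough idea of the shape involved, here's a small sketch that splits a release note into a frontmatter object and a body. It's a hand-rolled stand-in that only understands flat key: "value" pairs, not gray-matter itself, and `splitFrontmatter` plus the sample note are made up for illustration.

```javascript
// Sketch only: a hand-rolled stand-in for what gray-matter does, limited to
// flat `key: "value"` pairs. It is not gray-matter's implementation.
const splitFrontmatter = (markdown) => {
  const match = markdown.match(/^---\n([\s\S]*?)\n---\n?/);
  if (!match) return { data: {}, content: markdown };

  const data = {};
  match[1].split('\n').forEach((line) => {
    const separator = line.indexOf(':');
    if (separator === -1) return;
    const key = line.slice(0, separator).trim();
    // Strip the surrounding quotes from the value.
    data[key] = line.slice(separator + 1).trim().replace(/^"|"$/g, '');
  });

  // Everything after the closing `---` is the markdown body.
  return { data, content: markdown.slice(match[0].length) };
};

const example = [
  '---',
  'date: "2021-11-30"',
  'version: "4.3.0"',
  'title: "v4.3 Release Notes"',
  '---',
  '',
  '# Welcome'
].join('\n');

const { data, content } = splitFrontmatter(example);
console.log(data); // date, version and title as plain strings
console.log(content); // the body, with the frontmatter removed
```

The data half is what ends up on the node as frontmatter, and the content half is what gets converted to HTML.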

I've also spent some time today tidying up the CSS so the site looks a little nicer and refactored the jump links.

I actually needed to use two additional rehype plugins to achieve this: rehypeSlug and rehypeAutoLinkHeadings. Together they give each heading an id based on its text and link to it with an href, so that i can "jump" to any heading contained within the markdown file.

The onRouteUpdate now looks like this

// gatsby-browser.js

export const onRouteUpdate = ({ location }) => {
  const heading = document.getElementById(location.hash.split('#')[1]);
  const nav = document.querySelector(`header`);

  if (heading) {
    window.scrollTo({
      top: heading.offsetTop - nav.offsetHeight - 12
    });
  }
  return true;
};

Oh and I added a link to the @GatsbyChangelog Twitter account and did a little bit of work around SEO / Open Graph data.

... and that just about wraps things up for today.

Day 5 - Sunday 5th December

Had a better day today. I tried again to play with the frontmatter but still haven't quite worked out what's going on so shifted my focus to some CSS. I have a nice looking site now. The one issue i did encounter is adding the hash links to the sidebar: when any nav link in the sidebar is clicked i want the page to scroll to the heading for that release... but i also want each heading to work the same way. I also want to highlight the current nav link in the sidebar.

This is probably a post on its own but here's the main bit that allows a page to "jump" to a heading.

// gatsby-browser.js

export const onRouteUpdate = ({ location }) => {
  const jumplink = document.querySelectorAll(
    `a.jumplink[href="/${location.hash}"]`
  )[0];

  if (jumplink) {
    window.scrollTo({
      top: jumplink.offsetTop
    });
  }
  return true;
};

This "jumps" to the offsetTop of any HTML <a> that has the class name jumplink and, of course, the correct location.hash, e.g.


<a class="jumplink" href="/#v4.2">
   <span class="text-lg text-gray-300">#</span>v4.2
</a>

I've got a little more CSS to do today because i'll need a burger menu so that sidebar can open and close and i'll probably need to adjust some font sizes so the site is usable on "mobile"

Day 4 - Saturday 4th December

Not a great day today, i was partly walking around NYC trying to find a coffee shop so i could sit down and work out a problem i've been having with the frontmatter contained within each release note.

Each release note uses frontmatter which contains a date, version number and title...

---
date: "2021-11-30"
version: "4.3.0"
title: "v4.3 Release Notes"
---

The trouble is when i use remark-rehype each item from the frontmatter gets converted to either a <p> or <h2> and really i want to display this information the same way GitHub does, e.g. in an HTML table. I haven't solved this yet so i'll pop it in the backlog and come back to it later.

Day 3 - Friday 3rd December

Today was mainly having a think and trying to work out a few things. It had previously occurred to me that using gatsby-transformer-remark to transform remotely sourced markdown files might be "taking the long route"... and it turns out it is.

The reason is that gatsby-transformer-remark is great for transforming files or "nodes" that already exist on disk. But in my case i'm sourcing markdown from a remote source. In order for the plugin to do its thing i have to make GraphQL understand the markdown files are of type "File".

All of this is completely unnecessary if i transform the markdown nodes to html before i add them to Gatsby's data layer. To achieve this i'll use the same remark transformers the plugin uses. The only slight snag is that a lot of the remark node modules are ESM only and i need to use them in gatsby-node which at the time of writing only supports CommonJS.

It's kinda ok for now because i've rolled back the versions of remark i'm using which will work as cjs. You can see where the updates to ESM happened by inspecting the commits on each of the repos.
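Another route that might be worth a look is dynamic import(), which Node allows inside CommonJS files and which can load ESM-only packages. The sketch below shows only the mechanism. `loadEsm` is a hypothetical helper, 'node:path' is used so it runs without installing anything, and whether this behaves inside gatsby-node is untested here, so treat that part as an assumption.

```javascript
// Sketch: load a module from a CommonJS file with dynamic import().
// `loadEsm` is a hypothetical helper; 'node:path' stands in for an ESM-only
// package so the example runs without installing anything.
const loadEsm = async (specifier) => {
  const loaded = await import(specifier);
  return loaded;
};

loadEsm('node:path').then((path) => {
  // Named exports are available on the loaded module namespace.
  console.log(path.join('docs', 'release-notes'));
});
```

With an ESM-only package the call would sit inside an async function such as convertToHTML, since the import resolves to a promise.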

Here's the full gatsby-node.js file showing how i'm currently sourcing remote markdown files and adding them to Gatsby's data layer.

// gatsby-node.js

const { graphql } = require('@octokit/graphql');
const remark = require('remark');
const remarkParse = require('remark-parse');
const remarkRehype = require('remark-rehype');
const rehypeStringify = require('rehype-stringify');

const graphqlWithAuth = graphql.defaults({
  headers: {
    authorization: `token ${process.env.OCTOKIT_PERSONAL_ACCESS_TOKEN}`
  }
});

const CHANGELOG = 'changelog';

const convertToHTML = async (string) => {
  const response = await remark()
    .use(remarkParse)
    .use(remarkRehype)
    .use(rehypeStringify)
    .process(string);

  return String(response);
};

exports.sourceNodes = async ({
  actions: { createNode },
  createContentDigest
}) => {
  const {
    repository: {
      folder: { entries }
    }
  } = await graphqlWithAuth(`
  query {
    repository(name: "gatsby", owner: "gatsbyjs") {
      folder: object(expression: "master:docs/docs/reference/release-notes") {
        ... on Tree {
          entries {
            name
            repository {
              createdAt
            }
            object {
              ... on Tree {
                entries {
                  name
                  object {
                    ... on Blob {
                      id
                      text
                    }
                  }
                }
              }
            }
            object {
              ... on Blob {
                id
                text
              }
            }
          }
        }
      }
    }
  }
  `);

  const createMarkdownNode = async (data, name, repository, index) => {
    const { object } = data;

    createNode({
      id: object.id,
      index: index,
      name: name,
      date: repository.createdAt,
      html: await convertToHTML(object.text), // this was markdown and is now html
      internal: {
        mediaType: 'text/markdown',
        type: CHANGELOG,
        contentDigest: createContentDigest(data)
      }
    });
  };

  // wait for every node to be created before sourceNodes resolves
  await Promise.all(
    entries.map((item, index) => {
      const { object, name, repository } = item;
      if (Array.isArray(object.entries)) {
        const markdown = object.entries.find((entry) => entry.name === 'index.md');
        return createMarkdownNode(markdown, name, repository, index);
      }
      return createMarkdownNode(item, name, repository, index);
    })
  );
};

Day 2 - Thursday 2nd December

I had a few hours to spare today to investigate how to query GitHub using the GraphQL API. I had a small problem with some of the release-notes markdown files. Specifically ones nested in sub directories. Over time we've moved each .md into its own directory and in order to query both the .md files at the root of the release-notes dir and those in sub directories I needed to get a bit creative with GraphQL.

Here's the query i've settled on. I found it best to try out the query using GitHub's GraphiQL Explorer before implementing it into my project.

  query {
    repository(name: "gatsby", owner: "gatsbyjs") {
      folder: object(expression: "master:docs/docs/reference/release-notes/") {
        ... on Tree {
          entries {
            name
            repository {
              createdAt
            }
            object {
              ... on Tree {
                entries {
                  name
                  object {
                    ... on Blob {
                      id
                      text
                    }
                  }
                }
              }
            }
            object {
              ... on Blob {
                id
                text
              }
            }
          }
        }
      }
    }
  }

Now that I can query the data I can iterate over the response and using createNode add the markdown / HTML content to Gatsby's Data layer.
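Since an entry can be either a file or a directory containing an index.md, iterating over the response means handling both shapes. Here's a minimal sketch of that step. `toMarkdownEntry` is a hypothetical helper and the sample entries (names, ids and text) are made up to mirror the shape of the query.

```javascript
// Sketch: normalise one entry from the query above into { name, id, text }.
// `toMarkdownEntry` is a hypothetical helper and the sample data is made up.
const toMarkdownEntry = (entry) => {
  const { name, object } = entry;
  // A directory comes back as a Tree with its own entries; a file is a Blob.
  const blob = Array.isArray(object.entries)
    ? object.entries.find((child) => child.name === 'index.md')?.object
    : object;
  return blob ? { name, id: blob.id, text: blob.text } : null;
};

const sampleEntries = [
  // a markdown file at the root of the folder
  { name: 'v2.0.md', object: { id: 'a', text: '# v2.0' } },
  // a sub directory that holds an index.md
  {
    name: 'v4.3',
    object: {
      entries: [{ name: 'index.md', object: { id: 'b', text: '# v4.3' } }]
    }
  }
];

console.log(sampleEntries.map(toMarkdownEntry));
```

Each normalised entry then has everything createNode needs, whichever shape it started as.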

Adding the response data to Gatsby's data layer went ok, the difficulty was converting the sourced nodes which are of type markdown and transforming them to HTML using gatsby-transformer-remark. Usually if you have .md files on disk the plugin takes over and transforms the file nodes to HTML, however in my case the response which is of type markdown doesn't yet exist as a file node, and if it doesn't exist as a file node gatsby-transformer-remark can't do its job.

So quite a bit of work then went into creating the nodes on disk and then ensuring they are of the correct type using Gatsby's createSchemaCustomization.

Here's how I create the nodes

// gatsby-node.js

exports.onCreateNode = async ({
  node,
  actions: { createNode, createNodeField },
  createNodeId,
  createContentDigest
}) => {
  if (node.internal.type === CHANGELOG) {
    const markdownNode = {
      id: createNodeId(node.id),
      parent: node.id,
      internal: {
        mediaType: 'text/markdown',
        type: 'markdown',
        content: node.object.text,
        contentDigest: createContentDigest(node)
      }
    };

    createNode(markdownNode);

    if (markdownNode) {
      createNodeField({ node, name: 'text', value: markdownNode.id });
    }
  }
};

... and here's how i ensure they are of the correct type

// gatsby-node.js

exports.createSchemaCustomization = ({ actions: { createTypes } }) => {
  createTypes(`
    type changelog implements Node {
      text: markdown @link(from: "fields.text")
    }
  `);
};

This is all i need for now, and querying the data from the page component can now be done as i normally would, and here's what that looks like.

// index.js

import React from 'react';
import { graphql } from 'gatsby';

const Page = ({ data }) => {
  const {
    allChangelog: { nodes }
  } = data;

  return (
    <main className="container mx-auto max-w-5xl grid gap-16 p-8">
      {nodes.map((node, index) => {
        const {
          name,
          repository: { createdAt },
          text: {
            childMarkdownRemark: { html }
          }
        } = node;
        return (
          <div key={name}>
            <div className="text-brand-primary text-5xl font-black">{name}</div>
            <div
              className="prose lg:prose-xl"
              dangerouslySetInnerHTML={{ __html: html }}
            />
          </div>
        );
      })}
    </main>
  );
};

export const query = graphql`
  {
    allChangelog(sort: { fields: repository___createdAt, order: DESC }) {
      nodes {
        name
        repository {
          createdAt
        }
        text {
          childMarkdownRemark {
            html
          }
        }
      }
    }
  }
`;

export default Page;

The class names on the HTML elements are from Tailwind, and if you're looking to add TailwindCSS to your Gatsby site there's an excellent guide in the TailwindCSS Docs

Day 1 - Wednesday 1st December

I'm so late to the party but I saw this Tweet from Colby Fayock and thought... ah, ha. I wonder if I can cobble something together to demonstrate a Cron Job GitHub Action i've used on a number of projects for the GitHub Actions Hackathon.

I also wondered what kind of project I could build that would best demonstrate why a Cron Job is quite handy as a GitHub Action.

Then it struck me. We (at Gatsby) have launched a new Twitter Account. It's called GatsbyChangelog and we plan to use it to announce the changes to the Gatsby framework every couple of weeks. Along with the new Twitter Account i'll also be prototyping a Changelog site: https://changelog.gatsbyjs.io. We have plans to bring more content to support the descriptions contained within the Changelog, but we need a home for this content.

I plan to build a Gatsby site, hosted on Gatsby Cloud that will source markdown files from the Gatsby GitHub Account docs folder. The markdown files within the release-notes will be sourced and then rendered to a single page. I'll be using @octokit/graphql and a few of Gatsby's data handling methods: sourceNodes and createNode.

The Cron Job Action is kinda crucial for this project because the site will be statically rendered, but in order to ensure the data sourced from the release-notes is always "fresh" I'll use a Cron Job to call a webhook provided by Gatsby Cloud which will re-build the site every 12 hours. I could use Gatsby's newest page rendering method SSR to achieve the same thing but this approach is way more fun!
