Emilie Ma

Posted on Aug 12, 2020 • Edited on Mar 29, 2022 • Originally published at kewbish.github.io

Adding blog posts to your GitHub README with Python

This post is edited from my original blog post. Find it, and other posts by me, on kewbish.github.io/blog.

Introduction

I'm working on the web track of CS50 at the moment, and I'm having a lot of fun, actually. In the meantime, I thought I'd take some time to investigate GitHub's new profile feature, and take a dive into GitHub Actions.

Yes, we have READMEs now.

I'm a bit late to the game, but hey, if you want these posts as I write them, here's my blog.

For a couple of days, my Dev.to, Reddit, and dev Discord feeds were inundated with the shiny new GitHub profile README feature. All you need to know is that if you create a repo named after your GitHub username (e.g. kewbish/kewbish) and add a README, it'll show up on your profile, like so:

Hey, a cool README. ~~Yes, it's mine~~.

I'm not going to go through writing the copy / main text of the README much. After writing the first version, I started seeing lots of cool READMEs on Dev.to. Apparently, there's even an awesome list now. Why am I not surprised?

Anyhow, after reading through too many 'top 8 GitHub README' lists, I found SimonW's featured quite often, and I really liked the self-updating blog posts / TIL sections. So, I decided to implement a similar, albeit simpler, version on my own README.

RSS with Hugo

Skip over this bit if you're not using Hugo - I'm just going over some changes to Hugo's default RSS that you can definitely ignore.

Hugo comes with an RSS template built in, so I had an RSS feed before I even knew I had one. However, you can also customize it just like all the other default layouts. This is the default template Hugo ships with - here are the changes I made.

Changing the description (line 18):

<description>Recent content {{ if ne  .Title  .Site.Title }}{{ with .Title }}in {{.}} {{ end }}{{ end }}on {{ .Site.Title }}</description>

This is pretty self-explanatory; I just changed it to:

<description>Latest Yours, Kewbish posts</description>

Changing the date format (line 32):

<pubDate>{{ .Date.Format "Mon, 02 Jan 2006 15:04:05 -0700" | safeHTML }}</pubDate>

I prefer a cleaner date format (02 Jan 2006) instead of all this time info, so I changed this to:

<pubDate>{{ .Date.Format "02 Jan 2006" | safeHTML }}</pubDate>

Move from summary to description (line 35):

<description>{{ .Summary | html }}</description>

I wanted to use my descriptions instead of the first couple lines, so I used this:

<description>{{ .Description | html }}</description>

These are all just personal preference, but it makes the README bit a little more consistent with the actual blog.
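To see what a feed consumer receives after these tweaks, here is a small sketch that parses one hand-written item with Python's standard library. The sample XML (title, link, and text) is made up for illustration and is not copied from the real feed.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Hypothetical <item>, shaped like what the customized template would emit.
SAMPLE_ITEM = """
<item>
  <title>Example post</title>
  <link>https://example.com/blog/example-post/</link>
  <pubDate>12 Aug 2020</pubDate>
  <description>A hand-written description from the front matter.</description>
</item>
""".strip()

item = ET.fromstring(SAMPLE_ITEM)
title = item.findtext("title")
pub_date = item.findtext("pubDate")

# Go's reference layout "02 Jan 2006" corresponds to strftime's "%d %b %Y".
parsed = datetime.strptime(pub_date, "%d %b %Y")

print(title, "-", pub_date)
print(parsed.date().isoformat())
```

One thing to keep in mind: RSS 2.0 expects RFC 822 dates in pubDate, so stricter feed readers may not recognize the shortened form as a date. For this README it is only ever displayed as text, so that is fine here.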

Scripting with Python

The README update script is only 18 lines of Python, and uses the feedparser library to, well, parse the RSS feed.

Of course, let's start with installing and importing the library with pip install feedparser and:

from feedparser import parse

Next, we're going to get all our feed entries.

feed = parse("https://kewbish.github.io/blog/index.xml").entries
# Take up to three entries; slicing is safe even if the feed has fewer.
latest = [
    f"- [{entry.title}]({entry.link})  \n{entry.description} - {entry.published}"
    for entry in feed[:3]
]

feed contains all the entries of your RSS feed (you're going to want to change the URL to something other than my blog URL, obviously - try using your dev.to feed!). Then, we create a new list to store the first three entries, formatted as a two-line bullet point. The first line will have a link to the post and the title, and the second a description and publishing date. You can definitely play around with this: it's just plain Markdown, and this is simply how I decided to format my README.
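To try out the bullet format without hitting the network, something like the sketch below can help. It uses SimpleNamespace objects as stand-ins for feedparser entries, and it also tolerates entries that lack a description or date. The format_entry helper and the sample data are hypothetical, not part of the original script.

```python
from types import SimpleNamespace


def format_entry(entry):
    """Format one feed entry as a two-line Markdown bullet."""
    # getattr with a default avoids an AttributeError on missing fields.
    description = getattr(entry, "description", "")
    published = getattr(entry, "published", "")
    # Two trailing spaces before the newline force a Markdown line break.
    return f"- [{entry.title}]({entry.link})  \n{description} - {published}"


# Stand-ins for feedparser entries (made-up data).
entries = [
    SimpleNamespace(
        title="First post",
        link="https://example.com/first/",
        description="Hello there",
        published="12 Aug 2020",
    ),
    SimpleNamespace(
        title="Second post",
        link="https://example.com/second/",
        description="More words",
        published="13 Aug 2020",
    ),
]

# Only two entries exist here, and the slice simply returns both.
latest = [format_entry(e) for e in entries[:3]]
print("\n".join(latest))
```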

farr = []
with open("README.md", "r", encoding='utf8') as x:
    for line in x:
        if line.strip() == "<!--bp-->":
            break
        farr.append(line)

We then open the README file and read each line into an array until we hit this specific HTML comment. At this point, you might want to go back to your README and add the <!--bp--> comment at the end. (If you want it somewhere in the middle, you're going to have to modify the code by adding a new array and reading into that array after the comment is encountered, probably by setting a boolean value somewhere.)

with open("README.md", "w", encoding='utf8') as x:
    x.writelines(farr)
    x.write("<!--bp-->\n")
    for i in latest:
        x.write(i + "\n")

And finally, we open the README, this time in write mode, and write all the lines back. Then, we rewrite our comment line, and then our latest list, which will be the list of formatted blog posts. (Again, if you want your widget somewhere in the middle of your README, you're going to have to write the new array you created after the blog post lines.)
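For the 'somewhere in the middle' case, one possible approach is to bracket the blog section with two comments and keep whatever comes before and after. The sketch below assumes a closing <!--/bp--> marker, which the original script does not use, and the replace_section helper is likewise hypothetical.

```python
START = "<!--bp-->"
END = "<!--/bp-->"  # hypothetical closing marker, not in the original script


def replace_section(lines, new_items):
    """Return lines with everything between START and END replaced."""
    before, after = [], []
    seen_start = seen_end = False
    for line in lines:
        stripped = line.strip()
        if not seen_start:
            before.append(line)  # includes the START marker line itself
            if stripped == START:
                seen_start = True
        elif not seen_end:
            if stripped == END:
                seen_end = True
                after.append(line)  # keep the END marker
            # Lines between the markers are old posts and are dropped.
        else:
            after.append(line)
    if not (seen_start and seen_end):
        return list(lines)  # markers missing: leave the README untouched
    return before + [item + "\n" for item in new_items] + after


readme = ["# Hi\n", "<!--bp-->\n", "- old post\n", "<!--/bp-->\n", "Footer\n"]
updated = replace_section(readme, ["- new post"])
print("".join(updated))
```

In the real script, the lines would come from reading README.md and the new items from the latest list, and the result would be passed to writelines.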

The full script can be found on my GitHub.

You're also going to want to create a requirements.txt file with feedparser in it, so go ahead and do that.

Creating a GitHub Action

Note: SimonW's blog post was super helpful in figuring this out - much of my code was created after looking through theirs!

Now that we have our script and requirements, let's make our Action. There's a little Actions button on the main page of your repository, so click that and create a new workflow. Choose the 'by yourself' option, which will spit out a long YAML file. We're going to rewrite the file, so go ahead and delete it.

name: Add newest YK

on:
  workflow_dispatch:
  schedule:
    - cron: '0 */6 * * *'

First, we start with our Action name. Pretty self-explanatory: call it whatever you want. Next, we have our on triggers. These define when our Action will run. workflow_dispatch lets me trigger a run manually, and schedule uses familiar cron syntax. (In case you're wondering, this runs the Action every 6 hours. I highly recommend crontab.guru for figuring this out. GitHub does have a built-in tooltip though, so that can be helpful.)
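To make the cron expression concrete: */6 in the hour field matches every hour divisible by 6, and the leading 0 pins the minute. Here is a tiny sketch that lists the resulting run times. GitHub evaluates schedules in UTC, and scheduled runs can start a little late when the service is busy.

```python
# Hours matched by "*/6" in the hour field of '0 */6 * * *'.
hours = [h for h in range(24) if h % 6 == 0]

# The minute field is 0, so each run is scheduled on the hour (UTC).
run_times = [f"{h:02d}:00" for h in hours]
print(run_times)  # ['00:00', '06:00', '12:00', '18:00']
```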

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
    - name: Check out repo
      uses: actions/checkout@v2

Every Action also has a set of jobs that it runs. The build and runs-on lines are pretty standard, just defining your Action to be run on the latest version of Ubuntu. Then, we have a set of steps, which are individual tasks that run commands for us. Our first step will be checking out the repo. This is also pretty standard, as we just use one of GitHub's premade Actions.

    - name: Set up Python
      uses: actions/setup-python@v2
      with:
        python-version: 3.8
    - name: pip caches
      uses: actions/cache@v2
      with:
        path: ~/.cache/pip
        key: ${{ runner.os }}-pip-${{ hashFiles('**/requirements.txt') }}
        restore-keys: |
          ${{ runner.os }}-pip-

This part sets up Python, using another premade Action, and sets the default Python version. Next, we set up the pip cache so we won't have to download the dependencies each time. More information about this part can be found on the GitHub site.
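The cache key is what decides whether the cached packages are reused: it stays the same until requirements.txt changes. The sketch below imitates that idea in Python with a SHA-256 digest. It illustrates the concept only and is not GitHub's exact hashFiles algorithm; cache_key is a hypothetical helper.

```python
import hashlib


def cache_key(runner_os, requirements_text):
    """Build a key shaped like '<os>-pip-<hash of requirements>'."""
    digest = hashlib.sha256(requirements_text.encode("utf8")).hexdigest()
    return f"{runner_os}-pip-{digest}"


key_a = cache_key("Linux", "feedparser\n")
key_b = cache_key("Linux", "feedparser\n")
key_c = cache_key("Linux", "feedparser\nrequests\n")

print(key_a == key_b)  # True: same requirements, same key, cache hit
print(key_a == key_c)  # False: changed requirements, new key
```

When the exact key misses, the restore-keys prefix lets the Action fall back to the most recent older cache whose key starts with that prefix.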

    - name: Install Python dependencies
      run: |
        python -m pip install -r requirements.txt

Here, we run one command to install the requirements from the requirements.txt file - here, just feedparser.

    - name: Update README
      env:
        GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
      run: |-
        python get_post.py
        cat README.md

With this, we expose the GITHUB_TOKEN secret that GitHub Actions generates automatically, and then run the script we created earlier. Then, we cat README.md, which prints the updated file to the Action log so we can check the result. Now, in our Action, our README will have updated (or not - more on that with the next block).

    - name: Commit and push if changed
      run: |-
        git diff
        git config --global user.email "yourskewbot@notarealdomain.com"
        git config --global user.name "YoursKewbot"
        git add -A
        git commit -m "Update blog posts" || exit 0
        git push

We start with git diff, which prints whatever the script changed to the log. Then we set a configuration for our committer bot. Here, I've just set it to some random information - this is what'll end up in Git history and in GitHub's contribution bar at the top of your repo. Then, as we normally do when committing code, we add all the files, commit them, and push them back to the repository. If nothing changed, git commit fails and || exit 0 ends the step quietly without pushing. Otherwise, at this point, our README will have changed live.
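The same 'only act when something changed' idea can also live in the Python script, so the file is not rewritten when the feed has no new posts. This is a sketch of that idea rather than what the original script does, and write_if_changed is a hypothetical helper.

```python
import os
import tempfile


def write_if_changed(path, new_text):
    """Write new_text to path only if it differs; return True if written."""
    try:
        with open(path, "r", encoding="utf8") as f:
            if f.read() == new_text:
                return False
    except FileNotFoundError:
        pass  # no file yet, so it counts as a change
    with open(path, "w", encoding="utf8") as f:
        f.write(new_text)
    return True


# Demonstration in a temporary directory (paths are made up).
with tempfile.TemporaryDirectory() as tmp:
    readme = os.path.join(tmp, "README.md")
    first = write_if_changed(readme, "hello\n")
    second = write_if_changed(readme, "hello\n")
    third = write_if_changed(readme, "hello again\n")

print(first, second, third)  # True False True
```

The || exit 0 guard in the workflow is still needed either way, since git commit returns a non-zero status when there is nothing to commit.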

See the full Action on my GitHub.

Conclusion

Now, every 6 hours, our Action will run and update our profile README. Hopefully, this was a good introduction to GitHub Actions, and now, you have a shiny new updating README! This was a really fun learning experience for me as well - now, I can be part of the cool GitHub Actions-powered README squad!

What are some other creative RSS-based READMEs you've seen?