DEV Community

Alexander Demin

GitHub as a database

Git is indeed a database, and GitHub is a remote database powered by Git.

I needed a way to keep information about certain important events in my code nicely saved for later analysis. What could be better than committing them to a VCS? Every commit comes with a timestamp, a description, and so on.
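As a rough illustration of what such an event might look like before it is committed, the sketch below builds a file path and a JSON body for a single event. The `event_record` helper, the `events/` directory layout, and the field names are all assumptions made for this example, not something the approach prescribes. The resulting pair can be handed to an upsert-style function such as the `upsert_file` shown later in this post.

```python
import json
from datetime import datetime, timezone
from typing import Any, Dict, Optional, Tuple


def event_record(
    kind: str,
    payload: Dict[str, Any],
    when: Optional[datetime] = None,
) -> Tuple[str, str]:
    """Return a (path, body) pair describing one event (hypothetical layout)."""
    when = when or datetime.now(timezone.utc)
    # One file per event, grouped by day: events/<YYYY-MM-DD>/<HHMMSS>-<kind>.json
    path = f"events/{when:%Y-%m-%d}/{when:%H%M%S}-{kind}.json"
    body = json.dumps(
        {"kind": kind, "time": when.isoformat(), "payload": payload},
        indent=2,
        sort_keys=True,
    )
    return path, body


# Example event with a fixed timestamp so the result is reproducible.
path, body = event_record(
    "deploy",
    {"version": "1.2.3"},
    when=datetime(2024, 1, 31, 9, 30, 0, tzinfo=timezone.utc),
)
print(path)
print(body)
```

One file per event keeps every commit small and avoids update conflicts, at the cost of a growing number of files in the repository.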

I used a local Git repository first and then switched to GitHub. GitHub provides an API for all its functionality.

The short script below demonstrates how this approach works.

It needs two things to be set: the GITHUB_TOKEN environment variable, holding a token that can be generated in your GitHub account, and the repo variable, holding the repository name in the form <YOUR_GITHUB_NAME>/<REPO_NAME>.

It upserts a file that does not exist yet, which creates it. Then it upserts the same file again, which modifies it. Finally, it deletes the file.

The repository's commit history keeps a record of all these actions.

Note: The PyGithub package needs to be installed first:

pip install pygithub

Code:

import os
from typing import Optional, Union

import github
from github.Repository import Repository


def get_repo(repo: str) -> Repository:
    assert repo, 'repository name is missing'
    g = github.Github(os.environ['GITHUB_TOKEN'])
    return g.get_repo(repo)


def upsert_file(
    name: str,
    body: str,
    message: Optional[str] = None,
    *,
    repo: Optional[Union[Repository, str]] = None,
    branch: Optional[str] = "main",
    verbose: Optional[bool] = False,
):
    r = repo if isinstance(repo, Repository) else get_repo(repo)
    try:
        # Look up the existing file; a 404 means it has to be created.
        current = r.get_contents(name, ref=branch)
    except github.UnknownObjectException:
        # Only "not found" triggers a create; other API errors propagate.
        result = r.create_file(name, message or f'Create {name}', body, branch=branch)
    else:
        result = r.update_file(
            current.path,
            message or f'Update {name}',
            body,
            current.sha,
            branch=branch,
        )
    if verbose:
        print(result)


def delete_file(
    name: str,
    message: Optional[str] = None,
    *,
    repo: Optional[Union[Repository, str]] = None,
    branch: str = "main",
    verbose: Optional[bool] = False,
):
    r = repo if isinstance(repo, Repository) else get_repo(repo)
    message = message or f'Delete {name}'
    current = r.get_contents(name, ref=branch)
    deleted = r.delete_file(
        current.path,
        message,
        current.sha,
        branch=branch,
    )
    if verbose:
        print(deleted)


assert os.getenv('GITHUB_TOKEN'), 'Set GITHUB_TOKEN variable'

repo = "<YOUR_GITHUB_NAME>/<REPO_NAME>"

upsert_file("README.md", "NEW BODY", repo=repo, verbose=True)
upsert_file("README.md", "UPDATED BODY", repo=repo, verbose=True)

delete_file("README.md", repo=repo, verbose=True)

Save the code as main.py and run it:

python main.py

It prints something like:

{'content': ContentFile(path="README.md"), 'commit': Commit(sha="a6c540fec9b1b02e21acbb0ddd790efb6b7cb33f")}
{'commit': Commit(sha="2436e7ff2692a9af398dabd9eb9d1eee0f821954"), 'content': ContentFile(path="README.md")}
{'commit': Commit(sha="31fefb51e3510071777e4f4c8a0971de0a184f78"), 'content': NotSet}

Go to your GitHub repository and check the commits.
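The same history can also be read back through the API instead of the web page. The sketch below assumes PyGithub's `Repository.get_commits`, which accepts `sha` and `path` filters, and the `sha`, `commit.message`, and `commit.author.date` attributes of the commits it returns. `file_history` is a hypothetical helper name. To keep the example runnable without a token, it is exercised with stand-in objects rather than a live repository, and the dates in them are made up.

```python
from datetime import datetime
from types import SimpleNamespace
from typing import Any, List, Tuple


def file_history(repo: Any, path: str, branch: str = "main") -> List[Tuple[str, str, str]]:
    """Return (short sha, ISO date, first message line) for commits touching `path`."""
    history = []
    # With PyGithub, `repo` would be a Repository object and this call
    # lists the commits on `branch` that changed `path`.
    for c in repo.get_commits(sha=branch, path=path):
        first_line = c.commit.message.splitlines()[0] if c.commit.message else ""
        history.append((c.sha[:7], c.commit.author.date.isoformat(), first_line))
    return history


def _fake_commit(sha: str, message: str, date: datetime) -> SimpleNamespace:
    # Stand-in with the same attribute shape the helper reads from a commit.
    return SimpleNamespace(
        sha=sha,
        commit=SimpleNamespace(message=message, author=SimpleNamespace(date=date)),
    )


class _FakeRepo:
    """Stand-in for a repository; the dates are invented for illustration."""

    def get_commits(self, sha: str, path: str):
        return [
            _fake_commit("31fefb51e3510071777e4f4c8a0971de0a184f78",
                         "Delete README.md", datetime(2024, 1, 1, 12, 0, 2)),
            _fake_commit("2436e7ff2692a9af398dabd9eb9d1eee0f821954",
                         "Update README.md", datetime(2024, 1, 1, 12, 0, 1)),
            _fake_commit("a6c540fec9b1b02e21acbb0ddd790efb6b7cb33f",
                         "Create README.md", datetime(2024, 1, 1, 12, 0, 0)),
        ]


rows = file_history(_FakeRepo(), "README.md")
for short_sha, date, message in rows:
    print(short_sha, date, message)
```

With a real repository, the call would be `file_history(get_repo(repo), "README.md")`, using the `get_repo` function from the script above.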

Top comments (1)

Phil Ashby

Amusing (ab)use of GitHub 😁

I would probably look at a source/version control system as an event store rather than a database (where I would expect to see explicit relationships between data items/tables/documents), but you certainly get atomicity, transactional processing and recovery features!