The problem
Besides my coding activity for work, I have several personal projects active. Clearly, I work on them in my free time and si...
Instead of storing the branch notes as files in the working directory, which clutters up the project history with changes to the notes, why not store the notes in git? Using git hash-object, git mktree, git commit-tree, and git update-ref, you can manipulate git references without ever having to touch files in the working directory. With appropriate changes to .git/config, you can push custom refs to git{hub,lab,etc}.
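A minimal sketch of that plumbing-only approach, assuming a hypothetical refs/branch-notes/* namespace and a note.txt filename (both invented for illustration, not taken from the comment above):

```bash
# 1. Store the note's text as a blob in git's object database.
blob=$(printf 'WIP: refactor the parser before merging\n' | git hash-object -w --stdin)

# 2. Build a tree containing that blob as a single file.
tree=$(printf '100644 blob %s\tnote.txt\n' "$blob" | git mktree)

# 3. Wrap the tree in a commit (no parent needed for the first note).
commit=$(git commit-tree "$tree" -m "note for my-feature-branch")

# 4. Point a custom ref at the commit; the working directory is never touched.
git update-ref refs/branch-notes/my-feature-branch "$commit"

# Read the note back later:
git cat-file -p refs/branch-notes/my-feature-branch:note.txt
```

To share such refs you would need an explicit refspec, e.g. git push origin 'refs/branch-notes/*:refs/branch-notes/*', or an equivalent push entry in .git/config, since a plain push only covers the usual branch refs.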
I think that there are tradeoffs to that solution. When things go wrong, debugging arbitrary git references seems painful. Using files makes the solution more transparent. KISS
I don't think working with git references is particularly difficult (we're doing that anyway!), and I think it's a huge win over cluttering your repository history with metadata updates to your notes. I think I'm going to put together a proof of concept implementation.
This is a really good solution. I wish this could become part of git itself. Could you consider opening a feature request on GitHub for git and suggesting this solution? The maintainers of git may consider making it part of git itself.
I did not consider that, but I can try.
Another direction to take might be to recognize this as one of many opportunities for a new version control system that is entirely DLT. Commit messages, object descriptions/annotations are all the same thing: text attached to timelines or individual points in time. Even issues, discussions, PR messages (and more!) are all text objects that could live as first class members of the DAG instead of out-of-band in multitudes of formats haphazardly in multiple libraries. This explosion of complexity goes away if you replace git with a more generalized timeline "semantic version control" system. It is after all 20 years old and ripe for opportunities.
That's an interesting comment and an interesting idea. But I am wondering, if we can build such a tool, won't it be heavy?
No heavier than any existing DLT approach. The differentiating factor is that while being a crypto protocol, it is not built from the bitcoin POV. The goal is to provide timeline dynamics to any semantic data, and the initial complexity is outweighed by all of the very interesting properties that arise later on: replication (similar to fast-forward-only git merges), streamlined crypto guarantees on data integrity, auditability, and, similar to git, it enables sovereign boundaries between closed and open "source", just like you can interop between closed and open repos.
Actually that's probably the most important consequence of a non-bitcoin-derived DLT: the interop novelty. Instead of having a data silo exposed in an "off chain" shape via an API, you have a time-based DSL (like git) that provides clone, merge, fork, etc. operations. This means you can clone a src graph (like current git), but you can also get versioned, "on chain" projections that include other aspects: comments/text metadata like discussed in your article, but also "commit messages" (an arbitrary silo to be sure), issues, discussions, and more, as well as swagger-like schema metadata. This metadata plus data all in the same hypergraph is what enables the amazing properties!
Why not just create a pull request and add a description there?
My question exactly...
I think pull requests are a good option, as some have pointed out, but a "pull request" is not part of git and is a form of vendor lock-in. I think there's value in being git-centric and not tied down to a vendor, as well as being portable.
Someone else already mentioned git references, which I would highly recommend looking into.
On some previous projects we had tooling that would run hooks to execute the tests and check whether coverage had dropped before allowing you to push. It stored the metadata in references. There were, of course, overrides, but it helped save some cloud money on smaller projects that didn't have an intensive test suite. Also, for some test suites you can run only the tests for the given changes, based on the AST/related/touched code, to help keep long local runs from slowing down developer productivity; you have to find the right trade-off/balance, as usual.
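A minimal sketch of that kind of gate, assuming a hypothetical refs/metadata/coverage ref and a scripts/coverage.sh script that prints an integer percentage (both invented for illustration, not the commenter's actual tooling):

```bash
#!/bin/sh
# Hypothetical .git/hooks/pre-push: refuse the push if coverage drops below the
# last recorded figure, which lives as a blob behind a custom ref rather than
# as a tracked file.

previous=$(git cat-file -p refs/metadata/coverage 2>/dev/null || echo 0)
current=$(./scripts/coverage.sh)   # assumed to print an integer percentage, e.g. "87"

if [ "$current" -lt "$previous" ]; then
    echo "Coverage dropped from ${previous}% to ${current}%; push blocked." >&2
    exit 1
fi

# Record the new figure for next time, without touching the working tree.
git update-ref refs/metadata/coverage "$(printf '%s\n' "$current" | git hash-object -w --stdin)"
```

The override the commenter mentions could be as simple as pushing with --no-verify, which skips the pre-push hook.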
Two things: 1) Bravo for realizing the difference between git and all the lock-in on top. 2) Thanks for liking my post. I actually already have the protocol fleshed out. If I can't get any traction though, I'm gonna give it up...and this is after over a decade of work put into it! If you know anyone else who is aware of the internal structure of git and other DLT approaches and wants to produce the thing that can eat git (and git ops), be sure to check out ibgib dot link or dot com. Pathetic I have to spam people in comments like this, but it's a very small niche that can even grok these things, let alone anyone with any vision.
This is the tooling I was mentioning. Drew, who has been pioneering this, is quite passionate. I wonder if there could be some overlap.
git-ps.sh/
Wow, thank you for getting back on this. I read through Drew's documentation and I see what you mean about his passion. It appears he has committed a huge investment to the git protocol itself though, with his primary focus on the methodology and, ultimately, incremental and extremely pragmatic results. I'm talking about a protocol that subsumes the git protocol itself entirely, kind of like general relativity subsuming the "simpler" case of special relativity. Drew seems to have done a great job focusing on enabling users to use current tooling, whereas ibgib would obviate the need for much of that tooling, providing a completely different approach.
For example, this entire blog post is about a kluge for adding comments to git. Most existing git tooling has to either also have their own kluge or avoid the issue. Much of the code review workflows are centered around ameliorating this inefficiency.
But what if you could take any "commit" and any comment on any commit and any issue relating to any commit, and even any comment within any issue(!)...what if you could take any of those data and metadata messages and handle them, and derivative (downstream) data and metadata, in a uniform way AND have them live in the cryptographically verifiable "object database" (the actual git graph, which now only contains primary src data, i.e. the "commits" and zero issues or peer reviews etc.)? This cuts out HUGE technical debt and would enable a HUGELY streamlined DevSecOps experience, in addition to enabling brand new paradigms for human-human-AI UX.
But it would take an initial investment, and while I am a very good coder, it would take more eyeballs than my two 👀 to get it beyond where I've brought it to today solo. You can check out an extremely slow and ugly old web app MVP at ibgib.app or the npm @ibgib/ibgib package to see where I'm at with the version control aspect. (Though just to be clear, it is a protocol, not just an app...that MVP is just to give you a taste of some of the dynamics unrelated to version control.)
We had some internal tooling for our own review paradigm we tried out. I'm not sure that it was ever released to the public. It stored metadata the same way for each commit, and there was a UI to manage it all. Our review workflow would be to mark a commit as reviewed, which would record our handle/email, iirc. The UI leveraged a web pane to GitHub for leaving comments at the time, although the big-picture idea was to have it be platform agnostic and store it all in git.
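For comparison, plain git can already carry that kind of per-commit review marker out-of-band via git notes; a small illustrative sketch (the commit hash and reviewer address are placeholders, and this is not the tooling described above):

```bash
# Attach a review marker to a commit (3f2a91c is a placeholder hash) without rewriting history.
git notes --ref=reviews add -m "Reviewed-by: alice@example.com" 3f2a91c

# Show the markers alongside the log.
git log --notes=reviews

# Notes refs are not pushed by default, so share them explicitly.
git push origin refs/notes/reviews
```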
I did quickly see that you want to lean into branches, while he touts a non-branch approach. I'm not sure if the "patch ranges" have been fully implemented now, but those were always more analogous to branches in my mind. I've wanted to imagine them blended together, but the order of commits is of course very important, and a linear commit history can also help save you from some nightmares. Even with ideologies and tooling there's still plenty to yearn for from it all. Then at other times it's easy to just throw in the cap and go with the flow of current tooling. I do hope some changes eventually become more mainstream.
Edit: as you stated, one can either shove their way into the current mess by adding more mess to the madness or create their own. For us it was to utilize current tooling with a superset on top that didn't require full team buy-in.
Yes, I had the exact same thought on "patches" vs branches. They are really the same thing; really, the difference comes down to an implementation detail. My point is that his approach - and indeed, from what I hear, your own tooling's approach - are both very practical attempts at creating a workflow on top of git. Believe me, you and Drew are not alone, because so many are trying their own patchy implementations because GitHub is so insanely popular. But git's design is a local maximum, and a dangerous one, because even people who are extremely smart don't question its architecture. I highly recommend you check out Hanno Embregts' talk "A Bird's-Eye View of Version Control with Pijul". I would link directly but I don't know the policy here and I'm tired of not getting past admins on various platforms because of embedded links. You can find a direct link at ibgib.link on the first item (the exact point in time is 390s into the video). Anyway, he was a teacher of git until one of his students said "How come there's no new cell phones since 2005?". Yes, git is 20 years old and no, insane as it is, there are no real competitors entering the space. There aren't any competitors THINKING about the space. It is a HUGE opportunity.
I am continually thinking of new ways my ibgib protocol applies to the trend of the moment. I predicted NFTs back before NFTs were a thing (have an LLM summarize my "Ibgib - a different approach to code and data" thread on the Elixir forum; a previous implementation was in Elixir, the current one is in TypeScript). I've been mulling various aspects of "micro-version control" and I realized that today's attempts at AI + IDE are all missing out because of git's technical debt. Once you streamline the timeline architecture into a more generalized system, AI + IDE becomes a natural extension. You don't fork at the "repository" level. You fork at the file level at worst. You should be able to semantically chunk code nowadays using language services, to where AI would be creating "feature branch"-like micro branches where you and the AI (and other humans + AIs) are able to create and drop branches with such ease that you can do it at the function level or even possibly the sub-function level. But that is more speculative. For now, the near-term approach is to subsume git's current behavior, and working alone, it's still taking me a long time. And like I said, no one even thinks about the technical debt aspect, but anyone really interested in the enormous AI future should absolutely be investing in this type of approach.
One last thing (I type too much because so few listen!): since we are able to micro-version control things, we are able to deep-link the content addressing of the created artifacts. You can basically assign a "git commit hash"-like address to any identity (like a wallet in standard crypto) and any created code (and derivative data, which is mind-blowing, but that's beside the point). Because of this, you are actually enabling a leaner bounty-like system, so bootstrapping the protocol makes it more efficient to continue to bootstrap the protocol. This would solve open-source remuneration, so we wouldn't need to spend billions of dollars like current VC investments. IOW, with targeted investment, we could bootstrap this on next-to-no funding compared to similar approaches (currently only IPFS + Ceramic or Tim Berners-Lee's Solid Pods approaches are similar architecturally).
Any reason why you don't just create a draft PR for the branch?
Creating a PR/MR relies on using GitHub, GitLab, Gitea, or a similar system. The above solution works with plain git, if I understand it correctly. However, the notes probably become part of the git history, which adds some noise not related to the source code.
Well, in my opinion, this is just wrong.
If you work on a task and you need to switch, commit with a WIP message.
If you go back to your card and don't immediately understand why you did something, that's fine. But if it's still not completely clear after 20-30 minutes, I think something is wrong with your codebase (consider refactoring, etc.), or your task is not clearly defined, is ambiguous, or looks more like an epic. I've been working as a software developer for 9 years on different projects, but I have never faced this issue.
The repository lacks license information in the README.md file, and does not include a LICENSE or COPYING file with the text of the license. Could you please add it? Thanks in advance.
Consider using TODO lists. They are easy to maintain, and all IDEs have tools to work with them.
Rule of thumb:
Plan your work with TODOs.
Keep your TODOs only in branches. The main branch must not have TODOs.