A journey of self-discovery... and also how I learned how to use Git

Germán Rodríguez Ortiz

For my very first post I wanted to tell a story about how I learned to use and love Git, but I'd like to start much earlier on, when I was just learning about VCS tools.

While studying, I was exposed to SVN as my default VCS. A couple of friends and I struggled to understand its mysteries while developing the prototype of some forgotten web application.

Our young and unworthy minds could barely grasp the very concept of trunks, branches, commits, and everything else in between. We finally settled on a single trunk and issue-related branches to separate our work.

Actual picture of me and my friends preparing to merge some feature into the master trunk - circa 2008

But our poor understanding led us to make several mistakes when merging our work, with problems ranging from methods getting scrambled to complete classes being overwritten. Many "accidents" later we gave up on the idea of a VCS: we were capable neither of handling its components nor of understanding how the tool fit into our workflow. So, that early on, I gave up on using such tools entirely.

So I kept to my studies and, from time to time, to developing, somehow, without any form of VCS, until I landed my first serious job (and, as of the date of this article, my current one). It was December 2013, and I was to join a small team of developers that had been using Git for just over a year. I was blown away when they explained to me what it was, how they used it, and how simple it felt compared to my previous tragic experience with SVN.

The inside of my head as I realized for how long I had lived a sinful life without using a VCS tool

Back then the development workflow was centered around two main Git branches: master, which hosted all of the verified and stable code deployed to our production environment, and test, a branch that contained all the code that had to be reviewed by our QA team and was deployed to a special "testing" environment. Also, each developer would create a feature/bugfix branch to solve a particular issue; these branches were deployed on each developer's local instance of the system. Once developers finished their work, they would immediately merge it into the test branch.

At certain times during the day, our head developer would deploy the contents of the test branch into the testing environment. Once the testing was done and all of the content of the test branch was considered as "approved", then our head developer would merge the test branch into our master branch and proceed to deploy all of its contents into the production environment.

Our fearless head developer waiting to merge the test branch into the master branch

Please consider that at that time I thought that was the most awesome thing I had ever seen ...

This workflow had many problems, among them:

  • No feature or bugfix branch was updated with the changes added to the master branch in between deploys. This caused massive conflicts for older features, which sometimes meant redoing the whole thing.
  • All merging was done with fast-forward (at that time we had no idea what that meant), making the logs barely readable when tracking groups of changes made to the system.
  • To accomplish the production deploy, our head developer would have all of us stop merging our features and fixes into the test branch until he was done. Occasionally a developer would forget this and merge something into it, causing what in my country's slang we would call a "midfield goal" (mandatory soccer reference from a Chilean ... check), effectively deploying code straight into production without anyone from the QA team giving it a simple AOK.
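Since fast-forward is at the heart of the second problem, here is a minimal sketch of what it does, written as a toy commit graph in Python. It is not how Git is implemented and it ignores conflicts entirely; ToyRepo, the commit ids, and the branch names are all made up for illustration. The only point is that a fast-forward merge just moves the branch pointer, while a non-fast-forward merge records an extra commit with two parents.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ToyRepo:
    """Toy commit graph: commit id -> parent ids, branch name -> tip commit."""
    parents: Dict[str, Tuple[str, ...]] = field(default_factory=dict)
    branches: Dict[str, str] = field(default_factory=dict)

    def commit(self, branch: str, commit_id: str) -> None:
        # A new commit points at the current tip of the branch (if any).
        tip = self.branches.get(branch)
        self.parents[commit_id] = (tip,) if tip else ()
        self.branches[branch] = commit_id

    def create_branch(self, name: str, start: str) -> None:
        self.branches[name] = self.branches[start]

    def is_ancestor(self, old: str, new: str) -> bool:
        # Walk back from `new` through every parent looking for `old`.
        stack, seen = [new], set()
        while stack:
            current = stack.pop()
            if current == old:
                return True
            if current not in seen:
                seen.add(current)
                stack.extend(self.parents[current])
        return False

    def merge(self, target: str, source: str, no_ff: bool = False) -> str:
        target_tip, source_tip = self.branches[target], self.branches[source]
        if self.is_ancestor(source_tip, target_tip):
            return "already up to date"
        if not no_ff and self.is_ancestor(target_tip, source_tip):
            # Fast-forward: no new commit, the target pointer simply moves.
            self.branches[target] = source_tip
            return "fast-forward"
        # Otherwise record a merge commit with two parents.
        merge_id = f"merge-{source}"
        self.parents[merge_id] = (target_tip, source_tip)
        self.branches[target] = merge_id
        return "merge commit"

    def first_parent_log(self, branch: str) -> List[str]:
        # Follow only the first parent, similar in spirit to
        # `git log --first-parent`.
        log, current = [], self.branches[branch]
        while current is not None:
            log.append(current)
            parent_ids = self.parents[current]
            current = parent_ids[0] if parent_ids else None
        return log


def sample_repo() -> ToyRepo:
    repo = ToyRepo()
    repo.commit("master", "m1")
    repo.create_branch("feature", "master")
    repo.commit("feature", "f1-wip")
    repo.commit("feature", "f2")
    return repo


ff_repo = sample_repo()
ff_result = ff_repo.merge("master", "feature")
nf_repo = sample_repo()
nf_result = nf_repo.merge("master", "feature", no_ff=True)

print(ff_result, ff_repo.first_parent_log("master"))
print(nf_result, nf_repo.first_parent_log("master"))
```

In the fast-forward case every feature commit, "WIP" ones included, ends up directly on the master line, so nothing marks where the feature started or ended. In the non-fast-forward case the master line shows a single merge entry per branch.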

As a graph, it looked something like this:

On this graph new commits are closer to the right

Somehow we managed to grind through with this workflow up to mid-2014, when we finally hit critical mass. The poorly solved conflicts and the many "midfield goals" put us in a really bad place: we were missing deadlines and couldn't fully guarantee stable-ish code on our production environment. We needed a change and we needed it fast.

Enter Gitflow, descending upon us from the hands of our fearless head developer, a "new-to-us" way to organize our work and reclaim control of the mess we had on the master branch. We were to add the development branch to separate the lifecycles of our feature and bugfix branches and use the release branches to gradually add features to the system in a controlled and, hopefully, planned manner. Tags were to be used not only as flags to indicate the given status of the application in the git history but also as specific points to perform deployments to the production environment.

How we saw our head developer, as he told us about GitFlow

We had the added challenge that our test branch had to stay, because it was still the only way we had to deploy code so it could be tested by our QA team. So in practice we ended up with two Gitflows, one for the test branch and one for the master branch. Bugfixes and releases would get merged into either branch depending on their current status.

As a graph this new workflow looked like this:

On this graph new commits are closer to the bottom

We were also told that we would no longer have to wait to deploy merged code to the test branch for bugfixes. Once a developer finished, in addition to immediately merging the branch into the test branch, they would also deploy that code into the test environment.

And for a while this was ok.

You see, we thought that just changing the branching strategy would solve all our problems, when we should have also reviewed our handling of the branches, specifically:

  • We still weren't properly updating old branches.
  • We still weren't sure what this "fast-forward" strategy was.
  • Conflicts were resolved without paying much attention to what was really happening. At that time we weren't using diff tools to resolve merges, and a popular strategy in the office was to just accept the changes currently being merged into the target branch as the "correct" ones, instead of reading and manually accepting either set of changes.
  • The test branch started to "clutter" with all the issue-related branches we were constantly merging, causing many unnecessary conflicts and false positives on issues that weren't really solved.
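To make the third point concrete, here is a rough sketch of why blindly accepting the incoming side loses work. It is a toy three-way merge over whole method bodies, not Git's real line-based algorithm, and the method names and bodies are invented for the example. The prefer="theirs" option stands in for "just accept whatever is being merged in".

```python
from typing import Dict, List, Optional, Tuple

Methods = Dict[str, str]  # method name -> method body


def three_way_merge(base: Methods, ours: Methods, theirs: Methods,
                    prefer: Optional[str] = None) -> Tuple[Methods, List[str]]:
    """Merge two versions of a class against their common ancestor."""
    merged: Methods = {}
    conflicts: List[str] = []
    for name in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(name), ours.get(name), theirs.get(name)
        if o == t:
            value = o            # both sides agree
        elif o == b:
            value = t            # only the incoming side changed it
        elif t == b:
            value = o            # only the target side changed it
        else:
            # Both sides changed the same method in different ways.
            conflicts.append(name)
            if prefer == "theirs":
                value = t
            elif prefer == "ours":
                value = o
            else:
                value = f"<<<<<<< {o} ======= {t} >>>>>>>"
        if value is not None:
            merged[name] = value
    return merged, conflicts


base = {"greet": "return 'hi'", "total": "return sum(items)"}
# Target branch: someone already fixed total() here.
target_branch = {"greet": "return 'hi'",
                 "total": "return sum(i.price for i in items)"}
# Incoming issue branch: changed both methods, starting from the old base.
issue_branch = {"greet": "return 'hello'",
                "total": "return sum(items) * TAX"}

merged, conflicts = three_way_merge(base, target_branch, issue_branch,
                                    prefer="theirs")
print(conflicts)
print(merged["total"])
```

The change to greet merges cleanly, but for total the fix that was already on the target branch silently disappears, which is the kind of "accident" that reading both sides of the conflict would have caught.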

17 releases ... that was all it took us to crash and burn again. Almost a year and a half had passed since we adopted this modified Gitflow, and that last release took at least a week and a half to fully solve every single conflict it had.

average production deploy during our GitFlow period

We had to fix this mess and we weren't sure what to do. We started by dropping GitFlow altogether, going back to our "2 main branch workflow", and slowly trying to identify what went wrong. From there we started addressing each problem detected, one at a time:

  • We learned about the different strategies to keep our development up to date.
  • We established a proper channel to discuss the resolution of conflicts.
  • We agreed to use non-fast-forward merges so our changes could be easily grouped and read.

With that in mind, we designed a workflow model, based on both our past experiences and things that we found from articles online, that could be adapted to our learning process and to the reality of our development process lifecycle.

We ended up settling on a workflow that required:

  • That each branch that hasn't been merged into master be updated relative to the last pushed tag.
  • That branches be updated through rebasing.
  • That any merge into the master or test branches be done in "non-fast-forward" format, leaving an additional commit that properly identifies when an issue-related branch was merged.
  • That the test branch be as close as possible in content to the latest pushed tag on master.
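As a rough illustration of the first three rules, the sketch below only builds the Git command lines for one issue branch and prints them; it doesn't run anything. The helper name, the branch name issue-123, and the tag v1.4.0 are placeholders, and the exact sequence our team typed may have differed. The git rebase and git merge --no-ff commands themselves are standard Git.

```python
import shlex
from typing import List


def update_and_merge_commands(issue_branch: str, last_tag: str,
                              target: str = "test") -> List[List[str]]:
    """Build (but do not run) the Git commands for one issue branch."""
    return [
        # Replay the issue branch on top of the last pushed tag.
        ["git", "checkout", issue_branch],
        ["git", "rebase", last_tag],
        # Merge with an explicit merge commit, even if a fast-forward
        # would have been possible.
        ["git", "checkout", target],
        ["git", "merge", "--no-ff", issue_branch],
    ]


commands = update_and_merge_commands("issue-123", "v1.4.0")
for command in commands:
    print(shlex.join(command))
```

Keep in mind that rebasing rewrites the commits of the issue branch, so a branch that was already pushed has to be force-pushed afterwards.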

We also dropped the immediate test merge and deploy of any issue-related branches and opted for a "wait until we really need this on the test environment" approach.

Also, we adopted a procedure where, whenever we pushed a new tag to master, we would delete the test branch and start a new one by branching from the newly pushed tag.
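One possible way to express that procedure, again as a sketch that only prints the commands instead of running them. The helper name, the tag v1.5.0, the remote name origin, and the force push at the end are assumptions made for the example, not a record of what we actually ran.

```python
import shlex
from typing import List


def recreate_test_branch_commands(new_tag: str, test_branch: str = "test",
                                  remote: str = "origin") -> List[List[str]]:
    """Build (but do not run) the commands that restart the test branch."""
    return [
        # Step off the test branch first; Git refuses to delete the
        # branch that is currently checked out.
        ["git", "checkout", "--detach", new_tag],
        ["git", "branch", "-D", test_branch],
        # Start a fresh test branch from the newly pushed tag.
        ["git", "checkout", "-b", test_branch, new_tag],
        # Assumption: the remote copy is overwritten with a force push.
        ["git", "push", "--force", remote, test_branch],
    ]


steps = recreate_test_branch_commands("v1.5.0")
for step in steps:
    print(shlex.join(step))
```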

As a graph our new workflow looked like this:

On this graph new commits are closer to the bottom

With these changes we managed to:

  • Almost completely eliminate big conflicts on either main branch, and reduce small conflicts from a "daily frequency on the majority of the issue-related branches" to "once a month on a couple of issue-related branches".

  • Create a simple and readable git history, where all the work performed for each issue could be easily identified.

  • Almost completely eliminate false positives during the testing phase of the issues.

We've continued to tweak this model up to this very day. The last thing we've done is introduce GitLab to the mix, hoping to use Merge Requests and CI/CD in the near future. We hope to continue to grow and establish a more robust workflow with each passing day.

As I look back on this story, I almost can't believe how much time has passed and how much we had to work and learn to get to where we are. I'll try not to forget this as I continue to learn.

Thanks for taking the time to read this :)

Posted on Nov 26 '18 by:

Germán Rodríguez Ortiz

@grmnsort

Resident Git Archmage at the office, amateur kickboxer, Motörhead lover :) I totally love challenges and learning new things

Discussion


Hi German,

This is a nice read, and is probably what most teams have to go through when they are adopting a way to create code collaboratively. Just out of curiosity, I'd like some more details about the following:

You mention a couple of times "We weren't sure what Fast Forward strategy was", and also imply that this strategy is the reason the log was hard to read. Later, in an unexpected turn of events, it seems that your team adopted non-fast-forward as the way to go for the master branch. Wouldn't it have been easier/cheaper just to sit down and try to completely understand what exactly "Fast-Forward" meant?

I normally prefer FF, and the log is readable as long as the developers on the team use commit messages effectively. My team is currently rather large, so using no-FF isn't really that cool, because a large portion of the history is just "merge commits" that do not really contain, for instance, bisectable code.

It is possible that just with more descriptive commit messages your history might have been richer, regardless of the FF or non-FF.

PS. There is an interesting blogpost that rants against git-flow, which also proposes another way of doing things.

Clearly, there might not be a one-size-fits-all git workflow, but I always find interesting the discussion about "why teams do what they do the way that they do it?" with regard to collaborative tools.

 

Hi Nicolas,

You're totally right. When you read the post, it seems that I blame the FF strategy entirely for our log being hard to read, when what I wanted to explain is that our poor handling of the repository led to a really messy git history when using the FF strategy. At that time we had no commit standards, so a feature or bugfix would be made of any number of commits, mixing "WIP" commits with actual "this is what I want to add to the master branch" commits. The FF merge would result in either a conflict resolution or the mixing of those unwanted commits into our history, and since some of our branches weren't properly updated, it would scatter the mentioned work across the whole log :( making it really hard to keep track of or "undo" the merged work if necessary.

Just like you said, I also believe it's really important to have a commit standard agreed across the team. We're now using the Conventional Commits syntax and it has really helped to improve our git history.

We actually ended up learning about the FF strategy, and I realized I should've talked a little more about that in the "From there we started addressing each problem detected, one at a time:" section.

We ended up choosing the "non-fast-forward strategy" because, during our test runs, we agreed that the merge commit strategy made it clearer to us which feature/bugfix entered the master branch and when. It also made it easier to revert features or bugfixes that were merged to the test branch with the use of tools such as GitKraken. And in case some dev forgot to keep a branch up to date, we could quickly notice and act on it.

I'll be trying to add these clarifications to the text ASAP :)

 

Hey German,

Thanks a lot for taking the time to provide such a detailed answer. I understand that there are a bunch of ways of doing things, and it is incredibly hard to reach consensus on how to adapt the team workflow to the tools available.

I have had years of experience using Git with a rather large number of teams, and even though I purposely try to spend a long time at the beginning of a project defining the way we are going to collaborate (which others, normally a majority, think is too long), I notice that the time invested doing that really pays off over time, not only because we end up speaking the same language over the collaborative tools, but also because it hugely saves developing/testing/shipping time.

One day I heard (or read) something and I have never forgotten it, and even use the following quote in the Git workshops that I've held for teams new to the tool: "the Git history is part of the code, and it has to be treated as such". For example: just as you would reject a pull request with buggy code, a pull request with an amazing piece of documented and tested code might (and should IMHO) be rejected if it has superfluous commits or dubious commit messages, as this is something that the developers can fix themselves. This, among other things, might or might not be enforced by each team lead, depending on the maturity of the project.

Thanks a lot for sharing your experience!