TL;DR: Git is an incredibly powerful tool for any developer. It lets you undo mistakes, track the history of your code, and work alongside other developers without chaos. Git can be complicated at times, but don't let that put you off.
If you’re new to programming, you're probably wondering: What the heck is Git? Or -- based on people I've spoken to -- you've never even heard of Git. Both of these are completely understandable. After all, Git is rarely mentioned in most beginner tutorials.
My aim for this post is to explain what Git is and why it is so useful. To keep things simple, the post will be light on Git commands. By the end, I hope you’ll have a feel for how Git can help you today and in the future.
Let's get to it.
- What is Git?
- The Problem: Files Everywhere
- The Solution: Version Control
- What Makes Git Special?
- The Downsides of Git
- Common Concerns
- Useful Links
In short, Git is a version control system, used by millions of developers around the world. Git is an incredible tool -- many programmers (myself included) wouldn't work without it.
But that is a terrible answer. For a new programmer, the term ‘version control’ is meaningless.
So, before diving into Git, we should first look at version control and the core problem it tackles. This is a problem that you’ll likely have experienced already.
It’s a common scenario. You’re beavering away on a project and are happy with the code as it stands. It looks slick, and its features work nicely.
But there’s a problem: you need to make a change. This change is a big one, and might break the code that you’ve worked so hard on.
So what do you do?
If you’re anything like I was at first, you’ll make a backup copy of your file, so that hatShop.js gains a sibling such as hatShop_v2.js.
Now, if your change does break everything, you can go back to the original version. Great, right?
Well yes … and no.
This system -- we’ll call it manual versioning -- works up to a point. However (as I learned the hard way) you soon run into a world of pain. Pain such as:
- You soon create another backup file. And another. And another. Before you know it, your directory is flooded with different versions of the same file.
hatShop.js hatShop_v2.js hatShop_final.js hatShop_actual_final.js
- Adding a file to the project is dangerous. With every new file, the complexity shoots through the roof. And how can you keep these files in sync?
capShop.js capShop_v2.js catShoop_.js hatShop.js hatShop_v2.js hatShop_final.js hatShop_actual_final.js
- When your code breaks, it’s damn near impossible to identify which change did the damage. You end up making extra backup files so you don’t break anything during your investigation.
- Your files are littered with commented-out code, because you don’t want to forget anything that might prove useful. This becomes noise (and it never proves useful).
- If you're partway through a feature and need to suddenly work on, say, an urgent bug ... well that means even more version files.
- Horror! You start working with another developer, and you both want to edit the same file. How do you avoid trampling on each other’s code?
- Someone else has made a difficult-to-understand change in the code, and has gone on holiday. WHAT WAS THAT CHANGE SUPPOSED TO DO?
By the end, your folder looks something like this:
capShop.js capShop_v2.js catShoop_.js capShop_bug_hunt.js hatShop.js hatShop_v2.js hatShop_final.js hatShop_final_with_button.js hatShop_actual_final.js hatShop_actual_real_i_mean_it_this_time_final.js hatShop_from_leah.js hatShop_now_broken.js
Fortunately for us, these are age-old problems which people have been working on for many years. Their solution? Version control.
Version control -- also referred to as source/revision/change control -- is software that manages changes to your code. In other words, version control takes your manual versioning system and does all the hard work for you. Version control takes a load off your mind.
There are different version control systems -- such as Git, SVN and Mercurial. Most of these work by holding a copy of your code in what's called a repository. The system tracks changes to this.
Whenever you’re happy with a change to your code, you commit that change to the repository. Each commit -- also called a revision -- often contains multiple files. These commits build up into a history that lets you see and restore any change you’ve made.
From this, version control gives you some awesome things:
This is the most immediate and obvious benefit: Version control is like an Undo for your code. Except it’s better, because it doesn’t forget your changes anytime you reboot your machine.
This is amazing -- it frees you up to experiment and have some fun. Screwed everything up? No worries, just rewind back to your last commit.
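The commit-and-rewind idea can be sketched in a few lines of Python. This is only a toy model: `TinyRepo`, its methods, and the file contents are all made up for illustration, and a real version control system does far more than this.

```python
import copy


class TinyRepo:
    """A toy, in-memory stand-in for a repository (hypothetical, not Git)."""

    def __init__(self):
        self.history = []  # each entry is (message, snapshot of the files)

    def commit(self, files, message):
        # Store a copy so later edits to `files` can't alter the history.
        self.history.append((message, copy.deepcopy(files)))
        return len(self.history) - 1  # the revision number

    def restore(self, revision):
        # Hand back a copy of the files exactly as they were at that commit.
        _message, snapshot = self.history[revision]
        return copy.deepcopy(snapshot)


repo = TinyRepo()
files = {"hatShop.js": "console.log('hats for sale');"}
first = repo.commit(files, "Add hat shop")

# A risky change that turns out to break everything...
files["hatShop.js"] = "console.log('oops"

# ...so rewind to the last commit.
files = repo.restore(first)
print(files["hatShop.js"])
```

The broken edit is simply thrown away, and the working version comes back from the history.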
The cost of storing a change is practically nothing. There are no backup files clogging up your directory.
As a result you could -- and probably should -- commit tiny improvements at a time. Removed some nested if-statements? Commit. Renamed a temp variable to something more meaningful? Commit. Deleted some commented-out code? Commit commit commit.
Once you get into the habit of making small commits, it becomes easy to pinpoint any specific change that breaks your code.
Version control systems store your history efficiently. Some store only the differences between versions; Git stores compressed snapshots and reuses anything that hasn't changed. Either way, you can commit as many small changes as you like without quickly taking up space with duplicates.
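To picture what a "change" looks like, here is a minimal Python sketch using the standard library's `difflib` module. The two file versions are invented for illustration, and this shows only the idea of a diff, not how any particular version control system stores data on disk.

```python
import difflib

# Two hypothetical versions of the same file, as lists of lines.
old_version = [
    "function priceOf(hat) {",
    "  return hat.price;",
    "}",
]
new_version = [
    "function priceOf(hat) {",
    "  return hat.price * 1.2;",
    "}",
]

# unified_diff reports the lines that differ, plus a little context.
diff_lines = list(
    difflib.unified_diff(
        old_version,
        new_version,
        fromfile="hatShop.js (before)",
        tofile="hatShop.js (after)",
        lineterm="",
    )
)

print("\n".join(diff_lines))
```

Lines starting with `-` were removed and lines starting with `+` were added. A one-line tweak produces a one-line change, which is why small commits are so easy to read later.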
Ever found a line of code and asked: What was this line trying to do? Version control solves this problem.
Alongside every code commit, you include a message which describes the reasons for that change. This message is like a detailed code comment, with the advantage of not adding any noise to your actual code.
This means you can see who changed every line of code, when, and why.
This history builds up over time, so it is something you really come to appreciate as the project grows. Then -- like a kind of programming Indiana Jones -- you have a chance to rescue some code artifact from ancient times.
(I can vouch for this one, having recently dealt with a bug introduced 11 years ago. Without this code history, I wouldn't have stood a chance.)
Version control makes it easy for two or more developers to work on the same file without breaking each other’s work. This is done by merging each developer’s changes -- on a first-come-first-merged basis -- to the version of the file in the repository.
If those two changes are in different parts of the file, then this process is automatic. The same is true when developers are working on different files -- they can get each other’s changes while barely lifting a finger.
Even with version control, things can get hairy. If two developers have changed the exact same part of the code, then this has to be sorted out by hand. This is the dreaded Merge Conflict, and it’s a painful fact of programming life.
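Here is a rough Python sketch of how a merge can decide between "automatic" and "conflict". It is a deliberately simplified toy: it assumes all three versions have the same number of lines, and `merge_lines` is a made-up helper rather than anything a real version control system exposes.

```python
def merge_lines(base, mine, theirs):
    """Toy three-way merge for versions with the same number of lines.

    Returns (merged_lines, conflicts), where conflicts lists the indexes
    of lines that both developers changed in different ways.
    """
    merged, conflicts = [], []
    for index, (b, m, t) in enumerate(zip(base, mine, theirs)):
        if m == t:        # both agree (or neither touched the line)
            merged.append(m)
        elif m == b:      # only "theirs" changed this line
            merged.append(t)
        elif t == b:      # only "mine" changed this line
            merged.append(m)
        else:             # both changed it differently: a human must decide
            merged.append(b)
            conflicts.append(index)
    return merged, conflicts


base = ["title = 'Hat Shop'", "price = 10", "stock = 5"]
mine = ["title = 'Hat Shop'", "price = 12", "stock = 5"]    # I changed the price
theirs = ["title = 'Hat Shop'", "price = 10", "stock = 8"]  # they changed the stock

merged, conflicts = merge_lines(base, mine, theirs)
print(merged)
print(conflicts)

# Now a colleague changes the same line I did, to a different value.
clash = ["title = 'Hat Shop'", "price = 15", "stock = 5"]
_, clash_conflicts = merge_lines(base, mine, clash)
print(clash_conflicts)
```

In the first merge both edits survive with no conflicts. In the second, the price line is flagged because both sides changed it, and that is the part someone has to sort out by hand.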
Version control systems are often paired with ready-made tools, such as Jira, GitLab, Bitbucket, and (of course) GitHub. These complement the version control system and/or act as a layer on top.
What do these tools give you? For a start, there’s the Issue Tracker which lets you record and describe, well, issues. Each issue -- generally a bug or feature -- will have an issue number, which you then mention in the commit message. This lets you trace through from Code to Commit to Issue with ease.
Add to that: task management tools, wikis, funky graphs, and whatever else you can dream up. The tools are there; version control lets you use them.
One advantage of having a repository is that it acts as a backup of your project. If this is stored off-site, then you should be pretty safe from most worries.
This also makes it easy to get new teammates up and running. Simply clone your project onto their machine and hey presto. (Usually with a small bit of jiggling around … though it sure beats copying by hand.)
Because each commit often contains more than one file, those changes are grouped together. No more worrying about which change in file A goes with which change in file B.
Gone are the many backup files that flood your directory. Gone is the commented-out code that’s there “just in case”. It has no reason to stay*.
(*Oh sure, someone can always find a reason … but it had better be good!)
Hear that? Sweet, sweet silence in your codebase. Now you can program in peace.
You get the picture. Version control is a wonderful thing. It removes a lot of the risk and stress from software development. Programming without version control is like skydiving without a parachute. Even if you survive, you’ll likely find yourself stuck up a tree.
In fact, my original plan for this article was to write about version control as a whole. After all, any version control -- be it Git, SVN, or Mercurial -- is better than no version control, right?
Wrong. I’d be selling new programmers short. It’d be like writing about search engines and not mentioning Google. Without wishing to start a flame war, there’s one clear version control system to learn: Git. Let’s talk about why.
“I decided that I can write something better than anything out there in two weeks ... And I was right.” -- Linus Torvalds
Git was created by Linus Torvalds -- the legendary sharp-tongued inventor of Linux -- in 2005, to help develop the Linux kernel. As the quote suggests, he set out (in his own way) to improve upon existing version control systems at that time. Many would argue that he succeeded.
Since then, Git has become the industry standard. The language of Git -- push, pull, fetch, fork, etc. -- has become everyday terminology. Git is so popular among developers that it would take a seismic shift to replace it. And that replacement would have to be something pretty special ...
So what makes Git stand out? Here are just some reasons:
Okay, this one sounds dull at first (particularly if you’re working solo) … but it is the core principle behind Git.
Git is a ‘distributed’ version control system. In other words, every developer has a clone of the repository on their machine -- complete with change history, commit messages, etc. This is their ‘local’ repository.
This is very cool, because it means you can commit a change to your own repository without breaking anyone else's code. It also means you can work offline, which is important for some people, less so for others.
(For the record, SVN, which is 'centralised', doesn't let you do this. At least not without cheating. Mercurial, like Git, is distributed.)
From this distributed architecture other features appear, including the next one. It might be my favourite feature of Git: its branches.
Branches in Git are mind blowing, especially having come from SVN. Git Branching was the feature that first won me over.
You can think of a branch as a new instance of your project -- effectively another repository. This lets you work on different jobs at the same time, without fear of cross-pollination.
Developing a new feature? That can be a branch. Some experimental R&D? Another branch. Need to patch up a bug? A third branch. These branches are entirely separate from each other.
Commit your changes to the relevant branch, giving you a complete history. Finally -- when you’re finished with one of them -- you merge it back to the main branch.
And boy is this easy to do in Git.
It has to be said, branches are not unique to Git. SVN has branches. The difference is in the effort involved. With SVN, branches are a chore to manage. They take an age to set up, and are a horror show to merge back.
With Git, branches are a joy. Setup and merging both happen within a blink*. The result? You use branches as nature (and Linus) intended.
(*Okay, busted. That isn't entirely true. Git still has merge conflicts -- you do have to get your hands dirty sometimes. Still, these moments are rare when compared to SVN.)
Git is lightning fast. Having moved from SVN to Git, the speed is noticeable. Even cloning a repository seems to take no time by comparison. Throw in the ease of branches and merging, and you’re laughing.
I have heard that Git struggles with huge projects with, say, millions of files. This may well be true -- I wouldn't like to be around to find out.
Git views changes in terms of the whole project. As a result, moving a directory (or renaming a file) in Git is simple. By contrast, SVN thinks in terms of individual files. Moving a directory (or renaming a file) in SVN is knotty, and often means waving bye-bye to your history in the process.
Git is the version control of choice for the Open Source community. Its whole ethos revolves around open source.
GitHub is by far the best example of this. It is filled with awesome people doing awesome work, all for free and the genuine goodness of their own hearts. You -- yes, you -- could go on it now, and contribute to any (public) project. That’s pretty awesome.
( … And this is where I have to confess: I’m not an open source contributor. Sure I feel some guilt about it, but there is only so much you can do at a time.)
Big corporates too -- such as Microsoft and Google -- have jumped aboard the GitHub bandwagon (to the extent that Microsoft bought it). Fancy peeking under the hood of Windows' Calculator? Well you can do that right now. Please, contain your excitement. (I'll admit, I let out a little squeal.)
Git seems to have an endless supply of commands that can help a developer on a daily basis. For example, need to pinpoint which change introduced an error? Try the bisect command.
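The idea behind bisect is a binary search through your commit history. The Python sketch below illustrates that idea only; it is not Git's implementation. The commit names, the `is_bad` check, and `find_first_bad` are all hypothetical, and the sketch assumes the oldest commit is good, the newest is bad, and the bug stays once it appears.

```python
def find_first_bad(commits, is_bad):
    """Return (first bad commit, number of commits checked).

    Assumes commits run oldest to newest, commits[0] is good,
    commits[-1] is bad, and the bug never disappears once introduced.
    """
    low, high = 0, len(commits) - 1  # commits[low] is good, commits[high] is bad
    checks = 0
    while high - low > 1:
        middle = (low + high) // 2
        checks += 1
        if is_bad(commits[middle]):
            high = middle  # the bug was introduced at or before the middle
        else:
            low = middle   # the bug was introduced after the middle
    return commits[high], checks


# One hundred made-up commits; pretend the bug crept in at commit-73.
commits = [f"commit-{n}" for n in range(1, 101)]


def is_bad(commit):
    return int(commit.split("-")[1]) >= 73


culprit, checks = find_first_bad(commits, is_bad)
print(culprit, checks)
```

Halving the search range each time means only a handful of commits need testing, rather than all one hundred.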
For those developers from the UK -- yes Git really is called Git. You'll never get used to it. Just accept it and move on. As far as I know, there’s no version control yet called Prat or Plonker.
Switching from SVN to Git was -- no joke -- life changing. I’ve experienced the difference it makes. Once you’ve used Git, it’s tough to go back. Fortunately, you won’t have to very often.
Git by XKCD
Let's cut to the chase: Git can be complicated. In fact, it can be downright terrifying. Certain commands give you cause for prayer before using them. I'm looking at you, Rebase!
Git can sometimes be too clever for its own good. For example, I've never been sold on its Staging Area. Why add a second step to the commit process? (Many would disagree with me here -- I've just not found a good and safe use for it yet.)
The good news? You can get by in Git with the basics, especially when you're just starting.
Also, the tools can hide a lot of complexity. I suspect that improvements to Git will rely on changes to its tooling.
Anything else? None that I’ve encountered yet. If you've experienced more, please do let me know.
Here are some common concerns about Git that I’ve come across, along with my thoughts:
I’ve only just started programming. Do I need to learn Git as well?
You got me! In this case, the answer might be no. Write code, play around and have fun.
However, the moment you write some code that you want to keep, then Git will be your lifeline.
Also, it’s worth signing up to GitHub now. There’s no harm in doing so. Have a look around. Commit a few files -- they don't even have to be code.
I’m working on a solo project. Git seems like overkill.
This can be true ... but not often. Yes, Git can feel like overkill when you’re working alone. But a lot of its benefits apply to solo projects.
If the code is truly throwaway, then setting up a Git repository might be a step too far. That's a judgement call.
I’ve heard Git is complicated
Oh indeed! Git can be confusing as hell. However, you don’t have to use its advanced features from day one. Instead, start off slowly, and commit small changes. Once you become more confident, then you can explore what else Git gives you.
Git looks like a bunch of confusing command lines
I’m going to admit something here: I rarely use the Git command line. Much to my shame, I cannot claim to be a Git master. And yet I still find it vital to my work.
Most of the time, I’ll use a Git GUI which hides all of this from me, such as Sourcetree, GitKraken, or SmartGit. (That said, it’s worth knowing a bit about the command line -- it can get you out of a hole.)
Git sounds expensive
Git is free! Sure, there are ways to spend money. But that’s not a worry for now. Stick to GitHub, and you’ll be grand.
I’m partway through a CS degree, and Git hasn’t been mentioned. Can it be that important?
I’ve recently come across graduates whose degrees didn’t mention Git (or even version control). As far as I'm concerned, that degree has failed you. Start using Git now -- it will help.
There must be other popular version control systems?!
There are: Mercurial and Fossil to name a couple. These might well be excellent systems (I've not used either). By all means explore them. At some point, though, you will probably need to learn Git.
I’ve been programming for years, and have never needed an old version of my code
Then you are a far better programmer than me. Please tell me your secret!
Ok, so you're ready to get into Git. Here are some links to get you started:
GitHub -- The Git hosting site. Sign up now.
Code Triage -- Helps you find an open source project to contribute to
And of course Dev.to has some excellent resources.
Git is an incredible tool -- giving you the freedom to create, change and share code. Yes, it can be complicated. Just take it easy at first. You'll soon be glad that you took the time.
Banner Credits - Banner Generator by Christopher Kade
Artwork Credits - ls.graphics