DEV Community

Revisiting the Pain: The World without Git

Introduction

Computers have evolved a lot in a short span of time and are still evolving rapidly. Thanks to these advances, the world is now at our fingertips. Alongside the progress in hardware, software development has come a long way too. And when we talk about software development, how can one forget Git?

Git is the most widely used version control system (VCS). It is so fast that it seems as if the gods of speed have blessed it with otherworldly powers. If you're a software developer, it's your bread and butter. But Git was developed in 2005 (amid a blazing controversy), while software has been developed for more than seven decades. So, what did the development world look like before Git?

Let's dive right into it.

1) Maintaining Backups locally

In the early days of software development, version control systems did not exist. Daily or weekly backups were the norm.

For a collaborative project, there used to be a master copy, usually under a single person's control, and everyone coordinated their changes through that person.
Back in those days, a "commit" meant backing up the current version and then applying the new changes.

This approach had two major problems:

1) It was hard to know what changes you had made over time. There was no option other than manually checking the files for differences.

2) Over time, the files would become so large that keeping multiple copies of them could eventually fill your whole disk.
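
To make the backup workflow concrete, here is a minimal Python sketch of it. The helper names (`backup`, `changed_files`, `demo`), the dated folder labels and the file contents are all made up for illustration. Each "commit" is a full copy of the project, and finding out what changed means comparing two copies file by file.

```python
import filecmp
import shutil
import tempfile
from pathlib import Path


def backup(project: Path, backups: Path, label: str) -> Path:
    """Copy the whole project into backups/<label>, like a manual 'commit'."""
    target = backups / label
    shutil.copytree(project, target)
    return target


def changed_files(old: Path, new: Path) -> list[str]:
    """List top-level files that differ or exist in only one of two backups."""
    cmp = filecmp.dircmp(old, new)
    return sorted(cmp.diff_files + cmp.left_only + cmp.right_only)


def demo() -> list[str]:
    """Take two backups of a tiny hypothetical project and compare them."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        project, backups = root / "project", root / "backups"
        project.mkdir()
        backups.mkdir()
        (project / "main.c").write_text("int main() { return 0; }\n")
        (project / "README").write_text("demo\n")
        first = backup(project, backups, "2005-01-01")
        # Edit one file, then take another full copy of everything.
        (project / "main.c").write_text("int main() { return 42; }\n")
        second = backup(project, backups, "2005-01-08")
        return changed_files(first, second)


if __name__ == "__main__":
    print(demo())
```

Only `main.c` changed between the two backups, yet `README` was copied again as well. That duplication is where the disk space goes, and the comparison only says which files differ, not what changed inside them.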

2) Local VCS (LVCS)

Without a VCS, all of these tasks were tedious and required more effort from the release team and the project leaders. To solve this, programmers developed Local VCS. These systems are also referred to as "First Generation VCS".

As the name suggests, a Local VCS kept a local database of the changes made to files. One of the most popular was RCS (Revision Control System), which stored patch sets (the differences between successive versions of a file) and could reproduce how the file looked at any point in time by applying those patches in sequence.
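
The patch-set idea can be sketched in a few lines of Python. This is a simplified illustration, not how RCS is actually implemented: the `RevisionStore` class and its `check_in`/`check_out` methods are hypothetical names, and the sketch keeps the first revision in full plus one delta per later revision, using the standard library's `difflib` to compute the differences.

```python
from difflib import SequenceMatcher

# A delta is a list of (start, end, replacement_lines) edits against the old text.
Delta = list[tuple[int, int, list[str]]]


def make_delta(old: list[str], new: list[str]) -> Delta:
    """Record only the line ranges of `old` that change, with their replacements."""
    delta: Delta = []
    for tag, i1, i2, j1, j2 in SequenceMatcher(a=old, b=new).get_opcodes():
        if tag != "equal":
            delta.append((i1, i2, new[j1:j2]))
    return delta


def apply_delta(old: list[str], delta: Delta) -> list[str]:
    """Rebuild the newer revision by patching `old` with a stored delta."""
    result = list(old)
    # Apply edits from the end so that earlier line indices stay valid.
    for i1, i2, lines in reversed(delta):
        result[i1:i2] = lines
    return result


class RevisionStore:
    """Keeps revision 1 in full and every later revision as a delta."""

    def __init__(self, first: list[str]) -> None:
        self.base = list(first)
        self.deltas: list[Delta] = []
        self._head = list(first)

    def check_in(self, new: list[str]) -> int:
        """Store a new revision and return its revision number."""
        self.deltas.append(make_delta(self._head, new))
        self._head = list(new)
        return len(self.deltas) + 1

    def check_out(self, revision: int) -> list[str]:
        """Reproduce the file as it looked at the given revision."""
        text = list(self.base)
        for delta in self.deltas[: revision - 1]:
            text = apply_delta(text, delta)
        return text
```

Checking out revision 3 means starting from revision 1 and applying the first two deltas in order. Unchanged lines are stored only once, which is what makes this cheaper than keeping full copies.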

Another popular First Generation VCS was SCCS (Source Code Control System), which was in fact the first VCS, developed in 1972 by Marc Rochkind at Bell Labs.

3) Centralized VCS (CVCS)

Local VCS were good, or at least better than maintaining backups, but they had a problem of their own: everything was stored locally. If the system crashed and there was no backup, you would lose everything. There was also no easy way to collaborate with other developers.

To solve these issues, Centralized VCS were developed. They were the "Second Generation VCS".

Here, a central server contained all the versions and commits ever made to the project. Everyone could check out files from it and commit their changes back to it. As the server was centralized, almost everyone knew what the others were working on.

Along with these pros, CVCS also came with cons:

1) As everything was centralized, if the server failed or its hard disk got corrupted, you lost everything (unless there were backups). You might recover the latest version of the project from the developer who committed the last change, but the history of all previous changes was gone.

2) Because of the central server, you needed a network connection to it in order to commit your changes. So, if you were working in a place with no internet, you might have a bad time.

Some examples of Second Generation VCS are Apache Subversion (SVN), CVS and Perforce.

4) Distributed VCS (DVCS)

Every earlier approach came with its pros and cons. There was a need for something even better, and that's when Distributed VCS came into the picture. It is also referred to as the "Third Generation VCS".

Here, everyone has a full copy of the project, including its entire history, both locally on their machine and on a shared server. As everyone has a local copy of the entire work history, one doesn't need to be online to commit changes. They can commit to the local repository first and, whenever an internet connection is available, push those changes to the remote repository.
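
The offline-commit, push-later workflow can be modelled with a small Python sketch. It is a toy model rather than how Git works internally: the `Repository` class, the `clone` and `push` helpers and the `online` flag are hypothetical, and history is reduced to a plain list of commit messages.

```python
from dataclasses import dataclass, field


@dataclass
class Repository:
    """Holds the full history as an ordered list of commit messages."""
    history: list[str] = field(default_factory=list)

    def commit(self, message: str) -> None:
        """Record a change locally; no network is involved."""
        self.history.append(message)


def clone(remote: Repository) -> Repository:
    """Copy the entire history, not just the latest version."""
    return Repository(list(remote.history))


def push(local: Repository, remote: Repository, online: bool) -> int:
    """Send commits the remote lacks; return how many were sent."""
    if not online:
        raise ConnectionError("cannot reach the remote repository")
    known = len(remote.history)
    if local.history[:known] != remote.history:
        raise ValueError("remote has diverged; pull first")
    new_commits = local.history[known:]
    remote.history.extend(new_commits)
    return len(new_commits)
```

A short usage example, with made-up commit messages:

```python
remote = Repository(["initial import"])
local = clone(remote)
local.commit("fix typo")      # works with no connection
local.commit("add feature")   # still offline
push(local, remote, online=True)  # both commits reach the remote together
```

Because every clone carries the whole history, any one of them could also be used to restore the server after a failure.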

Some examples of Third Generation VCS are Git, Mercurial, BitKeeper and Bazaar (bzr).

A Short History of Git

Linux development started in 1991. Until 2002, changes were passed around as patches and archived files. Given the scale of the project, this was very difficult for the developers.

So, in 2002, they began using BitKeeper, a proprietary DVCS that was free of charge for the project at that time. Everything was good and Linux development went smoothly until 2005, when BitKeeper's copyright holder Larry McVoy revoked the free-of-charge status after claiming that Andrew Tridgell had created SourcePuller by reverse engineering BitKeeper's protocols.

This was when Linus Torvalds, the creator of Linux, decided to develop a Distributed VCS of his own with the features the project needed. The development of Git began on 3 April 2005, and it achieved its performance goals on 29 April 2005.

This led to the most widely used VCS in the software development world.

Credits

Images: Pro Git

References

Pro Git eBook: https://git-scm.com/book/en/v2
Wikipedia: https://en.wikipedia.org/wiki/Git

Discussion (3)

David

Git was developed in a few weeks? Geez...

Tony Yu

A legend does what legend does Haha

Georg Nikodym

You focus on the local vs remote, central vs distributed attributes of the various systems while missing an important piece.

In addition to lack of network awareness (because networking itself was in its infancy) RCS and SCCS are file only. You could have 1.45 of fileA and 1.64 of fileB and so on. There was no ability to snapshot the state of a source tree.

CVS was the first system that attempted to address this but it was still pretty cumbersome.

The idea to bundle a collection of file changes into a unit (that could be reproduced) didn't appear until SVN, Perforce, BitKeeper (there were others like arch and monotone but I've never used them so can't offer anything worthwhile) and later Mercurial and Git.

The "invention" of the changelist/changeset/commit was a super important development for anybody dealing with giant source trees. The obvious reason being that it allowed for better understanding of changes in a system. The less obvious reason, to most developers, was that it greatly reduced the amount of network traffic required to compare two trees -- effectively enabling the distributed architecture we take for granted today.
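
The point about network traffic can be illustrated with a rough Python sketch. It assumes each tree is described by the changesets applied to it, keyed by a hypothetical ID such as `"c1"`. Real systems track a graph of changesets rather than a flat collection, so treat this only as an illustration of why exchanging IDs is cheaper than comparing every file.

```python
# Maps a changeset ID to the files it touches: {path: new content}.
Changesets = dict[str, dict[str, str]]


def missing_from(ours: Changesets, theirs: Changesets) -> list[str]:
    """IDs the other tree has and we lack; only IDs need to be compared."""
    return sorted(theirs.keys() - ours.keys())


def pull(ours: Changesets, theirs: Changesets) -> int:
    """Fetch just the missing changesets and return how many were transferred."""
    ids = missing_from(ours, theirs)
    for ident in ids:
        ours[ident] = theirs[ident]
    return len(ids)
```

With per-file revision numbers, comparing two trees means checking every file's revision on both sides. With changesets, two trees that share most of their history can be compared by looking only at the IDs that differ.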