Before saying why GitOps is so useful, I should explain what it is. At its most basic, GitOps is defining all the configuration for your system in Git. This includes deployments, versions of services, config and secrets. For example, if using Kubernetes, a GitOps repository will have some Kubernetes deployment files which define the images and versions to deploy. Each deployment file may also reference config maps or contain environment variables which specify the config required to run that service in each environment.
All this means you have a declarative way of defining the state of your system. It’s like saying, “I want the whole system to always reflect the settings in these files”. This is the opposite of the imperative way of defining your system, where you tell the system how to change, e.g. helm upgrade or helm rollback. In the imperative way, the system is the sum of all the things it has been told.
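To make the contrast concrete, here is a minimal sketch in Python. The command tuples, the service name and the version numbers are all made up for illustration, and the replay function is only a toy model of imperative commands, not how Helm works internally. The point is that the imperative state can only be known by replaying every command, whereas the declarative state can be read straight from a file.

```python
from typing import Dict, List, Tuple

State = Dict[str, str]              # service name -> deployed version
Command = Tuple[str, str, str]      # (verb, service, version); hypothetical format


def replay(commands: List[Command]) -> State:
    """Imperative model: the current state is the sum of every command issued."""
    state: State = {}
    history: Dict[str, List[str]] = {}
    for verb, service, version in commands:
        if verb == "upgrade":
            history.setdefault(service, []).append(version)
            state[service] = version
        elif verb == "rollback":
            versions = history.get(service, [])
            if len(versions) > 1:
                versions.pop()                  # discard the current version
                state[service] = versions[-1]   # return to the previous one
    return state


# Imperative: you need the whole command history to know what is running.
commands = [
    ("upgrade", "orders", "1.3.0"),
    ("upgrade", "orders", "1.4.0"),
    ("rollback", "orders", ""),
]

# Declarative: the desired state is simply what the file in Git says.
desired: State = {"orders": "1.3.0"}

print(replay(commands) == desired)
```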
In GitOps, changes to the system are pulled in (via agents) rather than pushed in by developers or by a CI pipeline issuing commands. Agents run in the cluster, comparing the actual state of the cluster to the state defined in Git. If there have been changes in Git, the agent applies them to the cluster.
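The comparison an agent makes can be sketched in a few lines of Python. This is a simplified illustration, not the implementation of any particular GitOps agent: the state is modelled as a plain dictionary of service name to image, and the service names and image tags are invented for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical model: service name -> image reference.
State = Dict[str, str]


def plan(desired: State, actual: State) -> List[Tuple[str, str, str]]:
    """Return the (action, service, image) steps that make actual match desired."""
    steps = []
    for name, image in sorted(desired.items()):
        if name not in actual:
            steps.append(("create", name, image))
        elif actual[name] != image:
            steps.append(("update", name, image))
    for name in sorted(actual):
        if name not in desired:
            steps.append(("delete", name, actual[name]))
    return steps


def reconcile(desired: State, actual: State) -> State:
    """Apply the plan to a copy of the actual state and return the result."""
    result = dict(actual)
    for action, name, image in plan(desired, actual):
        if action == "delete":
            del result[name]
        else:
            result[name] = image
    return result


desired = {"orders": "orders:1.4.0", "kafka": "kafka:3.7.0"}  # what Git says
actual = {"orders": "orders:1.3.2", "legacy": "legacy:0.9.0"}  # what is running

for step in plan(desired, actual):
    print(step)
```

A real agent runs this kind of comparison on a loop, so a manual change to the cluster shows up as a difference and is overwritten with what Git says.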
Agents are also used in the deployment of services. New versions of services are deployed by tagging an image in the Docker repository. An agent scans that repository for new tags. If a tag matches a predefined pattern, the agent writes it into the deployment files in the Git repo, and that change is then picked up and applied to the cluster like any other.
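As a rough sketch of that tag-matching step, assuming a policy of "only deploy plain vMAJOR.MINOR.PATCH tags" (the pattern, the image name and the manifest snippet are all hypothetical), the agent's logic might look something like this in Python:

```python
import re
from typing import List, Optional

# Hypothetical policy: only plain semantic-version release tags are deployable.
RELEASE_TAG = re.compile(r"^v(\d+)\.(\d+)\.(\d+)$")


def latest_release(tags: List[str]) -> Optional[str]:
    """Return the highest tag matching the pattern, compared numerically."""
    matching = []
    for tag in tags:
        m = RELEASE_TAG.match(tag)
        if m:
            version = tuple(int(part) for part in m.groups())
            matching.append((version, tag))
    return max(matching)[1] if matching else None


def set_image_tag(manifest: str, image: str, tag: str) -> str:
    """Rewrite 'image: <image>:<old tag>' so it points at the new tag."""
    pattern = re.compile(r"(image:\s*" + re.escape(image) + r"):\S+")
    return pattern.sub(lambda m: m.group(1) + ":" + tag, manifest)


tags = ["v1.3.2", "v1.10.0", "v1.9.1", "latest", "v2.0.0-rc1"]
manifest = "containers:\n- name: orders\n  image: registry.example.com/orders:v1.3.2\n"

new_tag = latest_release(tags)
if new_tag is not None:
    # In a real setup the agent would commit this change back to the Git repo.
    print(set_image_tag(manifest, "registry.example.com/orders", new_tag))
```

Tags such as latest or a release candidate are ignored because they don't match the pattern, so only deliberate releases end up in Git.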
So why use GitOps?
It’s really easy to find and share the configuration of a service. Many times someone has asked me, “what configuration does Kafka have set?” With GitOps I don’t need to look at the deployment descriptor in Kubernetes. I search in Git. When I’ve found it, I can easily share that file as a URL with anyone who needs it - they don’t need special access to be able to inspect what’s running. GitOps is a great leveller, and it gives everyone the ability to see how things are running.
When all changes are made through Git you have a ready-made history of every change to the system, with the commit messages describing what changed and why. When something breaks, it’s then easy to look back at the last change and figure out what may have caused the problem.
Git revert then becomes a quick rollback mechanism: revert the commit and the agents apply the previous state. So you can feel confident making changes, knowing they can easily be undone too.
Git is a familiar tool to most developers, and using GitOps saves having to learn a new CLI with all its syntax and options. Developers do still need to know how to define the deployment and configuration files, e.g. how to structure a Kubernetes deployment YAML file if deploying in Kubernetes - but at least you don’t have to learn how to use Helm too!
Having all your configuration and state in a single place is the first step to making your environments rebuildable. When an environment is rebuilt, it doesn’t need a CI pipeline to kick in at some point and deploy all the latest apps - the agents that are already running simply apply the config that’s already in Git and the system will be exactly as it was before.
I know it’s so tempting to make a sneaky change, in place, to try something out: kubectl set env .... If it works, “wonderful!”, but it’s easy to forget to apply the config to all services or add it to CI, and then when there’s a cluster rebuild, or even a new deployment, that change may be lost. It also works the other way. If an app breaks, you don’t need to check whether some manually set config is causing the problem. Git is the single source of truth for how your apps are running, and that gives a warm sense of security and certainty about being able to reproduce, recreate and replicate entire systems in other environments, without any manual intervention.
Changes happen via pull requests so all changes are visible and can be discussed. GitOps is a great way to share knowledge because everyone can see how the system is being configured - no one can make changes secretly.
Anyone can change the state of the cluster; it just needs a pull request and some approvals. This is great because it means the organisation doesn’t rely on a few individuals to make changes. This is truly self-service. It also means people don’t need special permissions on the cluster to make changes, so you can lock down permissions whilst still allowing all developers to make changes via PRs.