Writing infrastructure as code presents many of the same challenges as writing code for application development, because many of these challenges are not language or use-case specific.
Terraform and its surrounding ecosystem are still evolving and share many similarities with early PHP and the web. Just like PHP evolved by learning from other language ecosystems, Terraform can as well.
Use-case specific frameworks are a major driver of innovation, improved developer experience and productivity on the application development side, but they are not yet established parts of the infrastructure as code ecosystem.
The paradigm shift to containers and Kubernetes made use-case specific frameworks possible for infrastructure as code by providing a powerful abstraction between application and infrastructure layer. And the cloud native community is evolving rapidly, extending this abstraction to additional use-cases.
Organizations that adopted application development frameworks for their improved developer experience and productivity can leverage the same benefits for automating Kubernetes by using an infrastructure as code framework, and avoid leaving the cluster as the weakest link in their GitOps automation.
PHP’s ease of getting started is widely quoted as the boon and bane of the language. It seems as if making fun of the spaghetti code bases of the early PHP days never gets old. Even in 2021. But there is no doubt that PHP is an extremely successful programming language.
You may ask, what does this have to do with Terraform? Well, hear me out. Terraform and PHP have more in common than you may think. PHP was created when the web was in its infancy and quickly became extremely popular. Don’t forget, PHP is the P in LAMP stack. Similarly, infrastructure as code is still an emerging ecosystem today, and Terraform is by far the most popular language in this ecosystem.
But the modern PHP of today is vastly different from the early PHP we all like to make fun of. And since Terraform today is so similar to where PHP was when it started, there’s a good chance that the Terraform community can learn a lot from how PHP evolved.
Rasmus Lerdorf, the creator of PHP, is famously quoted as never having intended to write a programming language. But PHP got popular, and its developers had to keep going. In addition, the web and its request-response model were new, even to experienced developers. But the endless possibilities of the web got people excited, and the unintentional programming language PHP was easy to get started with. This combination led to the stereotypical poor quality code bases that ended up powering major parts of the early web.
Similarly, infrastructure as code offers huge benefits and gets people excited as well. But it also requires both operations and coding experience, and people coming from either background have to learn a lot about the other before they can be fully productive.
Languages like Python, released a few years before PHP, or Ruby and Java, which were released in the same year as PHP, were intentionally designed programming languages for professional use. While they are not specific to the web, it is of course possible to build web applications in any of them. So the self-evident thing was to use these more mature and consistent languages to build web applications, and have more easily maintainable code bases as a result.
And not only were the languages more mature, but so were their ecosystems. The majority of challenges developers face when writing code are not language specific. And many are not even use-case specific. For example, you may need different dependencies for building a web application than for a desktop application. But in both cases, having dependency management is very useful. This is a feature Python, Ruby and Java all already had.
This led to the creation of frameworks like Django, Ruby on Rails or Spring that made it easy to build web applications in Python, Ruby or Java respectively, leveraging their existing language ecosystems.
However, a great idea that works in one ecosystem is quick to inspire similar development in other languages. And PHP’s wide adoption easily justified major investments to improve the PHP core as well as the surrounding ecosystem. All those teams looking for the best way to maintain their growing PHP code bases were smart to look at other languages and how these same challenges were solved there.
The results are frameworks like Symfony or CakePHP, heavily inspired by Spring and Rails respectively. This is also how Composer brought modern dependency management to PHP. And last but not least, this was when the PHP community adopted Git for version control and slowly moved away from just editing production files directly via FTP.
Let's get back to infrastructure as code. Yes, in a lot of ways automating infrastructure is different from application development. But many of the challenges of writing code that applied across languages and use-cases on the software development side also apply to infrastructure as code. Code is the keyword here.
So just like PHP learned from other languages, their frameworks and their tooling, Terraform can only benefit from doing so as well.
One area where HashiCorp, the maker of Terraform, recently made major improvements is dependency management. Terraform has been able to download required providers for quite some time, but this was limited to HashiCorp’s own providers. Community maintained providers required an involved, manual installation. A recent Terraform release introduced support for registry namespaces, which means community providers can now also be installed from the official registry. In addition, required providers and versions can now be specified more explicitly. This even includes the ability to vendor providers, which hardens automation runs against failing when the registry is unavailable.
All the language ecosystems we discussed share one key piece that heavily improves the developer experience, but which isn’t a thing yet in the infrastructure as code world. I’m referring to frameworks of course. And concretely use-case specific frameworks. By being use-case specific, the aforementioned software development frameworks drastically reduce upfront and maintenance effort, and provide the best developer experience and workflow possible.
If I’m building a cloud native application in Java, using Spring Boot will make my life much easier. Likewise, if my goal is to build a Jamstack website, a framework like Gatsby will get me there much faster.
But the reason why frameworks are not a thing in the infrastructure as code world yet is not merely that the ecosystem is still evolving. For frameworks to be useful, we also needed a strong abstraction layer that keeps the infrastructure layer clear of application specific requirements. Containers and Kubernetes are extremely popular because they provide this very abstraction. And this means two things: First, using Terraform to manage Kubernetes is a popular and very specific use-case for an infrastructure as code framework. And second, because of the powerful abstraction, such a framework makes sense for the first time.
Kubestack is this use-case specific, Terraform GitOps framework. If you’re building GitOps automation for Kubernetes cluster infrastructure and cluster services using Terraform, Kubestack may be the framework for you. Think of Kubestack as the Ruby on Rails of infrastructure automation, the Gatsby of GitOps, or the Spring Boot of Terraform and Kubernetes.
And just like application frameworks copied ideas that worked well from one language to another, Kubestack does the same from application development to infrastructure as code.
One example is Kubestack’s convention over configuration based repository layout. Another one is its inheritance based configuration to prevent drift between environments. A third one is the ability to easily vendor dependencies in the repository, like the Nginx ingress controller or Prometheus monitoring operator. And, last but not least, there are local development environments that automatically update as you make changes to the code.
Slow feedback loops are poison for developer productivity. And infrastructure as code is notorious for mandatory, slow pipeline runs. This makes the local development environment the perfect example of how Kubestack drastically improves the developer experience, because it’s a use-case specific framework.
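To make the idea of inheritance based configuration a bit more tangible, here is a minimal Python sketch. It is not how Kubestack implements it. The environment names, the settings and the `deep_merge` and `resolve` helpers are all assumptions made up for illustration. The point is only that an environment declares what differs from its parent and inherits everything else, which leaves less room for environments to drift apart.

```python
from copy import deepcopy


def deep_merge(base: dict, override: dict) -> dict:
    """Return a new dict in which values from override replace those in base.

    Nested dicts are merged recursively; all other values are replaced.
    """
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Hypothetical environments and settings, for illustration only.
configuration = {
    "prod": {"node_count": 3, "instance_type": "large", "region": "eu-west-1"},
    "staging": {"node_count": 1},  # only the difference to prod is declared
}


def resolve(environment: str, parent: str = "prod") -> dict:
    """Layer an environment's overrides on top of its parent's configuration."""
    if environment == parent:
        return deepcopy(configuration[parent])
    return deep_merge(configuration[parent], configuration[environment])


staging = resolve("staging")
print(staging)
```

Because staging only states its node count, a later change to the instance type or region in the parent environment reaches staging automatically.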
The strong abstraction between the application and infrastructure layers is a key mantra of what we know as cloud native. And if you take a look at recent developments from the cloud native community the direction is clear. As more and more organizations shift their workloads and use-cases to cloud native, we continue to see new innovation and iterative improvements that extend this powerful abstraction.
This is positive both for the future of infrastructure as code and Terraform and for use-case specific infrastructure as code frameworks.
Systems that provide a separation between declaring desired state and current state are the current state-of-the-art. This is a core principle of Kubernetes and high-level managed cloud services, but also of VM auto-scaling groups, as a lower level example of this principle. On the surface there’s an API to declare the desired state. And behind the API are control loops that keep the current state in sync with the desired state.
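The separation between an API for declaring desired state and a control loop behind it can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the implementation of Kubernetes or any cloud service. The `Cluster` class, its `declare` and `reconcile` methods and the node names are all hypothetical.

```python
class Cluster:
    """Toy system with a declared desired state and a reconciliation loop."""

    def __init__(self) -> None:
        self.desired_replicas = 0  # desired state, set through the "API"
        self.running = []          # current state, owned by the control loop
        self._next_id = 0

    def declare(self, replicas: int) -> None:
        """The API surface: callers only declare what they want."""
        self.desired_replicas = replicas

    def reconcile(self) -> None:
        """One pass of the control loop: move current state toward desired."""
        while len(self.running) < self.desired_replicas:
            self.running.append(f"node-{self._next_id}")
            self._next_id += 1
        while len(self.running) > self.desired_replicas:
            self.running.pop()


cluster = Cluster()
cluster.declare(3)
cluster.reconcile()             # creates node-0, node-1 and node-2

cluster.running.remove("node-1")  # simulate a node failing
cluster.reconcile()             # the loop replaces it without a new declaration
print(cluster.running)
```

Note that the failed node is replaced without anyone declaring anything new. In a real system the loop runs continuously and reacts to events like this on its own.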
Terraform shines when combined with such a system, because it is great at planning and applying changes triggered by a commit in a repository. And it can also be run periodically, to detect drift and either alert or overwrite. But when operating distributed systems, there are various failure scenarios where continuously running controllers that can take immediate action based on more events than just code changes are clearly superior. The important thing to understand here is that Terraform is great at giving teams a way to reason about proposed changes and at keeping the committed state and desired state in sync. But keeping desired and current state in sync is, in most cases, better left to a continuously running control loop.
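The division of labor described above can be illustrated with another small Python sketch. It only models the idea and does not reflect Terraform's internals. The `plan`, `apply` and `periodic_check` functions and the example settings are hypothetical. The committed state stands for what is in the repository, the declared state for what the system's API currently holds.

```python
def plan(committed: dict, declared: dict) -> dict:
    """Return {key: (declared_value, committed_value)} for every difference."""
    keys = committed.keys() | declared.keys()
    return {
        key: (declared.get(key), committed.get(key))
        for key in sorted(keys)
        if declared.get(key) != committed.get(key)
    }


def apply(committed: dict, declared: dict) -> None:
    """Make the declared state match what is committed in the repository."""
    declared.clear()
    declared.update(committed)


def periodic_check(committed: dict, declared: dict, overwrite: bool) -> dict:
    """Detect drift, then either just report it or overwrite it."""
    changes = plan(committed, declared)
    if changes and overwrite:
        apply(committed, declared)
    return changes


# Hypothetical settings: someone changed the node count by hand.
committed = {"node_count": 3, "version": "1.21"}
declared = {"node_count": 5, "version": "1.21"}

drift = periodic_check(committed, declared, overwrite=False)  # alert only
print(drift)

periodic_check(committed, declared, overwrite=True)           # overwrite drift
remaining = plan(committed, declared)
```

Everything here happens only when the check runs. What happens to the actual nodes between two runs is exactly the part that is better left to the system's own control loop.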
It’s common for teams to hit this limitation when using infrastructure as code to automate legacy systems that don’t provide this separation of concerns. And this frequently leads to automation that only manages the lifecycle partially and causes complex issues for teams to coordinate automation and manual operations. Facing this significantly limits the value of infrastructure as code, and many teams justifiably may hold back on adopting Terraform for this very reason.
But Kubernetes and managed cloud services are not the only systems that rely on declared desired state and reconciliation loops to keep current state in sync. An example doing this for infrastructure automation outside the cloud providers’ walled gardens is Cluster API. This cloud native community initiative aims to provide the same separation across on-premise and cloud. And through integration into vSphere, Cluster API is readily available to VMware’s vast installed base.
As an industry, we’re clearly heading in one direction. And as we continue to adopt this paradigm, the limitations that held infrastructure as code back when working with legacy systems no longer apply. As infrastructure as code becomes more viable for more organizations, more teams can benefit from use-case specific frameworks to get the best possible developer experience and productivity.
Many teams are already using Terraform successfully today. Yes, there are edge cases to consider and there is a steep learning curve, no matter if your background is in operations or software development. But as the cloud native ecosystem continues to evolve, the benefits of infrastructure as code will be applicable to more teams and more use-cases. And just like PHP grew by learning from other language ecosystems, Terraform will too.
As far as Kubernetes is concerned, if you’re already adopting GitOps, the Kubestack framework is an opportunity to implement full-stack GitOps that covers both the cluster infrastructure and cluster services and not just the application workloads on the cluster. This way, the foundation of your system, the cluster, is no longer managed manually via a UI and stops being the weakest link.