With the cloud, we obviously can go fast, sometimes too fast. We can quickly stand things up via infrastructure automation; however, that speed often leads to supportability problems. We have too many things and keep producing more, and surprise, surprise, no one is getting any less busy. Infrastructure automation is excellent in some cases: short-lived projects, tossing up a POC, or even services that are not business-critical. However, longer-lived projects and business-critical services need infrastructure as code (IaC). The monumental difference is the two words "as code," meaning:
"All the good practices we've learned in the software world should be applied to infrastructure. Using source control, adhering to the DRY principle, modularization, maintainability, and using automated testing and deployment are all critical practices. Those of us with a deep software and infrastructure background need to empathize with and support colleagues who do not. Saying "treat infrastructure like code" isn't enough; we need to ensure the hard-won learnings from the software world are also applied consistently throughout the infrastructure realm." - ThoughtWorks
The problem I see and hear across many organizations in this space is that we are not dedicating resources to it, and the "pay now, live later" mentality that comes along with IaC often gets shut down in favor of timelines. I also see people so embedded in their current work, done their way, using their tools, that they become their own worst enemy, creating systems that only they can support. That could be from a complexity standpoint or even an access/control standpoint. As a result, they don't have the time to learn something new like IaC, and in some cases they do not want to. Simply put, we are stumbling over our existing cloud implementations and the comfort of doing things the same old way.
Back when we were ordering and racking servers in the data center, you could see workloads, security issues, and problems add up. Now, with the speed, flexibility, and ease of the cloud, these things are multiplied, and you can see that too. I think Yevgeniy Brikman, co-founder of Gruntwork, describes the problem well in the first three minutes of the story he tells at HashiConf '17. The funny thing is, he's only talking about one application on a single cloud provider, and it's overwhelming. Guess what: if you're in any organization larger than 500 people, I bet you have at least a few applications, if not hundreds, possibly spread across different cloud providers.
I have some hope, though: hope that treating infrastructure like code is a solution. With a dedicated cross-organizational team spanning development, operations, testing, and security, and focused on continuous improvement, the idea that our applications' infrastructure foundation can benefit from "the hard-won learnings from the software world" can be realized!
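To make "as code" a little more concrete, here is a minimal sketch of two of those learnings, DRY/modularization and automated testing, applied to infrastructure definitions. It is plain Python and talks to no cloud provider; `Bucket`, `make_bucket`, and `check_policy` are hypothetical names made up for illustration, not part of any real IaC tool.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Bucket:
    """Hypothetical description of a storage bucket (not a real provider API)."""
    name: str
    environment: str
    encrypted: bool = True
    tags: dict = field(default_factory=dict)


def make_bucket(app: str, environment: str) -> Bucket:
    """Reusable 'module': naming and tagging rules live in one place (DRY)."""
    return Bucket(
        name=f"{app}-{environment}-data",
        environment=environment,
        tags={"app": app, "environment": environment, "managed-by": "iac"},
    )


def check_policy(bucket: Bucket) -> list[str]:
    """Automated test: return a list of policy problems for one definition."""
    problems = []
    if not bucket.encrypted:
        problems.append(f"{bucket.name}: encryption disabled")
    if "managed-by" not in bucket.tags:
        problems.append(f"{bucket.name}: missing managed-by tag")
    return problems


# The same definition is stamped out per environment instead of copy-pasted.
ENVIRONMENTS = ("dev", "staging", "prod")
buckets = [make_bucket("billing", env) for env in ENVIRONMENTS]

# A check like this could run on every commit, before anything is deployed.
violations = [p for b in buckets for p in check_policy(b)]
```

Because the definitions are ordinary code, they can live in source control, be reviewed like any other change, and fail a build when `violations` is not empty.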
I recently started this adventure and would be interested in hearing if IaC has worked for your organization. What are some of the essential practices you've adopted from software development? I'd also be interested in learning about some of the challenges and issues (people or technology) you had to solve along the way.
Top comments (2)
If you haven't checked out Pulumi yet, I highly recommend it. Disclaimer: I work for Pulumi. But I also use Pulumi myself for many of my cloud projects.
I'll check it out. This post is not so much about the tools but more about process. That being said, the tools usually have a lot of influence on the process. Terraform has been the go-to tool for everything I've done to date, mostly because of its abstraction and simplicity; it's very friendly to non-developers. Finding people seems to be a struggle for lots of organizations, so Terraform is a lot less scary for the existing staff who don't have development skills yet.