The complexity of modern DevOps environments requires tooling that gives engineering teams maximum visibility into the health and behavior of their workloads. Having an acute understanding of the "moving parts" of an application means issues are resolved faster. Yet outages and unexpected behavior are often the result of internal change, such as code commits, and engineering organizations struggle to understand and account for the sum total of change in their environments.
Using commit-based visibility means that each unit of change (a commit) can be identified and tracked across its entire lifecycle, as well as across multiple environments. Being able to pinpoint individual commits across an entire engineering organization has the potential to provide meaningful gains in critical metrics like Mean Time To Resolution (MTTR). Organizations that can drive down MTTR will improve the ever-crucial customer experience.
Implementing commit-based visibility means first understanding the underlying instrument of change: the commit. Afterwards, understanding the lifecycle of a commit and how it's deployed will reveal an opportunity to improve visibility even further.
A commit in this context is a change to the source code that is sent to a Version Control System (VCS) repository, where it is hashed and stored as part of a continuous log of changes. In the course of a single day, development teams may push thousands of individual commits to multiple repositories.
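To make the "hashed and stored" step concrete, here is a minimal sketch that assumes the VCS is Git using its default SHA-1 object format (the article does not name a specific VCS, and newer Git repositories can opt into SHA-256). The helper name `git_object_id` and the sample commit body are illustrative only.

```python
import hashlib


def git_object_id(body: bytes, obj_type: str = "commit") -> str:
    """Return the SHA-1 object ID Git would assign to `body`."""
    # Git prefixes every object with "<type> <size in bytes>" and a NUL byte
    # before hashing, so the ID depends on both the type and the content.
    header = f"{obj_type} {len(body)}\0".encode("ascii")
    return hashlib.sha1(header + body).hexdigest()


# Illustrative commit body; author, timestamps, and message are hypothetical.
example_commit = (
    b"tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904\n"
    b"author Dev One <dev@example.com> 1700000000 +0000\n"
    b"committer Dev One <dev@example.com> 1700000000 +0000\n"
    b"\n"
    b"Raise connection pool size\n"
)

original_id = git_object_id(example_commit)
# Changing a single word of the message yields a completely different ID.
edited_id = git_object_id(example_commit.replace(b"Raise", b"Lower"))
```

Because the ID is derived from the content itself, the same commit carries the same identifier in every repository clone and pipeline, which is what makes it usable as a tracking key.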
Any one of these commits, at a basic level, represents a net change to the destination environment. The change could be part of application source code, or it could be configuration and orchestration code for infrastructure. The key point is that any change has the potential to bring about unexpected behavior, including an outage or interruption in service. In the past, operations teams were often forced to spend critical minutes tracking down the at-fault change through a variety of version control systems and deployment automation.
A single commit contains only a limited amount of metadata about itself, which has limited usefulness in understanding whether it could be implicated in issues with a wider scope. While most commits make it possible to identify the original author, they provide very little actual information around the context of the change, and under what circumstances it was requested. Sparse commit messages can result in an even greater mystery around the actual intent of a given commit.
Implementing a system with Continuous Integration/Continuous Delivery (CI/CD) provides a wealth of features to test and integrate code before it is deployed to critical production workloads, helping teams avoid issues resulting from untested code changes. Successful implementation of a DevOps initiative typically depends on a well-functioning CI/CD infrastructure.
CI/CD systems are a critical pillar in DevOps. They allow individual commits to be tested and deployed in an appropriate level of isolation. CI/CD pipelines can be modeled to fit a variety of development and deployment scenarios, and provide fast feedback on interoperability and overall code quality. Earlier stages of a pipeline typically focus on smaller scale unit tests, with some checks occurring on the local development machine before code is pushed into the pipeline.
The example pipeline is from Jenkins, a widely used CI/CD platform. In this instance, several stages and pipeline components have been configured, weaving together different testing and isolated deployment steps into a cohesive suite of release automation. Individual commits make their way through each appropriate stage, providing commit-based visibility throughout the pipeline. Developers and other engineering teams can receive immediate, visual feedback on the health and state of a commit at any part of the pipeline.
Unfortunately, a single pipeline or CI/CD instance may only represent an individual team or workgroup, or may only cover a specific application or production service. Even within a single application stack, infrastructure and application code may be deployed through separate CI/CD pipelines. In a distributed system, an outage may have no immediately obvious cause when a variety of changes are flowing into the environment simultaneously. Multiple engineering teams may maintain their own decentralized CI/CD infrastructure, making holistic visibility across all pipelines and platforms difficult to achieve.
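One way to picture holistic visibility is a single list of deployment events collected from every pipeline and keyed by commit. The sketch below is hypothetical: the `DeployEvent` fields, the `suspects` helper, the one-hour lookback, and the sample data are all assumptions for illustration, not the schema of any particular product.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class DeployEvent:
    commit_sha: str       # commit identifier reported by the pipeline
    pipeline: str         # which CI/CD pipeline performed the deployment
    environment: str      # e.g. "staging" or "production"
    deployed_at: datetime


def suspects(events, environment, outage_start, lookback=timedelta(hours=1)):
    """Return deployments to `environment` in the window before the outage,
    most recent first, regardless of which pipeline shipped them."""
    window_start = outage_start - lookback
    hits = [
        e for e in events
        if e.environment == environment
        and window_start <= e.deployed_at <= outage_start
    ]
    return sorted(hits, key=lambda e: e.deployed_at, reverse=True)


# Hypothetical events gathered from two separate pipelines.
events = [
    DeployEvent("a1b2c3d", "app-pipeline", "production", datetime(2024, 1, 10, 9, 15)),
    DeployEvent("e4f5a6b", "infra-pipeline", "production", datetime(2024, 1, 10, 9, 50)),
    DeployEvent("c7d8e9f", "app-pipeline", "staging", datetime(2024, 1, 10, 9, 55)),
    DeployEvent("0a1b2c3", "app-pipeline", "production", datetime(2024, 1, 10, 7, 30)),
]

recent = suspects(events, "production", outage_start=datetime(2024, 1, 10, 10, 0))
```

Here `recent` holds the infrastructure change followed by the application change, even though they came from different pipelines; the staging deployment and the older production deployment are filtered out.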
Fully functional CI/CD infrastructure delivers major gains over earlier approaches to software development and deployment. However, modern distributed systems typically require multiple CI/CD instances to cover every corner of the stack. In those types of environments, ReleaseIQ tooling can tie disparate CI/CD infrastructure into a single, unified "pane of glass" view.
When a breaking change is introduced to an environment, MTTR is crucial: it's very likely that customer experience is at stake. Long interruptions to critical services may leave customers dissatisfied and heading for the exit. With a single source of truth for identifying and remediating breaking changes, operations and development teams can work in close concert to quickly resolve issues resulting from malformed changes.
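For readers who want to track the metric themselves, MTTR is commonly computed as the average time between detecting an incident and resolving it. The sketch below assumes that definition; the function name and the sample incident timestamps are hypothetical.

```python
from datetime import datetime, timedelta


def mean_time_to_resolution(incidents):
    """Average (resolved - detected) over a list of (detected, resolved) pairs."""
    durations = [resolved - detected for detected, resolved in incidents]
    if not durations:
        raise ValueError("at least one incident is required")
    # Summing timedeltas needs an explicit timedelta() start value.
    return sum(durations, timedelta()) / len(durations)


# Hypothetical incidents: one resolved in 30 minutes, one in 90 minutes.
incidents = [
    (datetime(2024, 3, 1, 10, 0), datetime(2024, 3, 1, 10, 30)),
    (datetime(2024, 3, 8, 14, 0), datetime(2024, 3, 8, 15, 30)),
]

mttr = mean_time_to_resolution(incidents)
```

With these two sample incidents, `mttr` works out to one hour. Shortening the time spent locating the at-fault commit reduces each individual duration and therefore the average.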
One of the main objectives of DevOps is to remove the logical barriers that have traditionally existed between development and operations teams. In legacy environments, development teams often kicked their completed code "over the wall", leaving operations teams scrambling to integrate and test new features, often incurring long hours debugging unexpected behavior and performance issues. CI/CD helped bring down that wall to some degree, but offering commit-based visibility across the entire pipeline environment provides a new level of shared ownership between operations and development.
Being able to identify broken changes across engineering organizations can also provide longer-term statistical data for identifying trends. If certain teams or application stacks are more prone to introducing negative change into the environment, additional engineering resources can be directed there to improve testing and reliability. Without this level of holistic visibility, it will be much harder for leadership to know where to focus a limited pool of resources. With commit-based visibility, problem areas surface themselves in an easy-to-understand system of visualization and metrics.
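As a rough illustration of the kind of trend data this enables, the sketch below computes the share of changes per team that were later linked to an incident. The function name, the input shape, and the team names are assumptions made for the example.

```python
from collections import Counter


def change_failure_rate(changes):
    """Given (team, caused_incident) pairs, return the failure rate per team."""
    total = Counter()
    failed = Counter()
    for team, caused_incident in changes:
        total[team] += 1
        if caused_incident:
            failed[team] += 1
    # Every team in `total` has at least one change, so no division by zero.
    return {team: failed[team] / total[team] for team in total}


# Hypothetical change history: True marks a change linked to an incident.
changes = [
    ("payments", False),
    ("payments", True),
    ("payments", False),
    ("payments", False),
    ("search", True),
    ("search", False),
]

rates = change_failure_rate(changes)
```

In this sample, one of four payments changes and one of two search changes were linked to incidents, so the rates are 0.25 and 0.5. A rate that stays high over time is the signal that a team or stack may need more testing investment.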
With commit-based visibility, stakeholders and engineering teams alike have a clearer view of completed, ongoing, and planned changes. When a change-induced outage occurs, identifying and rolling back the "at-fault" change as quickly as possible is key for customer experience. As with any new initiative, getting buy-in across technical and non-technical stakeholders is critical: engineering teams should be able to demonstrate clear business value when justifying additional resource commitments. With commit-based visibility, there is a clear line of value drawn from reduced outage minutes to customer experience to greater realized value.
Over time, improved commit-based visibility will improve overall code quality and reduce operational firefighting, resulting in happier and more productive engineering teams. Another benefit is increased release velocity. Organizations that stay successful and profitable over the long term focus on delivering more features to customers faster, more safely, and with greater stability.