Sagar Trivedi

Posted on • Originally published at Medium

Handling Incidents Mindfully 🧘🏽 - Part 1: Acceptance

Introduction

Namaskar 🙏

Change is the only constant. This saying applies both in life and in software. Today we discuss one of the things that comes with the constant changes we deploy in software: INCIDENTS.

From 2015 to 2019, I worked on a web application that was responsible for the revenue of the entire organization. Any downtime or incident meant financial loss and damage to the brand's reputation. Every change was high stakes, with stress on each deployment. Even so, we went from deploying twice a week in 2015 to deploying 2 times or even 5 times a day. There were many incidents along the way, and I learned a lot about the culture that helped me handle them. I would like to share these learnings with you.

I am a strong supporter of meditation and have been practicing it for more than 4 years. It has drastically changed the way I work and interact with people. Each culture shift or value we will discuss is a mindful practice that I, my peers, and my (now ex) leaders used to follow. In this series of articles, we will look at these practices and at what an organization can do, in terms of its culture or values, to better handle incidents.

I am hoping that these values will help your organization irrespective of:

  • Severity of the incident
  • Size and complexity of the system
  • Organization size
  • Number/type of users

We will start with the first and the most important step: Acceptance.

Acceptance

The textbook definition of an Incident is:

An Incident is something that happens, especially something unusual or unpleasant.

In terms of software development,

An Incident is an event that disrupts the normal operation of a system (software, a website, etc.).

Many organizations are afraid of incidents and associate them with financial loss and damage to their brand among clients or customers. After a major incident, the common reaction is to add checks and reviews to the deployment process. We add sync-ups, release meetings, review boards, and so on, doing our best to make sure there are no incidents in production. While these checks and processes have merit in reducing the number of incidents, what is often overlooked is their impact on the speed at which you deploy changes. We need to know our risk appetite and decide how much red tape to put on deployments accordingly.

Aiming to reduce the number of incidents is a fair ask, and adding gates and checks to the process is fine. But adding these checks with the aim of eliminating incidents altogether is a misconception. The only way to stop incidents from occurring is to stop releasing changes.

Organizations tend to attribute an incident to a bug, a gap in process, a technical constraint, and so on. What they should be doing is building a culture of Acceptance towards incidents.

What do we mean by acceptance? It means recognizing that the real reason for any incident is that we keep releasing changes and growing at a good rate. As long as an organization is doing both, it should accept the risk that there will be incidents along the way. The first step in dealing with incidents should always be accepting and normalizing them. Decide on the pace at which you want to deploy changes, and accept the risks that come with it. Once you accept those risks, you can start looking at incidents as a part of the whole software lifecycle.

True acceptance of incidents as a part of your deployment process can only be achieved through an organization's culture. Incidents should be seen as stepping stones to improvement and as a view into how things could be better. The only thing that should matter is how we respond to an incident and how we make sure the same mistakes are not repeated.

So with this, we conclude our first part. Do let me know whether you liked it, hated it, or have a different view towards incidents.

We will go to the next part soon, with a catchy title: “Do not React, Respond !!!”

Until next time.
