Shatakshi Gupta

Beginner's guide to Chaos Engineering

Have you ever wondered what companies do when a product or service they offer to customers suffers downtime?

If not, this article should help you understand it.

So, starting from scratch: what is chaos, and why are we talking about it?

Let's suppose you build an application and deploy it on a platform that is open to the public. All of a sudden it gets noticed, and traffic climbs because people find it helpful in one way or another. You had no idea this would happen, so you never considered whether you would need to scale it.
As traffic keeps growing each day, you notice that many people are no longer able to access your application.

This is where scaling comes in, so that more people can access your application at the same time. Say you scale it up considerably, yet problems remain: some of its microservices go down, and people cannot update or make changes.

This is where testing comes in. We test while updating or scaling the application so that people do not run into difficulties when they use it later.
Testing is done to ensure our system can withstand unexpected disruptions.

To do that, we inject chaos into our applications to test for resiliency.
Chaos can be described as a technique that shows us how our application reacts when particular changes, such as faults, are introduced into the system.
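
To make the idea of injecting a fault a bit more concrete, here is a minimal sketch in Python. Everything in it is hypothetical and made up for illustration (`inject_fault`, `fetch_price`, `price_page` and the `chaos` switch are not part of any real chaos tool). It only shows the general shape: a dependency is forced to fail on purpose, and we look at how the rest of the application reacts.

```python
import functools


class InjectedFault(Exception):
    """Raised instead of running the real call when chaos is switched on."""


def inject_fault(enabled):
    """Hypothetical decorator: if enabled() is True, the wrapped call fails."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if enabled():
                raise InjectedFault(f"injected failure in {func.__name__}")
            return func(*args, **kwargs)
        return wrapper
    return decorator


# A simple on/off switch for the experiment (an assumption of this sketch).
chaos = {"on": False}


@inject_fault(lambda: chaos["on"])
def fetch_price(item):
    # Stands in for a call to another service, such as a pricing microservice.
    return {"book": 12.5}[item]


def price_page(item):
    # The part of the application whose reaction we want to observe.
    try:
        return f"{item}: {fetch_price(item)}"
    except InjectedFault:
        return f"{item}: price temporarily unavailable"


if __name__ == "__main__":
    print(price_page("book"))   # normal behaviour
    chaos["on"] = True
    print(price_page("book"))   # behaviour while the fault is injected
```

Here the page falls back to a friendly message instead of crashing, which is the kind of reaction a resiliency test is looking for.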

Resilience is the central concept in this kind of testing, because it tells us whether our application can stay afloat during downtime.
A good example is a sale on an e-commerce website, whatever the occasion. The low prices bring in a huge number of customers each day of the sale, and the site has to keep working under that load.

Chaos is applied in four steps while testing your application:
Step 1:
We observe and record how the system behaves in its normal, or steady, state.

Step 2:
We form a hypothesis about how the system will behave in a chaotic (vulnerable) state compared with the steady state.

Step 3:
We run experiments based on this hypothesis. They show us what changes the system needs in order to cope with the chaotic state.

Step 4:
We repeat this cycle until the system adapts to the experiments we run against it.
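
The four steps above can be sketched as a small simulation. This is only an illustration under made-up assumptions: the "service" is a toy function whose attempts fail at random, the steady-state metric is its success rate, and the "change the system needs" is simply adding retries. The names `call_service`, `success_rate`, `run_experiment` and `harden_until_resilient`, and the numbers used, are all hypothetical.

```python
import random


def call_service(rng, fault_rate, retries):
    """Toy service call: each attempt fails with probability fault_rate."""
    for _ in range(retries + 1):
        if rng.random() >= fault_rate:
            return True
    return False


def success_rate(fault_rate, retries, requests=1000, seed=42):
    """Fraction of simulated requests that succeed."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    ok = sum(call_service(rng, fault_rate, retries) for _ in range(requests))
    return ok / requests


def run_experiment(retries, injected_fault_rate=0.3, tolerance=0.05):
    # Step 1: observe the steady state (no fault injected).
    steady = success_rate(0.0, retries)
    # Step 2: hypothesis: with the fault injected, the success rate
    # drops by no more than `tolerance` compared with the steady state.
    # Step 3: run the experiment by injecting the fault.
    chaotic = success_rate(injected_fault_rate, retries)
    holds = (steady - chaotic) <= tolerance
    return {"steady": steady, "chaotic": chaotic, "hypothesis_holds": holds}


def harden_until_resilient(max_retries=5):
    # Step 4: repeat the cycle, changing the system (more retries here)
    # until the hypothesis holds or we run out of options.
    result = None
    for retries in range(max_retries + 1):
        result = run_experiment(retries)
        if result["hypothesis_holds"]:
            return retries, result
    return None, result


if __name__ == "__main__":
    retries, result = harden_until_resilient()
    print("retries needed:", retries)
    print(result)
```

In a real system the steady-state metric would come from monitoring (for example error rates or response times) rather than from a simulation, and the fix would depend on what the experiment reveals.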

Chaos Engineering also has some states that help us better understand the condition of our application. They are listed below:
STEADY STATE OF AN APPLICATION-
Identify the steady state, that is, how your application behaves under normal conditions.

INTRODUCTION OF FAULTS-
The next state is the deliberate introduction of a fault. This tests how our application reacts when it experiences downtime.

STEADY STATE REGAINED OR NOT-
If we see no difference after introducing a fault, our application is in a healthy, or resilient, state.

WEAKNESS FOUND-
If we observe that the application misbehaves or stops working, we have found a weakness.

FIXING & PASSING THE FAULT AGAIN-
Here we fix the weakness or vulnerability and inject the faults again to test for resiliency.

RESILIENT-
This process continues until we observe that our application has achieved resiliency and no longer ends up in a weak state, even after multiple faults are introduced.
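
The states above form a loop, which can be sketched roughly as follows. This is a toy model, not how any particular chaos tool works: `ToyApp`, `chaos_cycle` and the fault names are all invented for illustration, and "fixing" a weakness is reduced to remembering that the fault is now handled.

```python
from dataclasses import dataclass, field


@dataclass
class ToyApp:
    """Hypothetical application that survives only the faults it handles."""
    handled_faults: set = field(default_factory=set)

    def is_healthy(self, active_fault=None):
        # Healthy when no fault is active, or when the active fault is handled.
        return active_fault is None or active_fault in self.handled_faults

    def fix(self, fault):
        # Stands in for the real engineering work of fixing a weakness.
        self.handled_faults.add(fault)


def chaos_cycle(app, faults, max_rounds=10):
    """Inject faults, record weaknesses, fix them, and inject again."""
    if not app.is_healthy():
        raise RuntimeError("no steady state to start from")
    log = []
    for round_no in range(1, max_rounds + 1):
        # Introduce each fault and check whether the steady state is kept.
        weaknesses = [f for f in faults if not app.is_healthy(f)]
        log.append((round_no, weaknesses))
        if not weaknesses:
            return log  # resilient: every fault was survived
        for fault in weaknesses:
            app.fix(fault)  # fix, then the next round passes the faults again
    return log


if __name__ == "__main__":
    app = ToyApp(handled_faults={"db_down"})
    faults = ["db_down", "network_latency", "pod_kill"]
    for round_no, weaknesses in chaos_cycle(app, faults):
        print(round_no, weaknesses or "resilient")
```

The log makes the cycle visible: the first round exposes the weaknesses, and a later round with an empty list means the application kept its steady state under every fault that was tried.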

All of these states help us understand our system better and show how we should scale and test it so that people can access it easily.
Chaos Engineering is one of the most important and interesting concepts in the field of testing, because through it one can inject chaos into an application or system to test for resiliency.

Here are some resources to get you started with Chaos Engineering if you found this article interesting:

1. Quick Start-Chaos Engineering

2. Chaos Community

3. CNCF Slack for #project-chaos-mesh channel

Thank you so much for reading this article! Hope you found it interesting. Have a Great day ahead! :)
