In today's complex digital landscape, application resilience is crucial. Chaos engineering, using tools like LitmusChaos, intentionally introduces faults to uncover systemic issues that traditional tests miss. LitmusChaos now also works with k6, a load-testing tool. This post explores chaos engineering with LitmusChaos and demonstrates a k6 load chaos experiment.
Table of Contents
- What are chaos engineering and LitmusChaos
- Introduction to k6
- Injecting k6 load chaos with LitmusChaos (demo)
What are chaos engineering and LitmusChaos
What if our systems suddenly experience an outage? Pinpointing the problem is difficult, especially when our systems run on Kubernetes as a set of microservices. While unit and integration tests can detect weaknesses in our application, they cannot detect weaknesses in the overall platform.
The diagram above shows the impact of resilience. Chaos engineering can be a great way to achieve more than 90% resilience. Chaos Engineering is the discipline of experimenting on a system in order to build confidence in the system’s capability to withstand turbulent conditions in production [1]. In practice, this means intentionally injecting faults into a system to test its resilience. LitmusChaos makes this process easier to implement.
LitmusChaos (a CNCF incubating project) is a cloud-native chaos engineering framework with cross-cloud support. You can use Litmus to inject controlled chaos and run chaos experiments in staging and production environments, allowing SREs to identify bugs and vulnerabilities. If you want to know more about LitmusChaos, check out the documentation!
Introduction to k6
k6 is an open source load testing tool designed for developers. It lets teams write tests as code, integrate performance tests into the software development lifecycle, and test, analyze, and fix application performance issues. k6 supports various types of load testing (e.g., spike, smoke, and stress tests).
LitmusChaos now supports k6 load testing as a chaos fault, so we can generate load against the target application as part of chaos testing on Kubernetes.
To learn more, check out this documentation. You can also find our integration in the k6 documentation 🚀
Injecting k6 load chaos with LitmusChaos (demo)
Let us run the k6-loadgen chaos experiment. For simplicity, we will inject chaos into the OpenTelemetry Demo.
Prerequisites
- Docker and minikube (skip this if you use your own Kubernetes cluster)
- Check out the otel-demo's Prerequisites
- If you haven't installed LitmusChaos yet, check out this documentation.
Install the OpenTelemetry Demo
After installing minikube and LitmusChaos, we install the OpenTelemetry Demo. All you have to do is run the commands below.
helm repo add open-telemetry https://open-telemetry.github.io/opentelemetry-helm-charts
helm install my-otel-demo open-telemetry/opentelemetry-demo
If you want to customize the deployment settings, check out this documentation.
We cannot access the otel-demo service directly, so we use a minikube command to get an external URL:
minikube service my-otel-demo-frontendproxy --url
The command prints a URL. Open it in a browser to access the frontend service.
You can access Grafana using the given URL:
http://<<given_url>>/grafana
We will use Home > Dashboards > Demo Dashboard today.
Set Up Probes
You can easily create a probe by following this documentation. Enter values like the ones below.
// URL: http://my-otel-demo-frontendproxy.default.svc.cluster.local:8080
// METHOD: GET
// CRITERIA: ==
// Response Code: 200
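To make the pass condition concrete, here is a minimal sketch of the comparison these values describe: the probe sends a GET to the URL and passes when the response code satisfies the criteria against the expected code. This is not LitmusChaos's implementation; `probePasses` and the small operator table are hypothetical names used only for illustration.

```javascript
// Hypothetical illustration of an HTTP probe's pass condition.
// Not LitmusChaos code: the function name and operator table are assumptions.
const comparators = {
  '==': (actual, expected) => actual === expected,
  '!=': (actual, expected) => actual !== expected,
};

function probePasses(responseCode, criteria, expectedCode) {
  const compare = comparators[criteria];
  if (!compare) {
    throw new Error(`unsupported criteria: ${criteria}`);
  }
  return compare(responseCode, expectedCode);
}

console.log(probePasses(200, '==', 200)); // true
console.log(probePasses(503, '==', 200)); // false
```

With the values above, the probe fails as soon as the frontend proxy answers with anything other than 200 during the experiment.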
Writing a k6 script
You don't have to install the k6 engine! We just have to write a k6 script. We will use the code below and save it as script.js.
import http from 'k6/http';
import { sleep } from 'k6';

// 1000 virtual users (VUs) run the default function in a loop for 30 seconds.
export const options = {
  vus: 1000,
  duration: '30s',
};

// Each iteration sends one GET to the frontend proxy, then pauses for 1 second.
export default function () {
  http.get('http://my-otel-demo-frontendproxy.default.svc.cluster.local:8080');
  sleep(1);
}
If you'd like to write a more advanced script, you can check out this documentation.
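As one step in that direction, the fixed `vus` and `duration` pair can be replaced with k6's `stages` option, which ramps virtual users up and down, and `thresholds`, which mark the run as failed when a limit is exceeded. The sketch below builds such an options object. `buildRampOptions`, the 10-second ramps, and the threshold limits are illustrative assumptions, not values from this demo.

```javascript
// Hypothetical helper that builds a k6 options object with ramping stages.
// The ramp length and threshold limits are illustrative assumptions.
function buildRampOptions(peakVus, holdSeconds) {
  if (!Number.isInteger(peakVus) || peakVus <= 0) {
    throw new RangeError('peakVus must be a positive integer');
  }
  return {
    stages: [
      { duration: '10s', target: peakVus },             // ramp up to the peak
      { duration: `${holdSeconds}s`, target: peakVus }, // hold at the peak
      { duration: '10s', target: 0 },                   // ramp back down
    ],
    thresholds: {
      http_req_failed: ['rate<0.01'],   // fewer than 1% of requests may fail
      http_req_duration: ['p(95)<500'], // 95% of requests must finish under 500 ms
    },
  };
}

console.log(JSON.stringify(buildRampOptions(1000, 30), null, 2));
```

In script.js, the result would take the place of the literal options object, for example `export const options = buildRampOptions(1000, 30);`.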
Next, let's create a secret from the script:
kubectl create secret generic k6-script \
--from-file=<<script-path>>/script.js -n <<chaos_infrastructure_namespace>>
Now that all the preliminary work is done, let's create a chaos experiment.
Let's make a chaos experiment
Click Chaos Experiments > + New Experiment to create a new chaos experiment.
After clicking the Blank Canvas and Add buttons, we can choose a chaos fault from ChaosHub. We use k6-loadgen.
In the Tune Fault tab, we enter the name and key of the secret we created earlier.
Lastly, we add the probe that we created earlier.
Okay, all done! Let's execute a chaos experiment 🚀
A few minutes later, our chaos experiment succeeded. Congratulations 🎉
We can see the load testing result on the Grafana dashboard. Go to Dashboards > Demo Dashboard > Requests Rate for frontend by span name > frontend-proxy.
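As a rough sanity check when reading the panel: each virtual user in script.js sends one request, waits for the response, and then sleeps for one second, so the request rate is bounded by the number of VUs divided by the time one iteration takes. The sketch below does that arithmetic. `estimateRequestRate` is a hypothetical helper, and the 0.25 s response time is an assumed value, not a measurement from this demo.

```javascript
// Rough upper bound on requests per second for a k6 script in which
// each VU makes one request and then sleeps before the next iteration.
// The function name and the latency used below are illustrative assumptions.
function estimateRequestRate(vus, sleepSeconds, avgLatencySeconds) {
  const iterationSeconds = sleepSeconds + avgLatencySeconds;
  if (iterationSeconds <= 0) {
    throw new RangeError('iteration time must be positive');
  }
  return vus / iterationSeconds;
}

// 1000 VUs, sleep(1), assumed 0.25 s average response time
console.log(estimateRequestRate(1000, 1, 0.25)); // 800
```

The observed rate can be lower than this bound if the target application or the load generator saturates.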
Summary
The k6-loadgen fault simulates load generation on the target hosts for a specified chaos duration. You can get the most out of chaos engineering by designing experiments with k6-loadgen just as you would with other chaos faults in LitmusChaos.
If you are interested in LitmusChaos, join the community on GitHub and Slack!
Thank you for reading 🙏
Namkyu Park
Maintainer of LitmusChaos
Top comments (4)
Hi, thanks for sharing!
However, I have a problem with the Grafana dashboard... nothing is displayed. I think it is related to the Target Application section in which you said "we will fix it later", but no fix was written :)
Hi Kreeve, the target application (the demo), including the dashboard, is created by the OpenTelemetry community. So, can you please check all the OpenTelemetry demo components again? This includes Grafana as well. I recommend proceeding with our tutorial once the demo is up and running.
Hi Namkyu! I probably misspoke. All my pods are up and running and I can reach both the demo app and Grafana. I have the problem after running the experiment in Litmus (which ends successfully) but the Grafana “Requests Rate for frontend by span name” panel shows no data. I think the problem is in the “Target Application” tab of the k6 configuration (in Litmus). I assume so since it was said in the tutorial “we will fix it later,” but there was no further mention of that tab. Of course this is just an indication, I'm not sure I've found the problem.
Thank you :)
Hi Kreeve, The "Target Application" tab does not affect our experiments. I need to check some things, so can you come to our Slack channel #litmus and send a message to me? I can help you :)
kubernetes.slack.com/?redir=%2Farc...