This is a hands-on guide to GitOps and Progressive Delivery using Kubernetes and Istio.
What is GitOps?
GitOps is a way to do Continuous Delivery: it works by using Git as the source of truth for declarative infrastructure and workloads. For Kubernetes this means using git push instead of kubectl apply/delete.
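The push-to-deploy flow can be sketched with a local bare repository standing in for GitHub (the manifest and repository paths below are illustrative, not part of this workshop's setup): desired state lives in Git, and a reconciler reads it from the repository rather than receiving imperative kubectl commands.

```shell
# Minimal sketch of the GitOps flow using a local bare repo as "origin"
set -e
workdir=$(mktemp -d)
git init -q --bare "$workdir/origin.git"
git clone -q "$workdir/origin.git" "$workdir/clone"
cd "$workdir/clone"
# Declare the desired state as a manifest and push it to Git
cat > deployment.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: podinfo
spec:
  replicas: 2
EOF
git add deployment.yaml
git -c user.name=dev -c user.email=dev@example.com commit -qm "set podinfo replicas to 2"
git push -q origin HEAD
# A GitOps controller such as Flux would now pull this commit and apply it;
# here we simply read the desired state back from the "remote" repository
git --git-dir "$workdir/origin.git" show HEAD:deployment.yaml | grep replicas
```

The point of the sketch is that the cluster-facing tooling only ever consumes what is in Git, so the repository history doubles as an audit log of every change.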
In this workshop you'll be using GitHub to host the config repository and Flux as the GitOps delivery solution.
Flux is a multi-tenant Continuous Delivery solution for Kubernetes. Flux is constructed with the GitOps Toolkit, a set of composable APIs and specialized tools for keeping Kubernetes clusters in sync with sources of configuration (like Git & Helm repositories), and automating updates to configuration when there is new code to deploy.
Who is Flux for?
Flux is made for:
- cluster operators who automate the provisioning and configuration of clusters
- platform engineers who build continuous delivery for developer teams
- app developers who rely on continuous delivery to get their code live
What is Progressive Delivery?
Progressive Delivery is an umbrella term for advanced deployment patterns like canaries, feature flags and A/B testing. Progressive delivery techniques reduce the risk of introducing a new software version in production by giving app developers and SRE teams fine-grained control over the blast radius.
In this workshop you'll be using Flagger to drive Canary Releases and A/B Testing for your applications.
Flagger automates the release process using Istio routing for traffic shifting and Prometheus metrics for canary analysis. It comes with a declarative model for decoupling the deployment of apps from the release process.
You'll need a Kubernetes cluster v1.16 or newer with LoadBalancer support.
For testing purposes you can use Minikube with 2 CPUs and 4GB of memory.
Install the flux CLI with Homebrew:
brew install fluxcd/tap/flux
Binaries for macOS AMD64/ARM64, Linux AMD64/ARM and Windows are available to download on the flux2 release page.
Verify that your cluster satisfies the prerequisites with:
flux check --pre
Install jq and yq with Homebrew:
brew install jq yq
Fork the gitops-istio repository and clone it:
git clone https://github.com/<YOUR-USERNAME>/gitops-istio
cd gitops-istio
With the flux bootstrap command you can install Flux on a Kubernetes cluster and configure it to manage itself from a Git repository. If the Flux components are already present on the cluster, the bootstrap command will perform an upgrade if needed.
Bootstrap Flux by specifying your GitHub repository fork URL:
flux bootstrap git \
  --author-email=<YOUR-EMAIL> \
  --url=ssh://git@github.com/<YOUR-USERNAME>/gitops-istio \
  --branch=main \
  --path=clusters/my-cluster
The above command requires ssh-agent; if you're using Windows, see the flux bootstrap github documentation.
At bootstrap, Flux generates an SSH key and prints the public key. In order to sync your cluster state with Git, you need to copy the public key and create a deploy key with write access on your GitHub repository. On GitHub, go to Settings > Deploy keys, click on Add deploy key, check Allow write access, paste the Flux public key and click Add key.
When Flux has access to your repository it will do the following:
- installs the Istio operator
- waits for the Istio control plane to be ready
- installs Flagger, Prometheus and Grafana
- creates the Istio public gateway
- creates the prod namespace
- installs the demo apps
When bootstrapping a cluster with Istio, it's important to define the install order. For the application pods to be injected with Istio sidecar, the Istio control plane must be up and running before the apps.
With Flux v2 you can specify the execution order by defining dependencies between objects. For example, in clusters/my-cluster/apps.yaml we tell Flux that the apps reconciliation depends on the istio-system one:
apiVersion: kustomize.toolkit.fluxcd.io/v1beta1
kind: Kustomization
metadata:
  name: apps
  namespace: flux-system
spec:
  interval: 30m0s
  dependsOn:
    - name: istio-system
  sourceRef:
    kind: GitRepository
    name: flux-system
  path: ./apps
Watch Flux installing Istio first, then the demo apps:
watch flux get kustomizations
You can tail the Flux reconciliation logs with:
flux logs --all-namespaces --follow --tail=10
You can customize the Istio installation with the IstioOperator resource located at istio/system/profile.yaml:
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
metadata:
  name: istio-default
  namespace: istio-system
spec:
  profile: demo
  components:
    pilot:
      k8s:
        resources:
          requests:
            cpu: 10m
            memory: 100Mi
After modifying the Istio settings, you can push the change to git and Flux will apply it on the cluster. The Istio operator will reconfigure the Istio control plane according to your changes.
When a new Istio version is available, the update-istio GitHub workflow will open a pull request with the manifest updates needed for upgrading the Istio Operator. The new Istio version is tested on Kubernetes Kind by the e2e workflow, and when the PR is merged into the main branch, Flux will upgrade Istio on your cluster.
When Flux syncs the Git repository with your cluster, it creates the frontend/backend deployment, HPA and a canary object. Flagger uses the canary definition to create a series of objects: Kubernetes deployments, ClusterIP services, Istio destination rules and virtual services. These objects expose the application on the mesh and drive the canary analysis and promotion.
# applied by Flux
deployment.apps/frontend
horizontalpodautoscaler.autoscaling/frontend
canary.flagger.app/frontend

# generated by Flagger
deployment.apps/frontend-primary
horizontalpodautoscaler.autoscaling/frontend-primary
service/frontend
service/frontend-canary
service/frontend-primary
destinationrule.networking.istio.io/frontend-canary
destinationrule.networking.istio.io/frontend-primary
virtualservice.networking.istio.io/frontend
Check if Flagger has successfully initialized the canaries:
kubectl -n prod get canaries

NAME       STATUS        WEIGHT
backend    Initialized   0
frontend   Initialized   0
When the frontend-primary deployment comes online, Flagger will route all traffic to the primary pods and scale the frontend deployment to zero.
Find the Istio ingress gateway address with:
kubectl -n istio-system get svc istio-ingressgateway -ojson | jq .status.loadBalancer.ingress
Open a browser and navigate to the ingress address; you'll see the frontend UI.
Flagger implements a control loop that gradually shifts traffic to the canary while measuring key performance indicators like the HTTP request success rate, average request duration and pod health. Based on the KPI analysis, a canary is promoted or aborted, and the analysis result is published to Slack.
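The control loop can be sketched in a few lines of shell. This is an illustration of the idea, not Flagger's actual implementation: traffic shifts in 5% steps, a KPI check runs at each step, and the rollout halts once a failure threshold is reached. The error-rate samples below are hard-coded for the demo.

```shell
# Simulated promotion loop: advance weight while the KPI check passes
threshold=2   # max failed checks before rollback
failed=0
weight=0
for error_rate in 0.2 0.4 0.3 0.5 0.1 0.2 0.3 0.4 0.2 0.1; do
  weight=$((weight + 5))
  # KPI check: error rate must stay at or below 1%
  if awk -v e="$error_rate" 'BEGIN { exit !(e > 1) }'; then
    failed=$((failed + 1))
    echo "Halt advancement: error-rate $error_rate > 1"
  else
    echo "Advance canary weight $weight"
  fi
  if [ "$failed" -ge "$threshold" ]; then
    echo "Rolling back: failed checks threshold reached $threshold"
    break
  fi
done
[ "$failed" -lt "$threshold" ] && echo "Promotion completed!"
```

With all sampled error rates under the threshold, the loop advances to weight 50 and prints "Promotion completed!"; replace one sample with a value above 1 a couple of times to see the rollback branch instead.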
A canary analysis is triggered by changes in any of the following objects:
- Deployment PodSpec (container image, command, ports, env, etc)
- ConfigMaps and Secrets mounted as volumes or mapped to environment variables
For workloads that are not receiving constant traffic, Flagger can be configured with a webhook that, when called, starts a load test for the target workload. The canary configuration can be found at apps/backend/canary.yaml.
Pull the changes from GitHub:
git pull origin main
To trigger a canary deployment for the backend app, bump the container image:
yq e '.images.newTag="5.0.1"' -i ./apps/backend/kustomization.yaml
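What the yq command above does can be shown on a stand-in kustomization.yaml, using sed so the sketch runs even without yq installed (the image name below is illustrative, not necessarily the one used in this repository):

```shell
# Stand-in kustomization.yaml with the old image tag
cat > kustomization.yaml <<'EOF'
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
images:
  - name: ghcr.io/stefanprodan/podinfo
    newTag: 5.0.0
EOF
# Bump the tag in place, as the yq command does on the real file
sed -i.bak 's/newTag: .*/newTag: 5.0.1/' kustomization.yaml
grep newTag kustomization.yaml
```

Committing this one-line change is the entire deployment trigger: Flux applies the new kustomization, the Deployment's pod template changes, and Flagger picks up the new revision.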
Commit and push changes:
git add -A && \
git commit -m "backend 5.0.1" && \
git push origin main
Note that Flux can update the container image tag in an automated manner, for more details see Automate image updates to Git with Flux v2.
Tell Flux to pull the changes or wait one minute for Flux to detect the changes on its own:
flux reconcile source git flux-system
Watch Flux reconciling your cluster to the latest commit:
watch flux get kustomizations
After a couple of seconds, Flagger detects that the deployment revision changed and starts a new rollout:
$ kubectl -n prod describe canary backend

Events:
  New revision detected! Scaling up backend.prod
  Starting canary analysis for backend.prod
  Pre-rollout check conformance-test passed
  Advance backend.prod canary weight 5
  ...
  Advance backend.prod canary weight 50
  Copying backend.prod template spec to backend-primary.prod
  Promotion completed! Scaling down backend.prod
During the analysis the canary’s progress can be monitored with Grafana. You can access the dashboard using port forwarding:
kubectl -n istio-system port-forward svc/flagger-grafana 3000:80
With the port forward in place, the Grafana dashboards are available at http://localhost:3000.
Note that if new changes are applied to the deployment during the canary analysis, Flagger will restart the analysis phase.
Besides weighted routing, Flagger can be configured to route traffic to the canary based on HTTP match conditions.
In an A/B testing scenario, you'll be using HTTP headers or cookies to target a certain segment of your users.
This is particularly useful for frontend applications that require session affinity.
You can enable A/B testing by specifying the HTTP match conditions and the number of iterations:
analysis:
  # schedule interval (default 60s)
  interval: 10s
  # max number of failed metric checks before rollback
  threshold: 10
  # total number of iterations
  iterations: 12
  # canary match condition
  match:
    - headers:
        user-agent:
          regex: ".*Firefox.*"
    - headers:
        cookie:
          regex: "^(.*?;)?(type=insider)(;.*)?$"
The above configuration will run an analysis for two minutes, targeting Firefox users and those who have an insider cookie. The frontend configuration can be found at apps/frontend/canary.yaml.
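The cookie match condition can be tried out locally with grep. Note one assumption: the canary definition uses a PCRE lazy quantifier (`.*?`), which POSIX ERE doesn't support, so the sketch below uses the greedy equivalent, which behaves the same for this pattern.

```shell
# Decide, from a Cookie header value, whether a request would hit the canary
match_cookie() {
  printf '%s\n' "$1" | grep -Eq '^(.*;)?type=insider(;.*)?$' \
    && echo "route to canary: $1" \
    || echo "route to primary: $1"
}
match_cookie "type=insider"
match_cookie "session=abc;type=insider"
match_cookie "session=abc"
```

The first two calls match (the cookie appears on its own or after another cookie) and would be routed to the canary; the last one falls through to the primary.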
Trigger a deployment by updating the frontend container image:
yq e '.images.newTag="5.0.1"' -i ./apps/frontend/kustomization.yaml

git add -A && \
git commit -m "frontend 5.0.1" && \
git push origin main

flux reconcile source git flux-system
Flagger detects that the deployment revision changed and starts the A/B testing:
$ kubectl -n istio-system logs deploy/flagger -f | jq .msg

New revision detected! Scaling up frontend.prod
Waiting for frontend.prod rollout to finish: 0 of 1 updated replicas are available
Pre-rollout check conformance-test passed
Advance frontend.prod canary iteration 1/10
...
Advance frontend.prod canary iteration 10/10
Copying frontend.prod template spec to frontend-primary.prod
Waiting for frontend-primary.prod rollout to finish: 1 of 2 updated replicas are available
Promotion completed! Scaling down frontend.prod
You can monitor all canaries with:
$ watch kubectl get canaries --all-namespaces

NAMESPACE   NAME       STATUS        WEIGHT
prod        frontend   Progressing   100
prod        backend    Succeeded     0
Flagger makes use of the metrics provided by Istio telemetry to validate the canary workload. The frontend app analysis defines two metric checks:
metrics:
  - name: error-rate
    templateRef:
      name: error-rate
      namespace: istio-system
    thresholdRange:
      max: 1
    interval: 30s
  - name: latency
    templateRef:
      name: latency
      namespace: istio-system
    thresholdRange:
      max: 500
    interval: 30s
The Prometheus queries used for checking the error rate and latency are located at flagger-metrics.yaml.
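The thresholdRange semantics can be sketched as a simple comparison: the value returned by the Prometheus query is checked against max, and values above it count as failed checks (the function name and sample values below are illustrative).

```shell
# Compare an observed metric value against a thresholdRange max
check_metric() {
  # $1 = metric name, $2 = observed value, $3 = thresholdRange max
  if awk -v v="$2" -v m="$3" 'BEGIN { exit !(v <= m) }'; then
    echo "$1 check passed ($2 <= $3)"
  else
    echo "$1 check failed ($2 > $3)"
  fi
}
check_metric error-rate 0.4 1
check_metric latency 620 500
```

In the frontend analysis above, an error rate above 1% or a latency above 500ms would each register as a failed check; once the failed-check count reaches the analysis threshold, the rollout is aborted.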
During the canary analysis you can generate HTTP 500 errors and high latency to test Flagger's rollback.
Generate HTTP 500 errors:
watch curl -b 'type=insider' http://<INGRESS-IP>/status/500
Generate high latency:
watch curl -b 'type=insider' http://<INGRESS-IP>/delay/1
When the number of failed checks reaches the canary analysis threshold, the traffic is routed back to the primary, the canary is scaled to zero and the rollout is marked as failed.
$ kubectl -n istio-system logs deploy/flagger -f | jq .msg

New revision detected! Scaling up frontend.prod
Pre-rollout check conformance-test passed
Advance frontend.prod canary iteration 1/10
Halt frontend.prod advancement error-rate 31 > 1
Halt frontend.prod advancement latency 2000 > 500
...
Rolling back frontend.prod failed checks threshold reached 10
Canary failed! Scaling down frontend.prod
To configure canary analysis alerting for Slack, MS Teams, Discord or Rocket, see the docs.
If you have any questions about progressive delivery:
- Invite yourself to the CNCF community slack and join the #flux and #flagger channels.
- Check out the Flux talks section to see a list of online talks, hands-on training sessions and meetups.
Your feedback is always welcome!