Anthos Service Mesh, a managed Istio ⛵️
Istio is one of the most advanced pieces of software in the Kubernetes ecosystem. It lets us redefine the way our services communicate with each other, without being invasive. Istio works by taking control of the network of your Kubernetes cluster and lets you apply configuration to it (through YAML). If you want to discover Istio, I invite you to read the excellent documentation available at istio.io.
The main problem with Istio is how complex it is to manage and configure. Like Kubernetes, this system is complex, and errors could lead to downtime in your cluster… 😓. Many software companies have started to provide a pre-configured version of Istio, and here we will talk about Anthos Service Mesh.
Anthos Service Mesh is available on Anthos clusters running in Google Cloud, AWS or on-premises (different features are available depending on where the cluster runs). Here, we will describe the Google Cloud version, based on Anthos Service Mesh version 1.9.3-asm.2 (the latest version may be different by the time you read this article).
NOTE: A fully managed version of Anthos Service Mesh exists but is currently in preview/beta. For now, I prefer using the standard version of Anthos Service Mesh (aka Customer-managed control plane).
The installation is pretty straightforward: Google provides a script, install_asm, to automate the installation on an already existing GKE cluster:
./install_asm \
  --project_id kevin-anthos-asm \
  --cluster_name anthos-asm-demo \
  --cluster_location europe-north1-a \
  --mode install \
  --output_dir ./asm-downloads \
  --enable_all
NOTE: I am installing the latest version of ASM here, but you can choose a different one with the --revision_name parameter if required.
The script has some requirements before it can be executed. I invite you to check them here. The best solution is to run the installation from Google Cloud Shell, since it fulfills the requirements by default.
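If you run the script from your own machine instead, a quick preflight check can report which command-line tools are missing. This is only a sketch: missing_tools is a hypothetical helper name, and the tool list in the usage comment (gcloud, kubectl, curl, nc) is an assumption inferred from the installation logs, not the official requirements list.

```shell
# Hypothetical helper: print every command from the argument list that is
# not available on the PATH (prints nothing when all of them are found).
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Usage (the tool list is an assumption, not the official requirements):
#   missing_tools gcloud kubectl curl nc
```

An empty output means every listed tool was found. The official requirements page remains the reference for the complete list.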
install_asm: Setting up necessary files...
install_asm: Fetching/writing GCP credentials to kubeconfig file...
install_asm: [WARNING]: nc not found, skipping k8s connection verification
install_asm: [WARNING]: (Installation will continue normally.)
install_asm: Checking installation tool dependencies...
install_asm: Getting account information...
install_asm: Confirming cluster information for kevin-anthos-asm/europe-north1-a/anthos-asm-demo...
install_asm: Downloading ASM..
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 41.7M  100 41.7M    0     0  31.5M      0  0:00:01  0:00:01 --:--:-- 31.5M
install_asm: Downloading ASM kpt package...
fetching package "/asm" from "https://github.com/GoogleCloudPlatform/anthos-service-mesh-packages" to "asm"
install_asm: Confirming node pool requirements for kevin-anthos-asm/europe-north1-a/anthos-asm-demo...
install_asm: Checking Istio installations...
install_asm: Enabling required APIs...
install_asm: Binding user:firstname.lastname@example.org to required IAM roles...
install_asm: Checking for project kevin-anthos-asm...
install_asm: Reading labels for europe-north1-a/anthos-asm-demo...
install_asm: Adding labels to europe-north1-a/anthos-asm-demo...
install_asm: Enabling Workload Identity on europe-north1-a/anthos-asm-demo...
install_asm: (This could take awhile, up to 10 minutes)
install_asm: Initializing meshconfig API...
install_asm: Enabling Stackdriver on europe-north1-a/anthos-asm-demo...
install_asm: Querying for core/account...
install_asm: Binding email@example.com to cluster admin role...
clusterrolebinding.rbac.authorization.k8s.io/kevin.davin-cluster-admin-binding created
install_asm: Creating istio-system namespace...
namespace/istio-system created
install_asm: Configuring kpt package...
asm/ set 22 field(s) of setter "gcloud.container.cluster" to value "anthos-asm-demo"
asm/ set 40 field(s) of setter "gcloud.core.project" to value "kevin-anthos-asm"
asm/ set 2 field(s) of setter "gcloud.project.projectNumber" to value "62405001080"
asm/ set 6 field(s) of setter "gcloud.project.environProjectNumber" to value "62405001080"
asm/ set 21 field(s) of setter "gcloud.compute.location" to value "europe-north1-a"
asm/ set 2 field(s) of setter "gcloud.compute.network" to value "kevin-anthos-asm-default"
asm/ set 6 field(s) of setter "anthos.servicemesh.rev" to value "asm-193-2"
asm/ set 2 field(s) of setter "anthos.servicemesh.tag" to value "1.9.3-asm.2"
install_asm: Installing validation webhook fix...
service/istiod created
install_asm: Installing ASM control plane...
install_asm: ...done!
install_asm: Installing ASM CanonicalService controller in asm-system namespace...
namespace/asm-system created
customresourcedefinition.apiextensions.k8s.io/canonicalservices.anthos.cloud.google.com created
role.rbac.authorization.k8s.io/canonical-service-leader-election-role created
clusterrole.rbac.authorization.k8s.io/canonical-service-manager-role created
clusterrole.rbac.authorization.k8s.io/canonical-service-metrics-reader created
serviceaccount/canonical-service-account created
rolebinding.rbac.authorization.k8s.io/canonical-service-leader-election-rolebinding created
clusterrolebinding.rbac.authorization.k8s.io/canonical-service-manager-rolebinding created
clusterrolebinding.rbac.authorization.k8s.io/canonical-service-proxy-rolebinding created
service/canonical-service-controller-manager-metrics-service created
deployment.apps/canonical-service-controller-manager created
install_asm: Waiting for deployment...
deployment.apps/canonical-service-controller-manager condition met
install_asm: ...done!
install_asm:
install_asm: *****************************
client version: 1.9.3-asm.2
control plane version: 1.9.3-asm.2
data plane version: 1.9.3-asm.2 (2 proxies)
install_asm: *****************************
install_asm: The ASM control plane installation is now complete.
install_asm: To enable automatic sidecar injection on a namespace, you can use the following command:
install_asm: kubectl label namespace <NAMESPACE> istio-injection- istio.io/rev=asm-193-2 --overwrite
install_asm: If you use 'istioctl install' afterwards to modify this installation, you will need
install_asm: to specify the option '--set revision=asm-193-2' to target this control plane
install_asm: instead of installing a new one.
install_asm: To finish the installation, enable Istio sidecar injection and restart your workloads.
install_asm: For more information, see:
install_asm: https://cloud.google.com/service-mesh/docs/proxy-injection
install_asm: The ASM package used for installation can be found at:
install_asm: /home/kevin_davin/anthos/asm/2021-05-11-apres-midi/asm-downloads/asm
install_asm: The version of istioctl that matches the installation can be found at:
install_asm: /home/kevin_davin/anthos/asm/2021-05-11-apres-midi/asm-downloads/istio-1.9.3-asm.2/bin/istioctl
install_asm: A symlink to the istioctl binary can be found at:
install_asm: /home/kevin_davin/anthos/asm/2021-05-11-apres-midi/asm-downloads/istioctl
install_asm: The combined configuration generated for installation can be found at:
install_asm: /home/kevin_davin/anthos/asm/2021-05-11-apres-midi/asm-downloads/asm-193-2-manifest-raw.yaml
install_asm: The full, expanded set of kubernetes resources can be found at:
install_asm: /home/kevin_davin/anthos/asm/2021-05-11-apres-midi/asm-downloads/asm-193-2-manifest-expanded.yaml
install_asm: *****************************
install_asm: Successfully installed ASM.
This script installs a custom version of Istio, named Anthos Service Mesh. At the end, your cluster will have two new namespaces, asm-system and istio-system:
$ kubectl get ns
NAME              STATUS   AGE
asm-system        Active   127m
default           Active   141m
istio-system      Active   128m
kube-node-lease   Active   141m
kube-public       Active   141m
kube-system       Active   141m
You have successfully installed Anthos Service Mesh on your GKE cluster… now we have to use it!
Istio is a cluster-wide tool, which can be activated at the namespace level or at the component level (though the latter is less common). We have to add a label to our namespace to enable Istio's features on it:
$ kubectl create namespace workshop
$ kubectl label namespace workshop istio.io/rev=asm-193-2 --overwrite
asm-193-2 is the revision reported in the install_asm command logs. With this label, ASM knows it has to inject a sidecar container into every component of this namespace.
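The install_asm logs also ask us to restart workloads after enabling injection, because a sidecar is only added when a pod is created. The two steps can be sketched as a small helper. enable_asm_injection is a hypothetical name, and the sketch assumes the workloads in the namespace are Deployments.

```shell
# Hypothetical helper: enable ASM sidecar injection on a namespace for a given
# revision, then restart its Deployments so existing pods get a sidecar.
enable_asm_injection() {
  ns="$1"
  rev="$2"
  # Remove the legacy istio-injection label and set the revision label
  # (same command as the one printed by install_asm).
  kubectl label namespace "$ns" istio-injection- "istio.io/rev=$rev" --overwrite
  # Sidecars are only injected at pod creation, so recreate existing pods.
  # Assumption: the workloads of the namespace are Deployments.
  kubectl rollout restart deployment -n "$ns"
  # Injected pods should report one more container in the READY column.
  kubectl get pods -n "$ns"
}

# Usage:
#   enable_asm_injection workshop asm-193-2
```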
NOTE: If you want more details on the installation process, the official documentation is available here and provides a lot of information for various use cases.
Anthos Service Mesh is a branded version of Istio. Google's modifications are fairly light and are mainly there to make the system compatible with the cloud console. With the Customer-managed control plane, you have access to the vast majority of the features provided by Istio 1.9.
You can consult the complete list of available features here.
The main advantage of ASM over the open-source version of Istio is its integration into the Google Cloud console.
For this example, I've deployed 3 applications in the workshop namespace. Those applications come from our Stack Labs Workshop on Istio (accessible here, and fully open source). With them in place, we can use the dashboards provided.
We get a global view of the microservices deployed in our cluster: a tabular view, which can be filtered by namespace, gives a clear picture of the status and performance of our services.
A global topology view (still in beta 🧪) allows us to drill down into our services, components, deployments, pods… which is very useful if we want to understand the communication patterns in our cluster.
If we want a deeper understanding of each component, we have access to a service-specific view, accessible by clicking on a service in the tabular view.
This dashboard provides a specific point of view on the behaviour of your service. Here, we will focus on the middleware service. For this example, we configured this application to send back 500 errors 50% of the time.
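In this example the errors come from the application itself. If your own application has no such switch, a comparable failure rate can be simulated at the mesh level with Istio's HTTP fault injection. This is a different technique from the one used in this article, shown only as a sketch: fault_manifest and the resource name are hypothetical, and the manifest assumes no other VirtualService already targets the same host.

```shell
# Hypothetical helper: print a VirtualService that aborts a percentage of the
# requests sent to a host with an HTTP 500 (Istio fault injection).
fault_manifest() {
  host="$1"
  ns="$2"
  percent="$3"
  cat <<EOF
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: ${host}-fault
  namespace: ${ns}
spec:
  hosts:
  - ${host}
  http:
  - fault:
      abort:
        percentage:
          value: ${percent}
        httpStatus: 500
    route:
    - destination:
        host: ${host}
EOF
}

# Usage:
#   fault_manifest middleware workshop 50 | kubectl apply -f -
```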
This main view summarizes all the following dashboards in one place. It is the entry point for our service.
The Health view presents the Service Level Objectives (SLOs), based on the Service Level Indicators (SLIs) we defined on our service.
NOTE: If you want to learn more about SLOs and SLIs, this blog post gives a (very) condensed summary of the idea behind them. You can also read the documentation and books provided for free by the Google SRE team here.
We can define SLOs and SLIs on our service using the various metrics gathered for us by the cloud console and Anthos Service Mesh.
You can define an alerting strategy for each SLO. Multiple channels are available, from the most standard ones (Slack, PagerDuty, email) to webhooks, Cloud Functions or any other programmable system. I chose email for this example and received this after a few minutes, because the defined SLO was no longer being met.
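To make the relationship concrete: an availability SLI is typically the ratio of successful requests to total requests, and the SLO is the target set for that ratio. A minimal sketch of the arithmetic follows. slo_met is a hypothetical helper, and the request counts and the 99% target are made-up values, not numbers taken from the dashboard.

```shell
# Hypothetical helper: print the availability SLI (good/total, in percent) and
# return 0 only if it meets the target percentage. Assumes total is not zero.
slo_met() {
  good="$1"
  total="$2"
  target="$3"
  awk -v g="$good" -v t="$total" -v s="$target" \
    'BEGIN { sli = 100 * g / t; printf "SLI: %.1f%%\n", sli; exit !(sli >= s) }'
}

# Usage (made-up numbers): half of 1000 requests succeeded, target is 99%.
#   slo_met 500 1000 99   # prints "SLI: 50.0%" and returns a non-zero status
```

With a service answering 500 half of the time, the SLI sits around 50%, far below any realistic target, which is why the alert fires quickly.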
We have access to a global metrics view of the service. CPU, RAM, requests per second… all the information required to follow the health of the application.
We can analyse all the connectivity of our service. Here, we have a complete list of every service connecting to our service (inbound) and every service reached by our service (outbound).
The infrastructure pane allows us to see every instance of our service over time. Each of them has its own metrics (CPU, RAM, error rate…), so we can analyse the performance and behaviour of our system at a fine-grained level.
The Security pane lets us analyse the security level of communications between services. Istio and ASM provide a built-in way to communicate with mTLS between components. Here, you will see whether exchanges are made "in clear" or with a secure protocol. I didn't configure anything, and communications between frontend, middleware and database are secured by default.
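If you want to go one step further and reject plaintext traffic entirely in a namespace, Istio's PeerAuthentication resource can enforce strict mTLS. This is not something configured in this article; it is a minimal sketch, and strict_mtls_manifest is a hypothetical helper name.

```shell
# Hypothetical helper: print a namespace-wide PeerAuthentication policy that
# only accepts mutual TLS traffic for the workloads of that namespace.
strict_mtls_manifest() {
  ns="$1"
  cat <<EOF
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: ${ns}
spec:
  mtls:
    mode: STRICT
EOF
}

# Usage:
#   strict_mtls_manifest workshop | kubectl apply -f -
```

Clients without a sidecar would no longer be able to reach these workloads, so check the Security pane for plaintext callers before applying it.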
Finally, a useful view is available to consult the YAML resources deployed in the cluster for this service. Deployment, VirtualService, DestinationRule… All resources are available from the web UI, which simplifies analysis once again.
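For comparison, roughly the same information can be pulled from the command line. The helper below is a sketch: dump_mesh_resources is a hypothetical name, and it lists every resource of these kinds in the namespace rather than only those of one service.

```shell
# Hypothetical helper: print the YAML of the Deployments, VirtualServices and
# DestinationRules of a namespace in a single kubectl call.
dump_mesh_resources() {
  ns="$1"
  kubectl get deployments,virtualservices,destinationrules -n "$ns" -o yaml
}

# Usage:
#   dump_mesh_resources workshop
```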
When you analyze production, especially when a problem occurs, current data alone is not enough: you have to compare it with past data from the same system to reach a conclusion. In every dashboard shown here, you can activate a timeline and narrow your observations down to a specific period of time.
Every table, graph and metric adapts to the given time span to show you the information at a specific moment. This is convenient when you want to compare the behavior of a service before and after an upgrade, for example.
This feature is really awesome because you won't have to configure a dashboard for each need. The system is made for operators: no need to write a custom PromQL or Grafana query to analyse what's currently happening in production…
I focused this article mainly on Observability, because Anthos Service Mesh provides it out of the box. Even if Istio has wonderful features for traffic splitting, mirroring, authorization… the first reason you will want to use it is Observability 🕵️‍♂️.
Google is now doing with Istio what it did with Kubernetes many years ago: it has integrated it and simplified its usage to make it easily available to everyone. The future version, with a Google-managed control plane, should simplify it even more.
If you want to increase your observability with a managed and preconfigured system, I advise you to test Anthos Service Mesh!