Lukas Gentele for Loft Labs, Inc.

Posted on Jul 18, 2022 • Originally published at loft.sh

Development Environments with vcluster

#kubernetes #vcluster #devspace #loft

By Antonio Berben

At Solo.io, we listen to the community and try out the best technologies to help teams meet their goals. This includes working on open source projects as well as providing support and products that help you get more out of those technologies.

Gloo Mesh is one of those products. It shows how the complexity of managing application networking across your infrastructure can be reduced to a minimum. In practice, that means working with multi-cluster architectures.

In such scenarios, how can you verify that a multi-cluster configuration is correct in a local environment before moving to a more extensive environment?

Let’s put it in context. Your team (or the development team) wants to release a new feature. They want to cause some chaos in the system. Gloo Mesh offers this functionality and many others through policies (failover, fault injection, outlier detection, retries, timeouts, mirroring, rate limiting, and more). But you, as an operator of the platform and Gloo Mesh, may not be sure which configuration is correct. You need to investigate first in a development or testing environment.

In a simulated production scenario that uses three clusters (one for management and two for workloads), the first concern is obvious: cost. Deploying three clusters in a public cloud is expensive.

The second concern: networking. Let’s say you decide to investigate first in your local environment. Deploying three entire clusters on your own workstation is not easy. You can opt for solutions like multiple kind (Kubernetes in Docker) or k3d clusters. Both deploy clusters in containers on top of the host machine: one cluster, one container. If you try one of these approaches, you will probably have to tweak the network between the containers and the host machine.

The third concern: CPU. To deploy things in your own local environment, you need to make sure you have enough “muscle”.

Now… What if we start considering “A cluster within a cluster”?

vcluster

I hope you saw the iconic movie Inception. I enjoyed it a lot and I watch it again from time to time. The idea was pretty catchy: “A dream within a dream”.

Virtualization technology follows the same idea. If you are familiar with Docker, years ago there was the need for docker-in-docker. Nowadays it is a very common approach in CI/CD pipelines. Say for example that tasks are running in a container but you need to test an application already embedded in another container. This would be a use case of docker-in-docker.

Given that idea, what stops us from trying cluster-in-cluster? This is where vcluster comes in to offer some benefits. vcluster allows you to create and manage virtual Kubernetes clusters. A virtual cluster is basically a control plane that runs in a namespace on a shared host cluster. Here is a visualization:

In the picture we can see that Gloo Mesh, which before required three clusters to simulate a production-ready environment, now just needs one cluster with three virtual clusters.

Quick benefits:

Cost effective: Now, your cost is only one cluster. It is true that it needs to be bigger than before, but you’re saving money by deploying one cluster instead of three.
Time-saving: when you work in your local environment, you do not want to spend time creating new clusters. If you use kind, it can take several minutes to get three new clusters. With vcluster, you can get your three new clusters in about 20 seconds.

Let’s prove all this in a workshop.

Hands on!

In this workshop, in a matter of seconds, you will deploy Istio in the two workload clusters, a demo application to use in your labs, and Gloo Mesh to test the application networking capabilities (multi-cluster traffic, traffic splitting, fault injection, etc.). All this is based on just one host Kubernetes cluster containing three virtual clusters.

Your architecture will look like this:

Prerequisites

A Kubernetes cluster which will be the host cluster (kind, k3s, k0s, etc.)
vcluster CLI. This has been tested with version 0.10.2
Helm v3
kubectl
meshctl

Getting Started

Let’s check on how long it takes you to deploy everything. The test was made using a virtual machine with only three CPUs. Therefore, you will also deploy components with minimum resources.

Start by setting up some environment variables:

# Context name for the host cluster
export MAIN_CONTEXT=$(kubectl config current-context)

# Context names for the gloo mesh clusters (vclusters)
export MGMT_CLUSTER=devmgmt
export CLUSTER_1=devcluster1
export CLUSTER_2=devcluster2

Install environments

First, let’s create the management cluster:

cat << EOF > vcluster-values.yaml
isolation:
  enabled: false
  limitRange:
    enabled: false
  podSecurityStandard: privileged
  resourceQuota:
    enabled: false
rbac:
  clusterRole:
    create: true
syncer:
  resources:
    limits:
      cpu: 100m
      memory: 1Gi
    requests:
      cpu: 100m
      memory: 128Mi
  extraArgs:
  - --fake-nodes=false
  - --sync-all-nodes
vcluster:
  resources:
    limits:
      cpu: 200m
      memory: 2Gi
    requests:
      cpu: 100m
      memory: 256Mi
  extraArgs:
  - --kubelet-arg=allowed-unsafe-sysctls=net.ipv4.*
  - --kube-apiserver-arg=feature-gates=EphemeralContainers=true
  - --kube-scheduler-arg=feature-gates=EphemeralContainers=true
  - --kubelet-arg=feature-gates=EphemeralContainers=true
  image: rancher/k3s:v1.22.5-k3s1
EOF


vcluster create $MGMT_CLUSTER -n $MGMT_CLUSTER --upgrade --connect=false --expose -f vcluster-values.yaml --context $MAIN_CONTEXT

vcluster connect $MGMT_CLUSTER -n $MGMT_CLUSTER --kube-config-context-name $MGMT_CLUSTER --update-current --context $MAIN_CONTEXT

kubectl --context $MGMT_CLUSTER get namespaces

Next, the workload cluster 1:

vcluster create $CLUSTER_1 -n $CLUSTER_1 --upgrade --connect=false --expose -f vcluster-values.yaml --context $MAIN_CONTEXT

vcluster connect $CLUSTER_1 -n $CLUSTER_1 --kube-config-context-name $CLUSTER_1 --update-current --context $MAIN_CONTEXT

kubectl --context $CLUSTER_1 get namespaces

And finally, the workload cluster 2:

vcluster create $CLUSTER_2 -n $CLUSTER_2 --upgrade --connect=false --expose -f vcluster-values.yaml --context $MAIN_CONTEXT

vcluster connect $CLUSTER_2 -n $CLUSTER_2 --kube-config-context-name $CLUSTER_2 --update-current --context $MAIN_CONTEXT

kubectl --context $CLUSTER_2 get namespaces

This is it! Three clusters in around 20 seconds. If you’re interested to know more, at the end of this post, you can find a more in-depth explanation of what you have deployed with vcluster and some tips to remember.
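
The three create/connect pairs above differ only in the cluster name, so they can be folded into a loop. What follows is a minimal sketch, not part of the original workshop: the `run` helper, the `DRY_RUN` variable, and the placeholder default for `MAIN_CONTEXT` are names introduced here for illustration, so that the commands can be previewed before they touch the host cluster. The vcluster flags are the ones used above.

```shell
# Sketch: create and connect every vcluster with the same flags used above.
# `run` and DRY_RUN are illustrative helpers, not part of the vcluster CLI.

# Defaults mirror the environment variables exported earlier in the post.
# "my-host-context" is only a placeholder for the host cluster context.
MAIN_CONTEXT="${MAIN_CONTEXT:-my-host-context}"
MGMT_CLUSTER="${MGMT_CLUSTER:-devmgmt}"
CLUSTER_1="${CLUSTER_1:-devcluster1}"
CLUSTER_2="${CLUSTER_2:-devcluster2}"

# Print the command when DRY_RUN=1 (the default here), execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# One create/connect pair per vcluster; namespace and context name equal the cluster name.
create_vclusters() {
  for name in "$MGMT_CLUSTER" "$CLUSTER_1" "$CLUSTER_2"; do
    run vcluster create "$name" -n "$name" --upgrade --connect=false --expose \
      -f vcluster-values.yaml --context "$MAIN_CONTEXT"
    run vcluster connect "$name" -n "$name" --kube-config-context-name "$name" \
      --update-current --context "$MAIN_CONTEXT"
  done
}

create_vclusters
```

As written, the sketch only prints the six commands. Setting `DRY_RUN=0` before running it executes them for real.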

Next comes Gloo Mesh. Istio is deployed in the workload clusters once they are registered.

Install Gloo Mesh

You will need a license key:

export GLOO_MESH_LICENSE_KEY=<license_key>

And you need to define the Gloo Mesh version:

export GLOO_MESH_VERSION=2.0.9

Gloo Mesh can be installed through Helm charts. However, to avoid overloading this post with code, you will use the meshctl CLI:

meshctl install --kubecontext $MGMT_CLUSTER --license $GLOO_MESH_LICENSE_KEY --version $GLOO_MESH_VERSION

Verify all pods are running:

kubectl get pods -n gloo-mesh --context $MGMT_CLUSTER

And you will see something like:

NAME                                     READY   STATUS    RESTARTS   AGE
gloo-mesh-mgmt-server-778d45c7b5-5d9nh   1/1     Running   0          41s
gloo-mesh-redis-844dc4f9-jnb4j           1/1     Running   0          41s
gloo-mesh-ui-749dc7875c-4z77k            3/3     Running   0          41s
prometheus-server-86854b778-r6r52        2/2     Running   0          41s
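
Instead of polling `kubectl get pods` by hand, you can let kubectl block until the pods are ready. This is a minimal sketch using the standard `kubectl wait` command. The function name, the `KUBECTL` override variable, and the 180-second default timeout are assumptions made for this example, not values from the workshop.

```shell
# Sketch: block until every pod in the gloo-mesh namespace is Ready,
# or fail once the timeout expires.
KUBECTL="${KUBECTL:-kubectl}"   # illustrative override hook, e.g. KUBECTL=echo to preview

wait_for_gloo_mesh() {
  # $1: kube context of the management vcluster
  # $2: timeout (optional; 180s is an arbitrary default)
  "$KUBECTL" --context "$1" -n gloo-mesh wait pod --all \
    --for=condition=Ready --timeout="${2:-180s}"
}

# Usage: wait_for_gloo_mesh "$MGMT_CLUSTER"
```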

Register workload clusters

Gloo Mesh relies on an agent-based approach. Therefore, when registering a workload cluster, you will need to tell the agent how to communicate with the management server.

Note that on EKS the service's load balancer does not report an IP but a hostname. Adjust the following commands accordingly if you're using EKS.

MGMT_SERVER_NETWORKING_DOMAIN=$(kubectl get svc -n gloo-mesh gloo-mesh-mgmt-server --context $MGMT_CLUSTER -o jsonpath='{.status.loadBalancer.ingress[0].ip}')

MGMT_SERVER_NETWORKING_PORT=$(kubectl -n gloo-mesh get service gloo-mesh-mgmt-server --context $MGMT_CLUSTER -o jsonpath='{.spec.ports[?(@.name=="grpc")].port}')


MGMT_SERVER_NETWORKING_ADDRESS=${MGMT_SERVER_NETWORKING_DOMAIN}:${MGMT_SERVER_NETWORKING_PORT}
echo $MGMT_SERVER_NETWORKING_ADDRESS
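
If the service reports a hostname instead of an IP, as on EKS, `MGMT_SERVER_NETWORKING_DOMAIN` comes back empty with the jsonpath used above. The following sketch prefers the IP and falls back to the hostname. `pick_lb_address`, `SVC_IP` and `SVC_HOST` are hypothetical names introduced for this example; `hostname` is the standard field next to `ip` in a Service's load balancer status.

```shell
# Sketch: choose the load balancer address, preferring an IP over a hostname.
# pick_lb_address is an illustrative helper, not part of kubectl or meshctl.
pick_lb_address() {
  # $1: value of .status.loadBalancer.ingress[0].ip (may be empty)
  # $2: value of .status.loadBalancer.ingress[0].hostname (may be empty)
  if [ -n "$1" ]; then
    printf '%s\n' "$1"
  else
    printf '%s\n' "$2"
  fi
}

# Usage against a real cluster (commented out so the sketch has no side effects):
# SVC_IP=$(kubectl get svc -n gloo-mesh gloo-mesh-mgmt-server --context $MGMT_CLUSTER -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
# SVC_HOST=$(kubectl get svc -n gloo-mesh gloo-mesh-mgmt-server --context $MGMT_CLUSTER -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')
# MGMT_SERVER_NETWORKING_DOMAIN=$(pick_lb_address "$SVC_IP" "$SVC_HOST")
```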

meshctl cluster register \
  --remote-context=$CLUSTER_1 \
  --relay-server-address $MGMT_SERVER_NETWORKING_ADDRESS \
  --kubecontext $MGMT_CLUSTER \
  $CLUSTER_1

And you will see:

Registering cluster
📃 Copying root CA relay-root-tls-secret.gloo-mesh to remote cluster from management cluster
📃 Copying bootstrap token relay-identity-token-secret.gloo-mesh to remote cluster from management cluster
💻 Installing relay agent in the remote cluster
Finished installing chart 'gloo-mesh-agent' as release gloo-mesh:gloo-mesh-agent
📃 Creating remote.cluster KubernetesCluster CRD in management cluster
⌚ Waiting for relay agent to have a client certificate
         Checking...
         Checking...
🗑 Removing bootstrap token
✅ Done registering cluster!

meshctl cluster register \
  --remote-context=$CLUSTER_2 \
  --relay-server-address $MGMT_SERVER_NETWORKING_ADDRESS \
  --kubecontext $MGMT_CLUSTER \
  $CLUSTER_2

Check that the resources are created in the management cluster:

Bash
kubectl get kubernetescluster -n gloo-mesh --context $MGMT_CLUSTER


And you will see:

Bash
NAME          AGE
devcluster1   27s
devcluster2   23s


#### Install Istio

By default, Istio requests a fair amount of resources. In your local environment, you might not have enough to run three fully functional clusters and two Istio service meshes. Therefore, we need to reduce the resources Istio requests. That’s fine, as this is just a development environment.

NOTE: This post is using Istio v1.12.6:

Bash
export ISTIO_VERSION=1.12.6
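
The Helm commands in this section reference charts as `istio/base`, `istio/istiod` and `istio/gateway`, which assumes that the Istio chart repository is already registered under the alias `istio`. If it is not, a sketch along these lines adds it. The function name and the `HELM` override variable are assumptions made for this example; the URL is the public Istio release chart repository.

```shell
# Sketch: register the Istio Helm chart repository under the alias "istio".
HELM="${HELM:-helm}"   # illustrative override hook, e.g. HELM=echo to preview

add_istio_repo() {
  "$HELM" repo add istio https://istio-release.storage.googleapis.com/charts
  "$HELM" repo update
}

# Usage: add_istio_repo
```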


Install Istio’s CRDs:

Bash

# Install Istio CRDs in cluster 1
helm upgrade --install istio-base istio/base \
  -n istio-system \
  --version $ISTIO_VERSION \
  --kube-context $CLUSTER_1 \
  --create-namespace

# Install Istio CRDs in cluster 2
helm upgrade --install istio-base istio/base \
  -n istio-system \
  --version $ISTIO_VERSION \
  --kube-context $CLUSTER_2 \
  --create-namespace


Install Istiod:

Bash
cat << EOF > istiod-common-values.yaml
meshConfig:
  accessLogFile: /dev/stdout
  defaultConfig:
    holdApplicationUntilProxyStarts: true
    envoyMetricsService:
      address: gloo-mesh-agent.gloo-mesh:9977
    envoyAccessLogService:
      address: gloo-mesh-agent.gloo-mesh:9977
    proxyMetadata:
      ISTIO_META_DNS_CAPTURE: "true"
      ISTIO_META_DNS_AUTO_ALLOCATE: "true"
pilot:
  autoscaleEnabled: false
  replicaCount: 1
  env:
    PILOT_SKIP_VALIDATE_TRUST_DOMAIN: "true"
  resources:
    requests:
      cpu: 10m
      memory: 2048Mi
    limits:
      cpu: 10m
      memory: 2048Mi
EOF

# Install istiod in cluster 1
helm upgrade --install istiod istio/istiod \
  -f istiod-common-values.yaml \
  --set global.meshID=mesh1 \
  --set global.multiCluster.clusterName=$CLUSTER_1 \
  --set meshConfig.trustDomain=$CLUSTER_1 \
  --set meshConfig.defaultConfig.proxyMetadata.GLOO_MESH_CLUSTER_NAME=$CLUSTER_1 \
  --namespace istio-system \
  --version $ISTIO_VERSION \
  --kube-context $CLUSTER_1

# Install istiod in cluster 2
helm upgrade --install istiod istio/istiod \
  -f istiod-common-values.yaml \
  --set global.meshID=mesh1 \
  --set global.multiCluster.clusterName=$CLUSTER_2 \
  --set meshConfig.trustDomain=$CLUSTER_2 \
  --set meshConfig.defaultConfig.proxyMetadata.GLOO_MESH_CLUSTER_NAME=$CLUSTER_2 \
  --namespace istio-system \
  --version $ISTIO_VERSION \
  --kube-context $CLUSTER_2


Install ingress gateways:

Bash
cat << EOF > istio-ingress-common-values.yaml
replicaCount: 1
autoscaling:
  enabled: false
name: istio-ingressgateway
securityContext: # runAsRoot
  runAsUser: 1337
  runAsGroup: 1337
  runAsNonRoot: true
  fsGroup: 1337
labels:
  istio: ingressgateway
service:
  type: LoadBalancer
  ports:
  - port: 80
    targetPort: 8080
    name: http2
  - port: 443
    targetPort: 8443
    name: https
resources:
  limits:
    cpu: 10m
    memory: 128Mi
  requests:
    cpu: 10m
    memory: 128Mi
EOF

# Install Istio ingress gateway in cluster 1
helm upgrade --install istio-ingressgateway istio/gateway \
  -f istio-ingress-common-values.yaml \
  --namespace istio-gateways \
  --version $ISTIO_VERSION \
  --kube-context $CLUSTER_1 \
  --create-namespace

# Install Istio ingress gateway in cluster 2
helm upgrade --install istio-ingressgateway istio/gateway \
  -f istio-ingress-common-values.yaml \
  --namespace istio-gateways \
  --version $ISTIO_VERSION \
  --kube-context $CLUSTER_2 \
  --create-namespace


Install east-west gateways:

Bash
cat << EOF > istio-eastwest-common-values.yaml
replicaCount: 1
autoscaling:
  enabled: false
name: istio-eastwestgateway
securityContext: # runAsRoot
  runAsUser: 1337
  runAsGroup: 1337
  runAsNonRoot: true
  fsGroup: 1337
labels:
  istio: eastwestgateway
service:
  type: LoadBalancer
  ports:
  - name: tcp-status-port
    port: 15021
    targetPort: 15021
  - name: tls
    port: 15443
    targetPort: 15443
resources:
  requests:
    cpu: 10m
    memory: 128Mi
  limits:
    cpu: 10m
    memory: 128Mi
EOF

# Install Istio east-west gateway in cluster 1
helm upgrade --install istio-eastwestgateway istio/gateway \
  -f istio-eastwest-common-values.yaml \
  --namespace istio-gateways \
  --version $ISTIO_VERSION \
  --kube-context $CLUSTER_1

# Install Istio east-west gateway in cluster 2
helm upgrade --install istio-eastwestgateway istio/gateway \
  -f istio-eastwest-common-values.yaml \
  --namespace istio-gateways \
  --version $ISTIO_VERSION \
  --kube-context $CLUSTER_2


#### Deploy Applications

In workload cluster 1:

Bash
kubectl --context ${CLUSTER_1} create ns bookinfo
export bookinfo_yaml=https://raw.githubusercontent.com/istio/istio/1.11.4/samples/bookinfo/platform/kube/bookinfo.yaml
kubectl --context ${CLUSTER_1} label namespace bookinfo istio-injection=enabled

kubectl --context ${CLUSTER_1} apply -f ${bookinfo_yaml} -l 'app,version notin (v3)' -n bookinfo

kubectl --context ${CLUSTER_1} apply -f ${bookinfo_yaml} -l 'account' -n bookinfo


And in workload cluster 2:

Bash
kubectl --context ${CLUSTER_2} create ns bookinfo

kubectl --context ${CLUSTER_2} label namespace bookinfo istio-injection=enabled

kubectl --context ${CLUSTER_2} apply -f ${bookinfo_yaml} -n bookinfo


Define your workspace (an abstraction provided by Gloo Mesh to organize workloads regardless of their physical location):

Bash
kubectl apply --context $MGMT_CLUSTER -n gloo-mesh -f- <<EOF
apiVersion: admin.gloo.solo.io/v2
kind: Workspace
metadata:
  name: developers
  namespace: gloo-mesh
spec:
  workloadClusters:
  - name: '*'
    namespaces:
    - name: '*'
EOF

kubectl apply --context $CLUSTER_1 -n gloo-mesh -f- <<EOF
apiVersion: admin.gloo.solo.io/v2
kind: WorkspaceSettings
metadata:
  name: developers
  namespace: gloo-mesh
spec:
  options:
    serviceIsolation:
      enabled: false
    federation:
      enabled: false
EOF


Expose the application:

Bash
kubectl --context ${CLUSTER_1} apply -f - <<EOF
apiVersion: networking.gloo.solo.io/v2
kind: VirtualGateway
metadata:
  name: north-south-gw
  namespace: istio-gateways
spec:
  workloads:
    - selector:
        labels:
          istio: ingressgateway
        cluster: ${CLUSTER_1}
  listeners:
    - http: {}
      port:
        number: 80
      allowedRouteTables:
        - host: '*'
EOF

kubectl --context ${CLUSTER_1} apply -f - <<EOF
apiVersion: networking.gloo.solo.io/v2
kind: RouteTable
metadata:
  name: productpage
  namespace: bookinfo
  labels:
    expose: "true"
spec:
  hosts:
    - '*'
  virtualGateways:
    - name: north-south-gw
      namespace: istio-gateways
      cluster: ${CLUSTER_1}
  workloadSelectors: []
  http:
    - name: productpage
      matchers:
      - uri:
          prefix: /
      forwardTo:
        destinations:
          - ref:
              name: productpage
              namespace: bookinfo
            port:
              number: 9080
EOF


#### Verify the Environment

Next, let’s create a bit of traffic and see what the UI displays. First, send some requests through the ingress gateway of workload cluster 1:

Bash
export ENDPOINT_HTTP_GW_CLUSTER1=$(kubectl --context ${CLUSTER_1} -n istio-gateways get svc istio-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].*}'):80

for i in {0..100}; do curl -s -o /dev/null -w "%{http_code}\n" $ENDPOINT_HTTP_GW_CLUSTER1/productpage; done


You should see:

Bash
❯ for i in {0..100}; do curl -s -o /dev/null -w "%{http_code}\n" $ENDPOINT_HTTP_GW_CLUSTER1/productpage; done
200
200
200
200
200
200
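
With a hundred requests, scanning a column of status codes gets tedious. As a small sketch, the codes can be piped into a tally that prints one line per distinct status code. `summarize_codes` is a hypothetical helper name introduced for this example.

```shell
# Sketch: count how often each HTTP status code appears on stdin
# (one code per line) and print "<code>: <count>" per distinct code.
summarize_codes() {
  sort | uniq -c | awk '{ printf "%s: %s\n", $2, $1 }'
}

# Usage with the request loop above (commented out; needs the running gateway):
# for i in $(seq 1 100); do
#   curl -s -o /dev/null -w "%{http_code}\n" "$ENDPOINT_HTTP_GW_CLUSTER1/productpage"
# done | summarize_codes
```

Anything other than a single `200` line in the summary is a hint that the gateway or the route table needs another look.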


Now, port-forward the Gloo Mesh UI:

Bash
kubectl --context $MGMT_CLUSTER port-forward svc/gloo-mesh-ui -n gloo-mesh 8090


Go to: <http://localhost:8090/> and you will see all the information about your clusters and your workspaces.

![Gloo Mesh UI](https://loft.sh/blog/images/content/vcluster-gloo-4.png)

The UI also provides a graph view that helps you understand your own system:

![Gloo Mesh graph](https://loft.sh/blog/images/content/vcluster-gloo-5.png)

That is all! You have achieved full control of the network in a matter of minutes in your local environment.

Now, you can test any capability that Gloo Mesh offers, including:

* Any of the policies that Gloo Mesh offers (such as failover, fault injection, outlier detection, retries, timeouts, traffic control, mirroring, rate limiting, and header and payload transformation)
* Access control
* Isolation of the services
* WAF
* Authentication with OIDC
* Authorization with OPA

## Tips for vcluster

Interested in learning more about vcluster? Here is a simple diagram of how vcluster works:

![vcluster architecture diagram](https://loft.sh/blog/images/content/vcluster-gloo-6.png)

In the workshop you have deployed three vclusters. If you run:

Bash
kubectl --context $MAIN_CONTEXT get sts -A


You will see:

Bash
NAMESPACE     NAME          READY   AGE
devmgmt       devmgmt       1/1     3h7m
devcluster2   devcluster2   1/1     3h2m
devcluster1   devcluster1   1/1     3h3m


Each of these StatefulSets belongs to one vcluster. Its attached volume stores all the data of the deployed vcluster.

Looking closer, you will find that one of the containers of those StatefulSets runs an entire [k3s](https://k3s.io/), a lightweight Kubernetes distribution. You could also use any of the other supported Kubernetes distributions: [eks, k0s and vanilla k8s](https://www.vcluster.com/docs/operator/other-distributions).

The other container is a [syncer](https://www.vcluster.com/docs/architecture/basics#vcluster-syncer), an application which copies the pods that are created within the vcluster to the underlying host cluster. This is the reason you can see all the resources if you are the admin of the “host” cluster, and only your resources if you are the admin of the vcluster.

You can think of the StatefulSet as the control plane of a vcluster. This is why you need to be careful about how you deploy its pods.

Let’s see it in the environment you just created. In your vcluster, you will see:

Bash
kubectl --context $MGMT_CLUSTER get pod -l app=gloo-mesh-mgmt-server -A

NAMESPACE   NAME                                    READY   STATUS
gloo-mesh   gloo-mesh-mgmt-server-9fb55d686-w4n4l   1/1     Running


But in the host cluster you will see:

Bash
kubectl --context $MAIN_CONTEXT get pod -A -l vcluster.loft.sh/namespace=gloo-mesh

NAMESPACE     NAME
devcluster1   gloo-mesh-agent-df8c8c49d-jlhkh-x-gloo-mesh-x-devcluster1
devcluster2   gloo-mesh-agent-76b5b44b4f-56r5l-x-gloo-mesh-x-devcluster2
devmgmt       gloo-mesh-mgmt-server-9fb55d686-w4n4l-x-gloo-mesh-x-devmgmt
devmgmt       gloo-mesh-redis-794d79b7df-rlr99-x-gloo-mesh-x-devmgmt
devmgmt       gloo-mesh-ui-cc98c5fc-tzq4s-x-gloo-mesh-x-devmgmt
devmgmt       prometheus-server-647b488bb-r6hfc-x-gloo-mesh-x-devmgmt


Check the names. That is the translation layer that vcluster makes for you.
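
The pattern visible in the output above is `<pod name>-x-<namespace>-x-<vcluster name>`. As a sketch, it can be reproduced with a small helper. `host_pod_name` is a hypothetical function written for this example, and the pattern is inferred only from the names listed here. vcluster may treat some names differently (for example very long ones), so take it as an illustration rather than the syncer's actual logic.

```shell
# Sketch: build the name a vcluster pod appears under in the host cluster,
# following the pattern seen in the listing above.
host_pod_name() {
  # $1: pod name inside the vcluster
  # $2: namespace of the pod inside the vcluster
  # $3: name of the vcluster
  printf '%s-x-%s-x-%s\n' "$1" "$2" "$3"
}

# Usage:
# host_pod_name gloo-mesh-mgmt-server-9fb55d686-w4n4l gloo-mesh devmgmt
```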

There are a couple of things to keep in mind when working with vclusters:

Reserve enough resources for those StatefulSet pods: It is good practice to have nodes with resources dedicated solely to these pods and to make sure that the pods are scheduled on those nodes. The intention is that the StatefulSet pods (the vcluster control planes) never run out of resources, which would dramatically impact the performance of the vcluster. To do this, you can use taints and node selectors on the nodes.

Logs and Kubernetes metadata: Log aggregation tools like [Fluent Bit](https://fluentbit.io/) and [Grafana Promtail](https://grafana.com/docs/loki/latest/clients/promtail/) rely on the Kubernetes structure and naming conventions. Log folders and files follow the Kubernetes structure given by the host cluster.

As the commands above show, the same pod has different names in the vcluster and in the host cluster. Therefore, if you deploy one of these log aggregation tools in the vcluster, the structure it expects will not match the one in the host cluster. The consequence is that the tools in the vcluster will not be able to leverage the Kubernetes metadata or the logs from the applications in that cluster. At the time of writing, the [Loft Labs](https://loft.sh/) team is addressing this issue.
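
For the first tip above (dedicated nodes for the vcluster control planes), the host-cluster side could look like the following sketch. It uses the standard `kubectl label` and `kubectl taint` commands. The `dedicated=vcluster` key/value pair, the function name, and the `KUBECTL` override variable are assumptions chosen for this example.

```shell
# Sketch: reserve a host-cluster node for vcluster control plane pods by
# labeling it (for node selectors) and tainting it (to keep other pods away).
KUBECTL="${KUBECTL:-kubectl}"   # illustrative override hook, e.g. KUBECTL=echo to preview

reserve_node_for_vclusters() {
  # $1: host cluster context, $2: node name
  "$KUBECTL" --context "$1" label nodes "$2" dedicated=vcluster --overwrite
  "$KUBECTL" --context "$1" taint nodes "$2" dedicated=vcluster:NoSchedule --overwrite
}

# Usage: reserve_node_for_vclusters "$MAIN_CONTEXT" <node-name>
```

The StatefulSet pods then need a matching node selector and toleration. How to set those depends on the vcluster chart version, so check its values reference before relying on any particular key.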

The last interesting point to mention is the ability to pause and resume individual vclusters (StatefulSets). If you do not want to destroy the entire environment created in the workshop, you can just run:

Bash
vcluster pause $MGMT_CLUSTER -n $MGMT_CLUSTER --context $MAIN_CONTEXT
vcluster pause $CLUSTER_1 -n $CLUSTER_1 --context $MAIN_CONTEXT
vcluster pause $CLUSTER_2 -n $CLUSTER_2 --context $MAIN_CONTEXT


And whenever you want to keep working on the tests you can do:

Bash
vcluster resume $MGMT_CLUSTER -n $MGMT_CLUSTER --context $MAIN_CONTEXT
vcluster resume $CLUSTER_1 -n $CLUSTER_1 --context $MAIN_CONTEXT
vcluster resume $CLUSTER_2 -n $CLUSTER_2 --context $MAIN_CONTEXT




## Conclusions

Technology changes fast. Not many years ago, we were working with monoliths. Nowadays, you can have clusters deployed within other clusters.

Through this workshop, you were able to:

* Deploy all the components of Gloo Mesh in your local environment or in a cheap remote environment.
* Build a basic setup to test all Gloo Mesh capabilities for handling east-west and north-south traffic between your services.
* Reduce cost of deploying multiple clusters with vcluster. You just need one actual cluster.
* Reduce time of testing things out in a local environment.

This significantly increases the efficiency of your projects, which in the end translates into higher productivity.

As a final comment, being able to test things in your local environment, reproducing heavy remote environments, is one of the goals of DevOps practices.

If you want to talk more about all these tools, you can find me easily in these Slack workspaces: [solo.io](http://solo-io.slack.com), [istio](http://istio.slack.com) and [loft.sh](http://loft-sh.slack.com)
