Melissa Sussmann

Originally published at relay.sh

What is Knative? Intro to Canary and Blue-Green Deployments with Dashboards (no YAML)

This article is a continuation of the “What is Knative” series by Dimitri Tischenko. For part 1, please follow this link. In part 2 of this series, we will review canary and blue-green deployments and dashboards while using no YAML!

If we raise the load on our service, Knative will autoscale it even further. Exactly how Knative should autoscale can be configured globally or per service.
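As a concrete illustration (not something we will apply in this walkthrough), the global defaults live in the config-autoscaler ConfigMap in the knative-serving namespace, and a single service can be given scaling bounds with kn flags; the exact flag names have shifted slightly between kn releases, so treat this as a sketch:

> kubectl -n knative-serving get configmap config-autoscaler -o yaml
> kn service update helloworld --max-scale 5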

To prepare our helloworld service for load testing, we are first going to modify its configuration. By default, a service is configured to handle 100 concurrent requests per pod. We will reduce that to 10 to make the effects of autoscaling easier to see.

> kn service update helloworld --concurrency-limit=10

Configurations, Routes, Revisions

To understand what happens now, we need to take a look at other Knative objects:

> kubectl get crd|grep serving.knative
configurations.serving.knative.dev 2020-04-02T12:13:20Z
revisions.serving.knative.dev 2020-04-02T12:13:21Z
routes.serving.knative.dev 2020-04-02T12:13:21Z
services.serving.knative.dev 2020-04-02T12:13:22Z

As we can see, there are four Custom Resource Definitions in our Kubernetes cluster that are specific to Knative Serving: configurations, revisions, routes and services. Because Kubernetes already has a native resource type called “service”, the Knative one goes by “kservice” or “ksvc”.

Diagram of Knative custom resource definitions: configurations, revisions, routes, and services
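As a quick check, the short names let us list all of these objects with plain kubectl (depending on what else is installed in your cluster, you may need the fully qualified names, e.g. revisions.serving.knative.dev):

> kubectl get ksvc,configuration,route,revision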

A Knative Service represents the microservice, the app which we just deployed. The service has a route, which defines the URL at which it is accessible. It also has a configuration — the combination of code and settings — which can be versioned in revisions.

> kn service describe helloworld
Name: helloworld
Namespace: default
Age: 19m
URL: http://helloworld.default.127.127.127.127.xip.io

Revisions:
  100% @latest (helloworld-pflmj-2) [2] (1m)
    Image: gcr.io/knative-samples/helloworld-go (pinned to 5ea96b)

Conditions:
  OK TYPE AGE REASON
  ++ Ready 1m
  ++ ConfigurationsReady 1m
  ++ RoutesReady 1m

We see under Revisions that 100% of the traffic goes to the revision helloworld-pflmj-2 a.k.a. @latest, and we also see which image version is used in that revision.

Let’s check out our revisions:

> kn revision list
NAME SERVICE TRAFFIC TAGS GENERATION AGE CONDITIONS READY REASON
helloworld-pflmj-2 helloworld 100% 2 115s 3 OK / 4 True
helloworld-snxyt-1 helloworld 1 20m 3 OK / 4 True

We see that we actually have 2 revisions and the second revision gets 100% of the traffic. How come we have 2 revisions? Well, remember we changed the concurrency limit on our service? That’s when a new revision was created:

> kn revision describe helloworld-pflmj-2
Name: helloworld-pflmj-2
Namespace: default
Age: 3m
Image: gcr.io/knative-samples/helloworld-go (pinned to 5ea96b)
Concurrency:
  Limit: 10
Service: helloworld
Conditions:
  OK TYPE AGE REASON
  ++ Ready 3m
  ++ ContainerHealthy 3m
  ++ ResourcesAvailable 3m
   I Active 2m NoTraffic

We clearly see the concurrency limit set to 10 in this revision.

Autoscaling a Knative service

Now we are ready to generate load on the service. We will use a tool called hey, but hey, you are welcome to use a tool of your choice:

> hey -z 30s -c 50 "http://helloworld.default.127.127.127.127.xip.io"

This will generate 50 concurrent requests to our service for 30 seconds. Since our concurrency limit was set to 10, we now expect 5 pods to be started to handle all the traffic.

Our watch confirms our theory:

> watch kubectl get pod
NAME READY STATUS RESTARTS AGE
pod/helloworld-pflmj-2-deployment-84789457b-2cvzw 2/2 Running 0 13s
pod/helloworld-pflmj-2-deployment-84789457b-lzcv8 2/2 Running 0 13s
pod/helloworld-pflmj-2-deployment-84789457b-qm9ds 2/2 Running 0 14s
pod/helloworld-pflmj-2-deployment-84789457b-rf75q 0/2 ContainerCreating 0 11s
pod/helloworld-pflmj-2-deployment-84789457b-rhj65 0/2 ContainerCreating 0 13s

Again, after becoming idle, these pods will be terminated.
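If we keep the watch running for a couple more minutes (the exact timing depends on the autoscaler's stable window and scale-to-zero grace period), the pod list drains back to empty, with output along the lines of:

> kubectl get pod
No resources found in default namespace.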

Routes

But what if we want to deploy a new version of our app without moving it into production yet?

Our helloworld app supports an environment variable TARGET — if we set it to a message, that message will be returned to us in the response. So let’s use that to simulate releasing a new “testing” version of our app.

Obviously, doing kn service update helloworld --env TARGET=testing doesn’t work, because this would route all traffic to the new version, which is exactly what we wanted to prevent.

To make this work, we first need to specify that the traffic should remain on the current version. We will use the feature called ‘tags’:

> kn service update helloworld --tag helloworld-tprvf-2=production --traffic production=100

> kn revision list
NAME SERVICE TRAFFIC TAGS GENERATION AGE CONDITIONS READY REASON
helloworld-tprvf-2 helloworld 100% production 2 94s 3 OK / 4 True
helloworld-mtfrw-1 helloworld 1 101s 3 OK / 4 True

We defined a tag ‘production’, assigned it to the current version and specified that it should get 100% of the traffic. Now we can deploy a new testing version and tag it as testing:

> kn service update helloworld --env TARGET=testing
> kn revision list
NAME SERVICE TRAFFIC TAGS GENERATION AGE CONDITIONS READY REASON
helloworld-rstwg-3 helloworld 3 55s 4 OK / 4 True
helloworld-tprvf-2 helloworld 100% production 2 94s 3 OK / 4 True
helloworld-mtfrw-1 helloworld 1 101s 3 OK / 4 True
> kn service update helloworld --tag helloworld-rstwg-3=testing
> kn revision list
NAME SERVICE TRAFFIC TAGS GENERATION AGE CONDITIONS READY REASON
helloworld-rstwg-3 helloworld testing 3 12m 4 OK / 4 True
helloworld-tprvf-2 helloworld 100% production 2 13m 4 OK / 4 True
helloworld-mtfrw-1 helloworld 1 13m 3 OK / 4 True

We have now tagged our new version as ‘testing’. 100% of the traffic is still sent to production, as we see in the revision list. It turns out that tagging automatically creates a new route so we can access our testing version as follows:

> curl http://testing-helloworld.default.127.127.127.127.xip.io

Hello testing!

> curl http://helloworld.default.127.127.127.127.xip.io

Hello World!

We can now test our new version in isolation.
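If you are curious where that extra URL comes from (an optional check, not strictly needed for the walkthrough), kn can show the route it manages for the service, including each traffic target and its tag:

> kn route describe helloworld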

Blue and green canaries

After testing, we are now ready to move our testing version to production. Since production is the only really representative testing environment, instead of replacing the production version immediately, we would like to send a percentage of the traffic to the new version — a process called ‘canary testing’ or ‘canary deployment’.

> kn service update helloworld --traffic testing=10,production=90
> kn revision list

NAME SERVICE TRAFFIC TAGS GENERATION AGE CONDITIONS READY REASON
helloworld-rstwg-3 helloworld 10% testing 3 21m 4 OK / 4 True
helloworld-tprvf-2 helloworld 90% production 2 22m 4 OK / 4 True
helloworld-mtfrw-1 helloworld 1 22m 3 OK / 4 True

We now see the intended traffic distribution. If we now do

> curl http://helloworld.default.127.127.127.127.xip.io

we will get roughly 10% “Hello testing!” responses and 90% “Hello World!” responses. After we are satisfied that our testing revision is performing properly, we can tag it as production and send 100% of the traffic to it using the mechanisms explained above.
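One possible promotion sequence looks like this (a sketch built from the commands above; kn also lets you combine several of these flags in a single call):

> kn service update helloworld --traffic testing=100
> kn service update helloworld --untag production
> kn service update helloworld --tag helloworld-rstwg-3=production --traffic production=100
> kn service update helloworld --untag testing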

A different approach is a so-called “blue-green” deployment. In that scenario, we imagine that our current production environment is tagged ‘blue’. We tag the new production version ‘green’ and switch 100% of the traffic to it. If drama happens, we quickly switch traffic back to ‘blue’ and start solving bugs in ‘green’.

Let’s start from scratch. First, let’s delete our service:

> kn service delete helloworld

Let’s create our blue version:

> kn service create helloworld --image gcr.io/knative-samples/helloworld-go --env TARGET=blue --revision-name blue

Here, we used the --revision-name option to specify the revision name instead of letting Knative generate one for us. This means that we can refer to the revision by name and can omit the tagging. In practice, tagging is more flexible, because tags are independent of revisions and moving a tag is easier than renaming revisions.
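For instance (a hypothetical aside, with placeholder names), a tag that currently carries no traffic can be reassigned to a newer revision in two quick updates, without touching any revision names:

> kn service update helloworld --untag stable
> kn service update helloworld --tag helloworld-fghij-2=stable   # 'stable' and the revision name are placeholders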

Next we will pin 100% traffic to the blue version so traffic will stick to it when we deploy a new revision:

> kn service update helloworld --traffic helloworld-blue=100
> curl http://helloworld.default.127.127.127.127.xip.io
Hello blue!

We see that the blue version is now live. Let’s now create our green version:

> kn service update helloworld --revision-name green --env TARGET=green
> kn revision list
NAME SERVICE TRAFFIC TAGS GENERATION AGE CONDITIONS READY REASON
helloworld-green helloworld 2 5m35s 3 OK / 4 True
helloworld-blue helloworld 100% 1 7m27s 3 OK / 4 True

The revision list shows that all traffic is still pinned to the blue revision. Now we switch it over to green:

> kn service update helloworld --traffic helloworld-green=100
> curl http://helloworld.default.127.127.127.127.xip.io
Hello green!

We successfully switched to the green version. We can switch back anytime:

> kn service update helloworld --traffic helloworld-blue=100
> curl http://helloworld.default.127.127.127.127.xip.io
Hello blue!

That was easy — we just implemented a blue-green deployment!

Please note that the real world is often trickier, especially if you have a storage backend whose schema changes across service versions. Knative is still a big help, since it removes a lot of the burden of deploying the web services themselves.

Knative Dashboards

Knative comes with pre-configured monitoring components. In this example we have installed Grafana and Prometheus, which enable us to view nice dashboards of our services.

This command will forward local port 3000 to the Grafana pod in our Kubernetes cluster:

> kubectl port-forward --namespace knative-monitoring $(kubectl get pods --namespace knative-monitoring --selector=app=grafana --output=jsonpath="{.items..metadata.name}") 3000

Now, we can access the dashboards at http://localhost:3000 in our local browser.

Grafana dashboard for Knative

Epilogue

Summarizing, we have explored:

  • Installing Knative
  • Deploying and (auto)scaling a service
  • Canary and blue-green deployments
  • Knative Dashboards

I hope this overview has provided you with enough information and got you excited to start exploring Knative for yourself.

References:

If you are interested in moving your CI/CD pipeline to Kubernetes, check out the Tekton blog by Eric Sorenson. Fun fact: Tekton originated as a third Knative component, “Build”, which has since moved out of Knative and become part of the Tekton project.

This educational content is brought to you by Relay. Relay is an event-driven automation platform that pulls together all of the tools and technologies you need to effectively manage your DevOps environment.
