Eyar Zilberman for Datree

Posted on Jun 1, 2021 • Originally published at datree.io

A Deep Dive Into Kubernetes Schema Validation

#kubernetes #devops #tutorial #gitops

Why run schema validation?

How do you ensure the stability of your Kubernetes clusters? How do you know that your manifests are syntactically valid? Are you sure you don’t have any invalid data types? Are any mandatory fields missing?

Most often, we only become aware of these misconfigurations at the worst time - when trying to deploy the new manifests.

Specialized tools and a “shift-left” approach make it possible to verify Kubernetes manifests against the schema before they’re applied to a cluster. In this article, I'll address how you can avoid misconfigurations and which tools are best to use.

TL;DR

Running schema validation tests is important, and the sooner the better.

If all machines (local developer environments, CI, etc.) have access to your Kubernetes cluster, run kubectl --dry-run in server mode on every code change. If this isn’t possible and you want to perform schema validation tests offline, use kubeconform together with a policy enforcement tool for the best validation coverage.

Available tools

Verifying the state of Kubernetes manifests may seem like a trivial task, because the Kubernetes CLI (kubectl) has the ability to verify resources before they’re applied to a cluster. You can verify the schema by using the dry-run flag (--dry-run=client/server) when specifying the kubectl create or kubectl apply commands, which will perform the validation without applying Kubernetes resources to the cluster.
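
As a rough illustration of the dry-run flag described above, here is a minimal Python sketch that builds the kubectl command and reports whether it succeeded. The helper names (dry_run_command, validate) and the file name deployment.yaml are made up for this example, and it assumes a kubectl version that accepts --dry-run=client and --dry-run=server.

```python
import subprocess


def dry_run_command(manifest_path, mode="server"):
    """Build a `kubectl apply` call that validates without persisting anything."""
    if mode not in ("client", "server"):
        raise ValueError("mode must be 'client' or 'server'")
    return ["kubectl", "apply", "-f", manifest_path, f"--dry-run={mode}"]


def validate(manifest_path, mode="server"):
    """Run the dry-run and return True when kubectl exits with code 0.

    Needs kubectl on PATH and, as the article notes, a reachable cluster.
    """
    result = subprocess.run(
        dry_run_command(manifest_path, mode),
        capture_output=True,
        text=True,
    )
    return result.returncode == 0


# Hypothetical file name, shown only to illustrate the resulting command.
print(" ".join(dry_run_command("deployment.yaml", mode="server")))
```

Calling validate("deployment.yaml") in a CI job would then fail the build whenever kubectl rejects the manifest.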

But I can assure you that it’s actually more complex. A running Kubernetes cluster is required to obtain the schema for the set of resources being validated. So, when incorporating manifest verification into a CI process, you must also manage connectivity and credentials to perform the validation. This becomes even more challenging when dealing with multiple microservices in several environments (prod, dev, etc.).

Kubeval and kubeconform are command-line tools that were developed with the intent to validate Kubernetes manifests without the requirement of having a running Kubernetes environment. Because kubeconform was inspired by kubeval, they operate similarly — verification is performed against pre-generated JSON schemas that are created from the OpenAPI specifications (swagger.json) for each particular Kubernetes version. All that remains to run the schema validation tests is to point the tool executable to a single manifest, directory or pattern.
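
To make the idea of validating against a pre-generated JSON schema more concrete, here is a toy Python checker that understands only a tiny subset of JSON Schema (type, required, properties, additionalProperties, items). It is not how kubeval or kubeconform are implemented, and SERVICE_PORT_SCHEMA is hand-written for this example rather than copied from the real Kubernetes schemas.

```python
TYPES = {"object": dict, "array": list, "string": str, "integer": int}


def check(value, schema, path="$"):
    """Return a list of error strings for `value` against a small schema subset."""
    expected = schema.get("type")
    if expected is not None:
        type_ok = isinstance(value, TYPES[expected])
        # bool is a subclass of int in Python, so reject it for "integer".
        if expected == "integer" and isinstance(value, bool):
            type_ok = False
        if not type_ok:
            return [f"{path}: expected {expected}"]

    errors = []
    if isinstance(value, dict):
        for key in schema.get("required", []):
            if key not in value:
                errors.append(f"{path}: missing required key '{key}'")
        properties = schema.get("properties", {})
        for key, sub_value in value.items():
            if key in properties:
                errors += check(sub_value, properties[key], f"{path}.{key}")
            elif schema.get("additionalProperties") is False:
                errors.append(f"{path}: unknown key '{key}'")
    if isinstance(value, list) and "items" in schema:
        for index, item in enumerate(value):
            errors += check(item, schema["items"], f"{path}[{index}]")
    return errors


# Hypothetical, simplified schema for one Service port entry.
SERVICE_PORT_SCHEMA = {
    "type": "object",
    "required": ["port"],
    "additionalProperties": False,
    "properties": {
        "port": {"type": "integer"},
        "protocol": {"type": "string"},
    },
}

# A wrong data type and a misspelled key, both detectable from the schema alone.
for error in check({"port": "eighty", "protcol": "TCP"}, SERVICE_PORT_SCHEMA):
    print(error)
```

Everything this checker reports comes from the structure of the document, which is why this style of validation needs no running cluster.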

Comparing

kubeval
kubeconform
kubectl dry-run in ‘client’ mode
kubectl dry-run in ‘server’ mode

Now that we’ve covered the tools that are available for Kubernetes schema validation, let’s compare some core abilities (misconfiguration coverage, speed, support for different Kubernetes versions, CRD support and docs).

Misconfigurations coverage¹

I donned my QA hat and generated some (basic) Kubernetes manifest files with intentional misconfigurations, and then ran them through all four tools²:

Misconfig/Tool	kubeval / kubeconform	kubectl dry-run in ‘client’ mode	kubectl dry-run in ‘server’ mode
API deprecation	✅ Caught	✅ Caught	✅ Caught
Invalid kind value	✅ Caught	❌ Didn't catch	🚧 Caught³
Invalid label value	❌ Didn't catch	❌ Didn't catch	✅ Caught
Invalid protocol type	✅ Caught	❌ Didn't catch	✅ Caught
Invalid spec key	✅ Caught	✅ Caught	✅ Caught
Missing image	❌ Didn't catch	❌ Didn't catch	✅ Caught
Wrong K8s indentation	✅ Caught	✅ Caught	✅ Caught

Conclusion: Running kubectl dry-run in ‘server’ mode caught all misconfigurations, while kubeval/kubeconform missed two of them. It’s also interesting that kubectl dry-run in ‘client’ mode is almost useless: it misses some obvious misconfigurations and still requires a connection to a running Kubernetes environment.
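
One plausible reason the offline validators miss the invalid label value is that label syntax is a rule the API server enforces, not something a structural schema normally encodes. The article does not say which invalid value was used, so the sketch below only encodes the documented rule for label values as I understand it: at most 63 characters, empty or starting and ending with an alphanumeric character, with dashes, underscores and dots allowed in between. The function name is made up for this example.

```python
import re

# Documented label-value syntax: empty, or alphanumeric at both ends with
# '-', '_' and '.' allowed in the middle.
LABEL_VALUE = re.compile(r"(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])?")


def is_valid_label_value(value):
    """Return True when `value` is acceptable as a Kubernetes label value."""
    return len(value) <= 63 and LABEL_VALUE.fullmatch(value) is not None


for candidate in ["my-app", "app_v1.2", "", "my app", "-app", "a" * 64]:
    print(repr(candidate[:12]), is_valid_label_value(candidate))
```

A check like this is the kind of rule a policy enforcement tool can add on top of schema validation.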

Benchmark speed test

I used hyperfine to benchmark the execution time of each tool⁴. First I ran it against (1) all the files with misconfigurations (seven files in total), and then I ran it against (2) 100 Kubernetes files (all the files contain the same config).

(1) Results for running the tools against seven files with different Kubernetes schema misconfigurations:

(2) Results for running the tools against 100 files with valid Kubernetes schemas:

Conclusion: kubeconform (#1), kubeval (#2) and kubectl --dry-run=client (#3) return results quickly on both tests, while kubectl --dry-run=server (#4) is slower, especially when it needs to evaluate 100 files. Even so, 60 seconds to produce a result is still a good outcome in my opinion.
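
The measurements above were taken with hyperfine. For readers who want a rough stand-in without installing it, the following Python sketch times repeated runs of a command. It is not hyperfine and omits the warm-up runs and statistics that tool provides; the benchmark function and its runs parameter are made up for this example.

```python
import statistics
import subprocess
import sys
import time


def benchmark(command, runs=5):
    """Run `command` several times and summarize wall-clock time in seconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        # Output is discarded; only the elapsed time matters here.
        subprocess.run(command, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return {
        "mean": statistics.mean(timings),
        "min": min(timings),
        "max": max(timings),
    }


# Placeholder command so the sketch runs anywhere; swap in the validation
# command you want to measure.
print(benchmark([sys.executable, "-c", "pass"], runs=3))
```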

Kubernetes versions support

Both kubeval and kubeconform accept the Kubernetes schema version as a flag. Although both tools are similar (as mentioned, kubeconform was inspired by kubeval), one of the key differences between them is that each tool relies on its own set of pre-generated JSON schemas:

Kubeval - instrumenta/kubernetes-json-schema (last commit: 133f848 on April 29, 2020)
Kubeconform - yannh/kubernetes-json-schema (last commit: a660f03 on May 15, 2021)

As of today (May 2021), kubeval only supports Kubernetes schema versions up to 1.18.1, while kubeconform supports the latest Kubernetes schema available today — 1.21.0. With kubectl, it’s a little bit trickier. I don’t know which version of kubectl introduced the dry-run, but I tried it with Kubernetes version 1.16.0 and it still worked, so I know it’s available in Kubernetes versions 1.16.0-1.18.0.

Support for a wide range of Kubernetes schema versions is especially important if you want to migrate to a new Kubernetes version. With kubeval and kubeconform you can set the target version and start evaluating which configurations must change to support the cluster upgrade.
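
A small Python sketch of that upgrade check: it builds one validation command per target version so the same manifests can be tested against the current and the next release. The flag spellings (-kubernetes-version and -summary for kubeconform, --kubernetes-version for kubeval) are taken from each tool's README as I remember them, so treat them as assumptions and confirm with the tool's help output. The function names and the file name are made up for this example.

```python
def kubeconform_command(paths, kubernetes_version):
    """Build a kubeconform call for a specific Kubernetes schema version."""
    return ["kubeconform", "-kubernetes-version", kubernetes_version, "-summary", *paths]


def kubeval_command(paths, kubernetes_version):
    """Build the equivalent kubeval call."""
    return ["kubeval", "--kubernetes-version", kubernetes_version, *paths]


def upgrade_plan(paths, versions):
    """One kubeconform command per version, e.g. current and target release."""
    return [kubeconform_command(paths, version) for version in versions]


# Hypothetical manifest; the versions are the ones mentioned in this article.
for command in upgrade_plan(["deployment.yaml"], ["1.18.0", "1.21.0"]):
    print(" ".join(command))
```

Manifests that pass on the current version but fail on the target version are the ones to fix before the upgrade.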

Conclusion: The fact that kubeconform has all the schemas for all the different Kubernetes versions available — and also doesn’t require minikube setup (as kubectl does) — makes it a superior tool when comparing these capabilities to its alternatives.

Other things to consider

Custom Resource Definition (CRD) support
Both kubectl dry-run and kubeconform can validate custom resources defined by a CRD, while kubeval cannot. According to the kubeval docs, you can pass a flag to ignore missing schemas, so kubeval will not fail when only some of the manifests in a batch are custom resources.
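
The effect of ignoring missing schemas can be sketched in a few lines of Python. This is a conceptual toy, not kubeval's code: schemas are reduced to a list of required top-level keys per kind, and the Widget kind stands in for an arbitrary custom resource. In kubeval, the corresponding flag is --ignore-missing-schemas as far as I know.

```python
def validate_all(manifests, required_keys_by_kind, ignore_missing_schemas=False):
    """Validate each manifest; kinds without a schema are skipped or reported."""
    results = []
    for manifest in manifests:
        kind = manifest.get("kind")
        if kind not in required_keys_by_kind:
            # No schema available, which is the typical case for a custom resource.
            status = "skipped" if ignore_missing_schemas else "error: no schema found"
        else:
            missing = [key for key in required_keys_by_kind[kind] if key not in manifest]
            status = ("invalid: missing " + ", ".join(missing)) if missing else "valid"
        results.append((kind, status))
    return results


# Hypothetical inputs: one built-in kind with a schema, one custom kind without.
SCHEMAS = {"Service": ["apiVersion", "kind", "metadata"]}
MANIFESTS = [
    {"apiVersion": "v1", "kind": "Service", "metadata": {"name": "web"}},
    {"apiVersion": "example.com/v1", "kind": "Widget", "metadata": {"name": "w1"}},
]

print(validate_all(MANIFESTS, SCHEMAS))
print(validate_all(MANIFESTS, SCHEMAS, ignore_missing_schemas=True))
```

Note that skipping is not validating: with the flag set, the custom resource passes through unchecked.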

Documentation
Kubeval is a more popular project than kubeconform, and therefore, its community and documentation are more extensive. Kubeconform doesn't have official docs, but it does have a well-written README file that explains its capabilities well. Interestingly, although Kubernetes native tools like kubectl are usually well documented, it was really hard to find the information needed to understand how the dry-run flag actually works and what its limitations are.

Conclusion: Although it’s not as famous as kubeval, the CRD support and good-enough documentation make kubeconform the winner in my opinion.

Comparison summary

Item/Tool	kubeval	kubeconform	dry-run client	dry-run server
Misconfigurations coverage	+/-	+/-	-	+
Benchmark speed test	+/-	+	+/-	-
Kubernetes versions support	-	+	+/-	+/-
CRD support	-	+	+	+
Documentation	+	+/-	-	-

Now that you know the pros and cons associated with each tool, here are some best practices for how to best leverage them within your Kubernetes production-scale development flow.

Strategies for validating Kubernetes schema using these tools

⬅️ Shift-left: When possible, the best setup is if you can run kubectl --dry-run=server on every code change, but you probably can’t do it because you can’t allow every developer or CI machine in your organization to have a connection to your cluster. So, the second-best effort is to run kubeconform.
🚔 Because kubeconform doesn’t cover all common misconfigurations, it’s recommended to run it with a policy enforcement tool on every code change to fill the coverage gap.
💸 Buy vs. build: If you enjoy the engineering overhead, then kubeconform + conftest is a great combination of tools to get good coverage. Alternatively, there are tools that can provide you with an out-of-the-box experience to help you save time and resources, such as Datree⁵ (whose schema validation is powered by kubeconform).
🚀 During the CD step, it shouldn’t be a problem to have a connection with your cluster, so you should always run kubectl --dry-run=server before deploying your new code changes.
👯 Another option for using kubectl dry-run in server mode, without having a connection to your Kubernetes environment, is to run minikube + kubectl --dry-run=server. The downside of this hack is that you also need to set up the minikube cluster like prod (same volumes, namespaces, etc.), or you’ll encounter errors when trying to validate your Kubernetes manifests.
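
The decision logic in the list above can be summarized in a short Python sketch. The function name, the stage labels ('ci' and 'cd') and the returned step descriptions are made up for this example; it only encodes the recommendations from this article and is not a pipeline implementation.

```python
def validation_steps(stage, cluster_reachable):
    """Return the checks to run, in order, for a pipeline stage ('ci' or 'cd')."""
    if stage not in ("ci", "cd"):
        raise ValueError("stage must be 'ci' or 'cd'")
    if stage == "cd" or cluster_reachable:
        # With cluster access, server-side dry-run gave the widest coverage.
        return ["kubectl --dry-run=server"]
    # Offline: schema validation plus a policy tool to fill the coverage gap.
    return ["kubeconform", "policy enforcement tool"]


print(validation_steps("ci", cluster_reachable=False))
print(validation_steps("cd", cluster_reachable=True))
```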

GRATITUDE

Thank you to Yann Hamon for creating kubeconform - it’s awesome!
This article wouldn’t be possible without you. Thank you for all of your guidance.

¹ All schema validation tests were performed against Kubernetes version 1.18.0.
² Because kubeconform is based on kubeval, the two tools produce the same results when run against the files with the misconfigurations. kubectl is a single tool, but each mode (client or server) produces a different result, as the table shows.
³ Server mode didn’t mark the file as valid (exit code 1), but the error message is wrong: Kind=pod doesn't support dry-run
⁴ All benchmark tests were performed on my MacBook Pro with a 2.3 GHz Quad-Core Intel Core i7 processor.
⁵ Disclaimer: self-promotion here :)

Oldest comments (6)

Shimon Tolts Datree • Jun 2 '21

Kubeconform seems like a more modern solution than kubeval imo 🤷‍♂️

Eyar Zilberman Datree • Jun 2 '21 • Edited

don't forget that kubeconform was inspired by kubeval, but I also like this tool because the primary maintainer (yannh) is still active in this project. In contrast, kubeval primary maintainer (garethr) is more involved in his other project right now...

Roman Labunsky Datree • Jun 2 '21

The ability to test for different k8s versions is priceless, planning a cluster upgrade and easily knowing what's about to change can save so much time and make for a smooth upgrade experience. Any solution that doesn't solve this easily is not worth it.

Eyar Zilberman Datree • Jun 2 '21

I agree, but this functionality is only relevant when you want to upgrade your Kubernetes cluster, and you want to plan your migration (which config needs to change).
Usually, this is not the main use-case for schema validation, so I don't recommend only to consider this factor when choosing the right tool for you.

Dima Brusilovsky • Jun 2 '21

Great comparison. It's a shame that kubectl doesn't have a simple validation solution for their own project. How great would it be if they had a simple and easy-to-use solution?!
Great to see the community pitching in

Eyar Zilberman Datree • Jun 3 '21

I think that kubeval and kubeconform are in fact the community pitching in :)
I'm more surprised by the lack of proper documentation for the kubectl --dry-run flag.