With the success of Kubernetes (K8s), many teams have gone from operating a handful of clusters to operating “fleets” of K8s clusters, both on-premises and in the cloud. As if the growing pains of scaling clusters rapidly weren’t bad enough, running a growing number of applications across different clouds, K8s distributions, and add-ons creates significant management challenges.
It can be difficult to automate fleet-wide operations or to get a unified, enterprise-wide view of cluster operations and health information. You may find it difficult to maintain configuration consistency at both the cluster and application level across clusters, while also accommodating unique requirements imposed by various internal teams.
This blog explores the concepts behind Kubernetes fleet management and introduces four pillars that are essential for fleet management success.
What is a Kubernetes Fleet?
A simple definition is that a Kubernetes fleet is a logical set of clusters to be managed as a single domain. Your fleet could consist of all the K8s clusters and applications across your company, the set of clusters and apps that you and your team are responsible for, or a set of clusters that all have similar management needs.
Depending on how your company is organized and how it operates, a fleet might include clusters used for development, test, and production. It might also include clusters running on-premises, in one or more public clouds, and in edge locations like retail stores, warehouses, distribution centers, or regional offices.
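One way to picture "a logical set of clusters" is as a label query over a cluster inventory. The sketch below is only an illustration of that idea: the `Cluster` class, the `select_fleet` helper, the cluster names, and the label keys are all hypothetical and are not taken from any particular fleet management product.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """One cluster in the inventory; names and labels are illustrative."""
    name: str
    labels: dict = field(default_factory=dict)


def select_fleet(inventory, **selector):
    """Return the clusters whose labels match every key/value in selector."""
    return [
        c for c in inventory
        if all(c.labels.get(k) == v for k, v in selector.items())
    ]


# Hypothetical inventory spanning cloud, on-premises, and edge locations.
INVENTORY = [
    Cluster("dev-east", {"env": "dev", "location": "cloud"}),
    Cluster("prod-east", {"env": "prod", "location": "cloud"}),
    Cluster("prod-dc1", {"env": "prod", "location": "on-prem"}),
    Cluster("store-042", {"env": "prod", "location": "edge"}),
]

# The same inventory can be sliced into different fleets.
prod_fleet = select_fleet(INVENTORY, env="prod")
edge_fleet = select_fleet(INVENTORY, location="edge")
print([c.name for c in prod_fleet])
print([c.name for c in edge_fleet])
```

Because a fleet here is just a query, one cluster can belong to several fleets at once, for example both "production" and "edge".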
The truth is, it’s up to you to define what your fleet looks like. Chances are you already have a pretty good idea, but you may be struggling to operationalize your K8s fleet(s) in a way that allows you to monitor and manage them easily.
What is Kubernetes Fleet Management?
We can define fleet management as a process for managing, monitoring, and governing a heterogeneous fleet of K8s clusters and associated apps. In other words, fleet management is how you move from managing each cluster individually to centrally managing and governing fleet-wide functions such as security, configuration, and monitoring across a large set of Kubernetes clusters.
Pillars of Fleet Management
As you plan your fleet management strategy, there are four broad pillars of Kubernetes fleet management that you should be thinking about. Depending on your industry and organizational needs, some of these may be a higher or lower priority, but every organization likely has some requirements in all four areas. These pillars are the foundation of fleet management: Automation, Security, Visibility, Governance. The following sections examine each of these in turn.
Pillar 1: Automation
Managing Kubernetes with kubectl commands and a few scripts might not be too difficult when you only have a few clusters, but this approach simply doesn’t scale. By automating and standardizing common cluster and application operations, you can manage more clusters with less effort while avoiding misconfigurations caused by human error.
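To make the scaling problem concrete, here is a minimal sketch of what "the same operation on every cluster" looks like when scripted by hand. It only builds the kubectl commands and prints them. The context names and the manifest path are hypothetical, and the `fleet_apply_commands` helper is an illustration rather than part of any tool.

```python
import shlex


def fleet_apply_commands(contexts, manifest_path, dry_run=True):
    """Build one `kubectl apply` command per kubeconfig context."""
    commands = []
    for ctx in contexts:
        cmd = ["kubectl", "--context", ctx, "apply", "-f", manifest_path]
        if dry_run:
            # Client-side dry run: kubectl reports what it would apply
            # without changing the cluster.
            cmd.append("--dry-run=client")
        commands.append(cmd)
    return commands


# Hypothetical context names; in practice these come from your kubeconfig.
CONTEXTS = ["dev-east", "prod-east", "prod-dc1"]

for cmd in fleet_apply_commands(CONTEXTS, "namespace.yaml"):
    print(shlex.join(cmd))
```

Each command list could be handed to `subprocess.run`, but a loop like this has no retries, no rollout ordering, and no record of which clusters succeeded, which is where purpose-built automation starts to pay off.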
Here are some common Kubernetes and application tasks that you may want to automate:
Pillar 2: Security
Mission-critical clusters and applications running in production require the highest level of security and control. As your fleet grows, your enterprise is exposed to security risks that weren’t evident when you were operating only a few clusters.
Applying zero-trust principles is the best way to secure your K8s environment. Kubernetes provides the hooks needed for zero-trust, including authentication, RBAC authorization, admission control, and audit logging. Unfortunately, keeping all the individual elements correctly configured and aligned across dozens of clusters is a challenge, especially when multiple workloads and Kubernetes distributions are involved.
You need to be able to integrate with your existing SSO, manage authorization with role-based access control (RBAC) across your fleet, and provide an end-to-end audit trail.
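As a small illustration of fleet-wide RBAC, the sketch below generates one ClusterRoleBinding manifest that grants Kubernetes' built-in `view` ClusterRole to a group from your identity provider. The group name `platform-readers` and the `view_binding_for_group` helper are hypothetical. The idea is that the same generated manifest is applied to every cluster so access stays consistent.

```python
import json


def view_binding_for_group(group_name):
    """Return a ClusterRoleBinding granting the built-in `view` role to a group.

    Assumes group_name is already a valid Kubernetes object name fragment
    (lowercase letters, digits, and hyphens).
    """
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "ClusterRoleBinding",
        "metadata": {"name": f"{group_name}-view"},
        "roleRef": {
            "apiGroup": "rbac.authorization.k8s.io",
            "kind": "ClusterRole",
            "name": "view",
        },
        "subjects": [
            {
                "apiGroup": "rbac.authorization.k8s.io",
                "kind": "Group",
                "name": group_name,
            }
        ],
    }


# Hypothetical SSO group; kubectl accepts JSON manifests as well as YAML.
print(json.dumps(view_binding_for_group("platform-readers"), indent=2))
```

Generating the binding from a single function, instead of hand-editing it per cluster, is one simple way to keep authorization from drifting across the fleet.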
See our recent blogs on Securing Kubernetes: Applying Zero-Trust Principles to Your Kubernetes Environment and Secure Operations for Kubernetes Clusters and Applications for more on this topic.
Pillar 3: Visibility
Your team can’t manage and support what they can’t see. You need a single, fleet-wide view of Kubernetes clusters and applications that includes resources consumed, user and access activity, critical alerts, and overall health.
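A fleet-wide view is, at its core, a roll-up of per-cluster reports. The sketch below assumes each cluster already reports a small status record. The field names (`health`, `cpu_cores_used`, `critical_alerts`) and the sample values are hypothetical and would map onto whatever your monitoring stack actually exposes.

```python
from collections import Counter


def fleet_summary(reports):
    """Roll per-cluster status reports up into one fleet-wide summary."""
    health = Counter(r["health"] for r in reports)
    return {
        "clusters": len(reports),
        "health": dict(health),
        "cpu_cores_used": sum(r["cpu_cores_used"] for r in reports),
        "critical_alerts": sum(r["critical_alerts"] for r in reports),
        # Clusters an operator should look at first.
        "needs_attention": sorted(
            r["name"] for r in reports
            if r["health"] != "healthy" or r["critical_alerts"] > 0
        ),
    }


# Hypothetical reports, one per cluster.
REPORTS = [
    {"name": "dev-east", "health": "healthy", "cpu_cores_used": 12, "critical_alerts": 0},
    {"name": "prod-east", "health": "healthy", "cpu_cores_used": 96, "critical_alerts": 1},
    {"name": "store-042", "health": "degraded", "cpu_cores_used": 4, "critical_alerts": 0},
]

print(fleet_summary(REPORTS))
```

The aggregation itself is simple. The hard part at fleet scale is collecting those reports reliably from every cluster.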
While there are a number of open source and commercial Kubernetes monitoring tools that can make monitoring easier, implementing these tools in a large fleet creates significant complexity. Many organizations find that monitoring as a service is a better alternative as their fleet grows. The right fleet monitoring tool should give you all the metrics you need in one place, while integrating with whatever monitoring you already use.
See the recent blog Best Practices, Tools, and Approaches for Kubernetes Monitoring for more on fleet-wide visibility and monitoring.
Pillar 4: Governance
As the complexity of your K8s fleet grows, it becomes increasingly difficult to ensure you’re complying with security policies and industry regulations. Organizations, especially those in regulated industries like financial services and healthcare, need fleet management tools that can facilitate governance and enforce policy-based management.
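Policy-based management can be sketched as a set of rules evaluated against every cluster's recorded configuration. The example below is a simplified illustration: the policy keys (`min_k8s_version`, `required_labels`), the cluster records, and the helper functions are all hypothetical, and real policy engines are considerably richer.

```python
def parse_version(text):
    """Turn '1.27' or 'v1.27.3' into a comparable (major, minor) tuple."""
    parts = text.lstrip("v").split(".")
    return int(parts[0]), int(parts[1])


def policy_violations(cluster, policy):
    """Return a list of human-readable violations for one cluster."""
    violations = []
    if parse_version(cluster["k8s_version"]) < parse_version(policy["min_k8s_version"]):
        violations.append(
            f"Kubernetes {cluster['k8s_version']} is older than "
            f"required {policy['min_k8s_version']}"
        )
    for label in policy["required_labels"]:
        if label not in cluster["labels"]:
            violations.append(f"missing required label: {label}")
    return violations


def audit_fleet(clusters, policy):
    """Map each non-compliant cluster name to its list of violations."""
    report = {}
    for cluster in clusters:
        found = policy_violations(cluster, policy)
        if found:
            report[cluster["name"]] = found
    return report


# Hypothetical policy and fleet records.
POLICY = {"min_k8s_version": "1.27", "required_labels": ["owner", "env"]}
FLEET = [
    {"name": "prod-east", "k8s_version": "1.28", "labels": {"owner": "payments", "env": "prod"}},
    {"name": "store-042", "k8s_version": "1.25", "labels": {"env": "prod"}},
]

print(audit_fleet(FLEET, POLICY))
```

Versions are compared as integer tuples instead of strings, so that 1.9 correctly sorts before 1.27. The sketch assumes plain numeric major and minor components.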
Here are some common Kubernetes and application governance capabilities to consider:
Practical Considerations for K8s Fleet Management
The four pillars just described are the bedrock of fleet management, but as you plan your Kubernetes fleet management strategy, there are several more mundane yet important considerations as well. Most organizations require some or all of the following:
Centralized management: Monitor and manage your fleet from a central console using the same core tooling everywhere. While deploying several best-of-breed tools isn’t out of the question, each additional tool adds complexity and increases the learning curve.
Flexibility to run on any infrastructure: Fleet management should adapt to your requirements instead of forcing you to adapt your operations.
Integration with a broad range of tooling: Any solution(s) must integrate with the K8s distributions and tools you already rely on.
Easy to consume: Some tools themselves require a lot of management. If you have to spend a lot of time installing, configuring, and updating tools, agents, and so on, that’s time you won’t get back. Many organizations prefer a Software-as-a-Service (SaaS) model that reduces overhead and the level of expertise required.
Want to learn more? Read on here!