You have probably read plenty of articles praising Kubernetes. Kubernetes has indeed played a positive role in the evolution of software development, especially in continuous integration and continuous delivery (CI/CD). It took CI/CD to a new level, letting development teams increase the number of releases from a few per year to a few per month.
We want to give you a balanced perspective on this tool. Spoiler: this blog post is going to unveil quite a few of Kubernetes’ drawbacks. What are they, and how can you use this great technology without pain? Let us explain our view.
CI/CD is a process that needs a lot of coordination. When you use containers, you isolate applications or microservices. But at the deployment stage, you want them to work together again. Kubernetes is one of the tools that enable this. It does not undo the isolation, though: the containers stay as they are, and freshly developed applications can be added alongside the existing ones.
Another important task of Kubernetes is to manage your computing resources. When you have a lot of applications to run, you inevitably want to use every bit of spare CPU capacity. The Kubernetes architecture is designed exactly for this.
Kubernetes can still fail at efficient resource management. Such problems are mostly the result of misconfiguration. We will get to this in the next section.
Once you get Kubernetes working smoothly for you, the benefits will manifest themselves. The problem is, there is a long and thorny way ahead of you if you decide to try doing Kubernetes on your own.
First, you will have to set up the complete infrastructure. You will need to provision a few bare-metal or virtual machines and then tie them together. In this setup, you’ll have control-plane traffic and container traffic that must not mix, so you need to work out a network configuration that keeps them apart.
Let us take a step back and go through the main Kubernetes concepts. They can help you to understand why the network configuration always becomes tricky when you set up your Kubernetes clusters.
We will proceed from the smallest to the biggest concepts to help you understand Kubernetes and its architecture. Containers are wrappers that enclose a piece of your application together with the minimum set of operating system components it needs to run.
Containers are used as alternatives to virtual machines, or alongside them, to develop applications more quickly and make them more OS-agnostic. Outside of CI/CD, containers are used to scale applications.
A container does not map one-to-one to a machine. A single physical or virtual machine can host many containers, each using only a share of its CPU and memory, while each individual container runs on exactly one machine. That’s why we had to outline containers as a separate concept.
But containers are not used alone. They are just a part of the bigger picture.
The next level in the Kubernetes universe is pods. A pod is one or more containers that are tied together and always scheduled as a unit. Typically, a pod holds one main container plus any tightly coupled helpers, and many such pods together can replace a monolithic application.
Why do we need this concept? A pod is the smallest unit that Kubernetes schedules, and it always runs on a single physical or virtual machine. Oddly enough, every pod gets its own IP address, something that would traditionally identify a physical machine or a server, and all containers in the pod share it.
Containers in a pod share the same network resources and can share storage volumes. Thus, pods and containers are the abstract workhorses of your Kubernetes setup. They need to be mapped to your physical resources.
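To make the idea of containers sharing one pod more tangible, here is a minimal sketch that builds a Pod manifest as a Python dictionary and prints it as JSON. The pod name, container names, image tags, and the `/data` mount path are our own illustrative assumptions, not something your setup requires.

```python
import json


def build_pod_manifest(name: str) -> dict:
    """Return a Pod manifest with two containers that mount the same volume."""
    # Both containers mount the same emptyDir volume, so files written by
    # one container are visible to the other.
    shared_mount = {"name": "shared-data", "mountPath": "/data"}
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "volumes": [{"name": "shared-data", "emptyDir": {}}],
            "containers": [
                {
                    "name": "web",  # hypothetical main container
                    "image": "nginx:1.25",
                    "volumeMounts": [shared_mount],
                },
                {
                    "name": "log-shipper",  # hypothetical helper container
                    "image": "busybox:1.36",
                    "command": ["sh", "-c", "tail -F /data/access.log"],
                    "volumeMounts": [shared_mount],
                },
            ],
        },
    }


print(json.dumps(build_pod_manifest("web-with-sidecar"), indent=2))
```

Because the two containers live in one pod, they also share the pod’s IP address and can reach each other over localhost.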
That’s where nodes come into play. Nodes are the physical or virtual machines that run your pods. It’s already confusing enough, since you have virtual machines on physical ones eating the resources of the latter. And now you have pods spread across many machines of whatever type, each pod placed on one of them. But we are not finished.
Kubernetes technology would not rock the IT industry without this last concept. Clusters take care of the scalability of your applications. A cluster is a set of nodes plus the control plane that manages them. The nodes can be spread across several locations, but the cluster makes them work together, providing you with computing power when you need it and shrinking back when demand drops.
Clusters were made to use free CPU capacity on demand. Their flexibility is intended to speed up applications. When you can run your applications faster, and run and test more of them at the same time, you can release your software updates faster and more often.
Why can’t every company reach this CI/CD ideal easily?
As mentioned above, you have to arrange your nodes so that the pods and the containers inside them run smoothly. When you decide to run Kubernetes on your own, you will have to take care of setting up the clusters yourself: for instance, renting the physical servers and configuring the network.
On top of this, you have to configure many different types of permissions and access rights. It is not only about the users who have to be able to run deployments or tests. Not every container should have the same access rights, either. That would be extremely dangerous: once a container has a bug or a security issue, it could spread the problem across the whole network.
By default, Kubernetes won’t send you any notifications if some of your deployments fail. You’ll have to set up additional tools to monitor log files and be able to react quickly to any problems.
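As a rough illustration of what such tooling has to check, the sketch below inspects a Deployment object of the shape returned by `kubectl get deployment <name> -o json` and reports obvious problems. The function name and the message wording are our own assumptions; a real setup would feed the result into an alerting tool rather than print it.

```python
from typing import List


def deployment_problems(deployment: dict) -> List[str]:
    """Return human-readable problems found in a Deployment's status."""
    problems = []
    name = deployment.get("metadata", {}).get("name", "<unknown>")
    desired = deployment.get("spec", {}).get("replicas", 1)
    status = deployment.get("status", {})

    # Fewer available replicas than requested means the app is degraded.
    available = status.get("availableReplicas", 0)
    if available < desired:
        problems.append(f"{name}: {available}/{desired} replicas available")

    # A stalled rollout is reported through the Progressing condition.
    for cond in status.get("conditions", []):
        if (
            cond.get("type") == "Progressing"
            and cond.get("status") == "False"
            and cond.get("reason") == "ProgressDeadlineExceeded"
        ):
            problems.append(f"{name}: rollout exceeded its progress deadline")
    return problems


# Hand-written sample data, not real cluster output.
sample = {
    "metadata": {"name": "checkout"},
    "spec": {"replicas": 3},
    "status": {"availableReplicas": 1, "conditions": []},
}
for problem in deployment_problems(sample):
    print(problem)
```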
Network, security, and monitoring already sound like a full-time job. In a big company with a lot of deployments going on, you will need a whole team to manage Kubernetes clusters.
If you are lucky, you already have specialists who can handle all of that for you. In most cases, though, you will have to hire new people or reserve a few months for your developers to learn Kubernetes.
Personnel costs are not the only expense. As already mentioned, the infrastructure needs a substantial investment, too.
Why do people still use Kubernetes?
If you have a lot of deployments to manage, and if you have configured Kubernetes properly, there are plenty of advantages you’ll enjoy.
On the one hand, Kubernetes works seamlessly with most of the popular clouds and server providers. Kubernetes is open-source, and its enthusiasts keep enhancing it.
On the other hand, Kubernetes works with different container runtimes, not only with Docker. This helps you avoid vendor lock-in.
Kubernetes orchestrates resources in two ways. It schedules pods onto nodes that have free capacity. It also distributes incoming traffic across the running containers and, more importantly, does so without your active participation. This is known as load balancing, and it relies on different algorithms to redirect traffic to less busy instances and prevent the busy ones from overloading.
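The placement half of this can be pictured with a toy example. The sketch below is not the real Kubernetes scheduler, which runs a much richer set of filtering and scoring steps; it only shows the basic idea of discarding nodes that lack capacity and then preferring the one with the most room left. The `Node` class and `pick_node` function are hypothetical names of ours.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    name: str
    free_cpu_millicores: int
    free_memory_mib: int


def pick_node(nodes: List[Node], cpu_request: int, memory_request: int) -> Optional[Node]:
    """Pick a node for a pod, or return None if no node has enough room."""
    # Step 1: keep only nodes that can satisfy both requests.
    candidates = [
        n for n in nodes
        if n.free_cpu_millicores >= cpu_request and n.free_memory_mib >= memory_request
    ]
    if not candidates:
        return None
    # Step 2: prefer the node with the most CPU left after placement,
    # using leftover memory as a tie-breaker.
    return max(
        candidates,
        key=lambda n: (n.free_cpu_millicores - cpu_request, n.free_memory_mib - memory_request),
    )


nodes = [Node("node-a", 500, 1024), Node("node-b", 2000, 4096), Node("node-c", 1500, 8192)]
chosen = pick_node(nodes, cpu_request=1000, memory_request=2048)
print(chosen.name if chosen else "no node fits")
```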
Kubernetes can autonomously deal with containers that stop working due to a failure in hardware or software, restarting or replacing them as needed. This ability is important for the post-deployment phase, keeping your applications up and running.
Last but not least, Kubernetes can scale automatically once autoscaling is configured. When an application needs more CPU or memory, Kubernetes can add pod replicas or raise their resource allocations, and a cluster autoscaler can add nodes, preventing your application from getting stuck.
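Under the hood, this self-healing comes down to repeatedly comparing the state you asked for with the state that actually exists. The following sketch is a heavily simplified picture of one such comparison, not the actual controller code; the `reconcile` function and its return format are assumptions made for illustration.

```python
from typing import Dict


def reconcile(desired_replicas: int, pod_phases: Dict[str, str]) -> dict:
    """Compare desired and actual state and say what needs to change.

    pod_phases maps a pod name to its phase, e.g. "Running" or "Failed".
    """
    # Pods that are running or still starting count towards the target.
    live = [name for name, phase in pod_phases.items() if phase in ("Running", "Pending")]
    # Pods in a broken state should be cleaned up and replaced.
    broken = [name for name, phase in pod_phases.items() if phase in ("Failed", "Unknown")]
    missing = max(0, desired_replicas - len(live))
    return {"delete": sorted(broken), "create": missing}


actions = reconcile(3, {"web-1": "Running", "web-2": "Failed", "web-3": "Running"})
print(actions)
```

Running this comparison in a loop is what lets the system recover without anyone being paged to restart a container by hand.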
Cluster components communicate with each other through the Kubernetes API, and the control plane nodes use the same API to talk to the outside world. It is possible to set up TLS encryption for both types of network traffic. But this is something you have to take care of yourself.
The result is worth it. Attackers cannot simply eavesdrop on or tamper with the traffic sent between the involved servers. This makes Kubernetes clusters more robust against cyber attacks and suitable for running real-time business applications and applications that contain very sensitive data.
Although access control is not the easiest part of Kubernetes, the role-based model it uses helps system administrators manage access rights efficiently. The same can be said about containers and pods. They can be granted different permissions, so a single compromised container does not let attackers take over the entire system.
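To give a feeling for the role-based model, here is a small sketch that builds a read-only Role and binds it to a service account. The namespace, role name, and service account name are placeholders we made up; the output is JSON that you could adapt to your own cluster.

```python
import json


def read_only_pod_role(namespace: str, role_name: str) -> dict:
    """Role that only allows reading pods in one namespace."""
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",
        "metadata": {"namespace": namespace, "name": role_name},
        "rules": [
            # The empty string refers to the core API group, where pods live.
            {"apiGroups": [""], "resources": ["pods"], "verbs": ["get", "list", "watch"]}
        ],
    }


def bind_role(namespace: str, role_name: str, service_account: str) -> dict:
    """RoleBinding that grants the role to a single service account."""
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "RoleBinding",
        "metadata": {"namespace": namespace, "name": f"{role_name}-binding"},
        "subjects": [
            {"kind": "ServiceAccount", "name": service_account, "namespace": namespace}
        ],
        "roleRef": {
            "apiGroup": "rbac.authorization.k8s.io",
            "kind": "Role",
            "name": role_name,
        },
    }


# "staging", "pod-reader", and "monitoring-agent" are hypothetical names.
print(json.dumps(read_only_pod_role("staging", "pod-reader"), indent=2))
print(json.dumps(bind_role("staging", "pod-reader", "monitoring-agent"), indent=2))
```

A workload running under that service account can list pods in its namespace but cannot delete them or touch anything elsewhere.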
In the end, we do not want to discourage you from using Kubernetes. We only want to warn you about the pitfalls of trying it out on your own. Kubernetes is very complex to implement, hard to learn, and can be time-consuming to maintain.
It is difficult to weigh its benefits against its drawbacks. From the business perspective, they can be roughly expressed in money, for instance, as the person-day costs of managing Kubernetes in-house. But the downsides may also become a demotivating factor for your team. Nobody wants to feel confused and helpless.
Many companies have gone through painful homemade Kubernetes setups. From our own experience, we have formulated a few best practices for running Kubernetes on your own, and we want to bring them to your attention.
Like every other software system, Kubernetes ships regular updates. They can matter for compatibility with other tools and for security, and they fix known bugs and bring improvements, so keeping your clusters up to date is worth the effort.
This great conductor may perform even better if you tune the resource parameters yourself. In most cases, you will tell Kubernetes to reserve some memory or CPU for a workload, and cap how much it may use, to prioritize tasks in an indirect way. This helps you keep the most important applications running and rein in the unimportant ones if they start to cannibalize the node’s resources.
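In practice, these settings are usually written as resource requests (what is reserved) and limits (the upper cap) on each container. The sketch below checks that a container’s requests do not exceed its limits. The helper names are ours, the sample values are arbitrary, and the parsers deliberately handle only a small subset of the quantity notation Kubernetes accepts.

```python
def cpu_to_millicores(quantity: str) -> int:
    """Convert a CPU quantity such as "250m" or "0.5" to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(round(float(quantity) * 1000))


# Only binary suffixes are handled here; other notations are out of scope.
_MEMORY_UNITS = {"Ki": 1024, "Mi": 1024 ** 2, "Gi": 1024 ** 3}


def memory_to_bytes(quantity: str) -> int:
    """Convert a memory quantity such as "256Mi" or "1Gi" to bytes."""
    for suffix, factor in _MEMORY_UNITS.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # a plain number is interpreted as bytes


def requests_within_limits(resources: dict) -> bool:
    """True if both CPU and memory requests are at or below their limits."""
    req, lim = resources["requests"], resources["limits"]
    return (
        cpu_to_millicores(req["cpu"]) <= cpu_to_millicores(lim["cpu"])
        and memory_to_bytes(req["memory"]) <= memory_to_bytes(lim["memory"])
    )


# Arbitrary example values for one container.
resources = {
    "requests": {"cpu": "250m", "memory": "256Mi"},
    "limits": {"cpu": "1", "memory": "1Gi"},
}
print(requests_within_limits(resources))
```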
If you do not want to set the pod size manually as described in the previous subsection, you can use resizing and autoscaling. They do a similar thing in an automated manner. As you may guess, the only difference is that the parameters may change dynamically, making resource usage even more efficient.
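To show how such automatic scaling arrives at a number, the sketch below applies the core formula documented for the Horizontal Pod Autoscaler: the current replica count multiplied by the ratio of the observed metric to the target metric, rounded up. The real controller also applies a tolerance and stabilization rules that this sketch leaves out, and the function name and clamping parameters are our own.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    min_replicas: int,
    max_replicas: int,
) -> int:
    """Scale the replica count in proportion to metric pressure."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    # If pods run at 90% CPU against a 60% target, we need 1.5x as many.
    raw = math.ceil(current_replicas * (current_metric / target_metric))
    # Never go below the configured minimum or above the maximum.
    return max(min_replicas, min(max_replicas, raw))


print(desired_replicas(4, current_metric=90, target_metric=60, min_replicas=2, max_replicas=10))
```

The same calculation scales down when the observed metric falls below the target, which is how unused capacity gets released again.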
Pods can start even faster if you think your infrastructure setup through carefully. For instance, you can reduce latency by placing the Kubernetes cluster and your container image registry close to each other. The opposite example would be a Kubernetes cluster residing on European servers while the images are pulled from a GitLab registry on a US server. This setup would increase the image pull time considerably.
Apart from this, you can test different Docker image sizes and check node availability before starting your pods.
Logs can provide valuable insights into your Kubernetes clusters. They record successes and failures and help to locate weak points. Since the number of log entries and events can explode for big organizations, we recommend using a data visualization tool on top of a monitoring tool. An established pair is Prometheus, which collects metrics, and Grafana, which visualizes them.
Even with these best practices, running Kubernetes yourself may still sound like a lot of work with little guarantee of getting things right.
The pros and cons of a managed approach depend a lot on your company and your provider. There is no universal list of them. Basically, you will need to pay for the managed service, but considering the risk of unexpected expenses with self-managed Kubernetes, that may not be the deciding factor.
Indeed, you will outsource your clusters to a company that has more expertise and has already incorporated best practices into its infrastructure. As we said before, building the right infrastructure is one of the most cumbersome tasks during the Kubernetes setup. So, for this one, you will definitely want to find a service provider.
Using a managed Kubernetes platform will speed up your deployment pipelines.
With the right Kubernetes provider, you will not only skip the setup phase. You can deploy applications within a very short time without any manual tuning. A managed Kubernetes service allows you to separate production and development environments to secure the uninterrupted running of the former.
Managed Kubernetes clusters will scale automatically, and you do not have to think about expanding your computing power or revising any provider agreements, subscriptions, etc. If you still have questions, you can forward them to the provider’s support. Either they will fix the problem themselves or they will guide you through a solution, sharing their expertise.
The key is the right provider, not just any provider! That’s why, in the final section, we will explain in detail how Engine Yard, a data-driven, NoOps, PaaS solution for deploying and managing applications on AWS, incorporates the advantages and best practices of managed Kubernetes.
Engine Yard has recently launched a new service called Engine Yard Kontainers (EYK). This service is the result of a one-year project during which the company thoroughly built a suitable infrastructure that is now available for its customers.
And you have at least seven good reasons to use it:
Engine Yard carefully monitors all Kubernetes updates and keeps application downtime to a minimum. Besides, it runs not only the latest Kubernetes version, but also those that are necessary to avoid failures on the customer side.
The four resource metrics (minimum and maximum CPU and minimum and maximum memory) are managed through a single metric introduced exclusively by Engine Yard. It is called Optimized Container Unit (OCU) and equals 1 GB of RAM and proportionate CPU.
Engine Yard offers two ways to improve your pod start times. First, it uses an algorithm that optimizes node availability. Second, it provides application stacks that are customized for certain runtime environments.
Engine Yard uses the standard AWS Elastic Load Balancer (ELB) enhanced through an NGINX-based load balancer that is particularly powerful for running web applications. The combination of the two, AWS ELB and NGINX, allows traffic to be distributed even more precisely.
With Engine Yard, you’ll get logging and monitoring set up out-of-the-box. Engine Yard uses such technologies as Fluent Bit, Elasticsearch, Kibana, Prometheus, and Grafana to collect and visualize your logging data and make it work for you. Alerts are included!
Using Engine Yard won’t add any new tasks on top of what you are already doing: simply push a commit or new branch to your Git repository and see it deployed in the cloud. It is possible to roll back buggy or problematic updates without interrupting your other CI/CD routines.
While your rivals are busy finding DevOps specialists, you can deploy your applications with the existing team and focus on the development itself. Application stacks will also provide you with prebuilt container images that are tailored for the most common use cases.
Engine Yard helps you get the best out of containerization and Kubernetes. Despite all the drawbacks, these two remain the leading technologies in CI/CD.
Engine Yard is ready to discuss your needs and provide a consultation. Feel free to use any resources on our website, including but not limited to blog posts, whitepapers, and documentation that can help you in making your choice. Once you start working with us, our support team is here to handle any issues and keep your Kubernetes clusters up and running for a better deployment experience.