- Virtualization
- Resource Isolation
- Resource Control
- Containers
- Docker Joins The Stage
- Docker Growth
- Introducing Kubernetes
- Kubernetes And The Cloud
- Component Separation
- The Future?
At some point in a DevOps engineer's career they will hear about Kubernetes. It's not a matter of if, but when. With containers being the new hot thing and Kubernetes orchestrating their deployment, it's easy to see the reason for its popularity. Even so, Kubernetes is an accumulation of previous technologies. In this article I'd like to discuss the earlier ways of handling similar application hosting problems and show how they led up to a lot of what Kubernetes offers today.
Virtualization
Before containers were popular, virtualization was the way to split a single server into multiple "containers". Early on this relied on paravirtualization, which is what the first EC2 instances used: virtual machines interfaced with the operating system through a middle man that translated the guest OS's calls into what the host OS expected. Given the overhead of that translation, these virtual machines couldn't keep up with software running directly on the hardware. While this wasn't a big issue for some workloads, it made wide adoption difficult wherever applications cared about performance.
Hardware virtualization then started to gain traction through CPU extensions (such as Intel VT-x and AMD-V) built specifically to improve performance. Technologies such as KVM were created to better integrate the kernel with these new CPU features. However, some hardware interactions still kept it from matching bare metal. Another alternative was to use the same hardware, but isolate the application itself.
Resource Isolation
Resource isolation primarily started off with isolating how a process could operate. The most popular example of this was chroot (1979). More specialized examples included FreeBSD jails (2000) and Solaris Zones (2005), which still kept some of the virtualization aspects. chroot, which stands for change root, allowed a process to have its root directory changed, effectively isolating it at the filesystem level.
For applications to run properly inside a chroot on a Linux system, they need access to special paths that provide system information through a filesystem interface, including procfs, devpts, and sysfs. These could be mounted inside the chroot with different options than the host system used, and simpler variants could be done with bind mounts. The underlying application also needed access to its dynamically linked libraries to run, which usually meant copying a C library implementation (e.g. glibc) and any other shared objects into the chroot, making setup more involved than dropping in a single binary.
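To make that concrete, here's a minimal sketch in Go of roughly what that setup involved, assuming a hypothetical jail directory at /srv/jail that already contains a shell, its libraries, and an empty proc directory (and that the program runs as root):

```go
package main

import (
	"os"
	"os/exec"
	"syscall"
)

func main() {
	// Hypothetical jail path; its contents must be prepared ahead of time.
	root := "/srv/jail"

	// Mount procfs inside the jail so process-related tooling keeps working.
	if err := syscall.Mount("proc", root+"/proc", "proc", 0, ""); err != nil {
		panic(err)
	}

	// Change this process's root directory to the jail and move into it.
	if err := syscall.Chroot(root); err != nil {
		panic(err)
	}
	if err := os.Chdir("/"); err != nil {
		panic(err)
	}

	// Any program started now only sees files inside the jail.
	cmd := exec.Command("/bin/sh")
	cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, os.Stderr
	if err := cmd.Run(); err != nil {
		panic(err)
	}
}
```

The fiddly part isn't the chroot call itself, it's preparing the directory tree with the right binaries, libraries, and mounts, which is exactly the packaging gap described above.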
From a security standpoint this setup was beneficial for webservers, which only needed access to their HTML, CSS, JS, and any dynamic content generation files. If a basic malicious actor obtained access to a local shell, they would only be able to see files inside the chroot. That's not to say that chroots are some kind of unbreakable fortress. Another issue was that there wasn't an easy packaging solution that replicated the ease of use of a virtual machine image. This complexity only increased as more applications were added to a server. Out of the box, chroots also don't provide any kind of resource quota mechanism like you would expect from modern day containers.
Resource Control
Several technologies were introduced to the Linux kernel to help address some of these issues. Namespaces were introduced in 2002 for more fine grained isolation. They allow resource visibility to be scoped at the process level: one process can see one set of resources while another process sees a different set. Next are resource quotas, which are handled by cgroups, released in 2007. They provide a special filesystem mount for configuring settings such as which CPUs a process can utilize, how much memory and CPU time it can use, and how much network bandwidth it gets. The final set of controls are capabilities, introduced in 1999. While a foundation for process isolation, their original purpose was to restrict privileged user accounts. Despite the early introduction, the full set of capabilities available today accumulated over several kernel releases. While these are powerful technologies, they are also fairly complicated to manage in a scalable manner. They also shared chroot's problem in that there wasn't a very easy way to package everything together.
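As a rough illustration of the first two (not how any particular container runtime implements them), the sketch below creates a cgroup with a memory cap and then starts a shell inside new UTS, PID, and mount namespaces. It assumes a Linux host with cgroup v2 mounted at /sys/fs/cgroup, root privileges, and an arbitrary group name of "demo":

```go
package main

import (
	"os"
	"os/exec"
	"path/filepath"
	"syscall"
)

func main() {
	// cgroups: create a group and cap its memory at 64 MiB (cgroup v2 layout).
	cg := "/sys/fs/cgroup/demo"
	if err := os.MkdirAll(cg, 0755); err != nil {
		panic(err)
	}
	if err := os.WriteFile(filepath.Join(cg, "memory.max"), []byte("67108864"), 0644); err != nil {
		panic(err)
	}

	// namespaces: start a shell with its own hostname, PID numbering, and mount table.
	cmd := exec.Command("/bin/sh")
	cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, os.Stderr
	cmd.SysProcAttr = &syscall.SysProcAttr{
		Cloneflags: syscall.CLONE_NEWUTS | syscall.CLONE_NEWPID | syscall.CLONE_NEWNS,
	}
	if err := cmd.Run(); err != nil {
		panic(err)
	}

	// To enforce the limit on the shell, its PID would also be written to
	// /sys/fs/cgroup/demo/cgroup.procs; omitted here to keep the sketch short.
}
```

Container runtimes automate exactly this kind of bookkeeping, combined with the chroot-style filesystem setup from the previous section.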
Containers
While Docker helped drive container adoption, it was LXC that provided the foundation to make the system work. In fact, early versions of Docker used it as a backend. LXC packaged kernel level features such as cgroups and namespaces into container-like isolation. Despite its features, LXC still lacked the modern interface you'd expect for easily deploying containers. There was also a lack of managed services from the major cloud providers to help with overall adoption.
Docker Joins The Stage
The initial release of Docker was on March 20th, 2013. It's important to note that containers are more of a concept, and Docker is one implementation. What Docker brought to the table was a container image hosting service as well as software to handle the more difficult parts of the container setup process. Docker also provided a very simple image definition format: the Dockerfile. Compared to what was needed for virtual machine style builds, Dockerfile builds simplified the process and used incremental, layered diffs, which enabled more modular organization of images. Images hosted on Docker's platform could be pulled into this process, allowing for vendor tie-in. Having a business backing it also helped with corporate adoption of containers.
Docker Growth
Despite this great set of features, Docker would still take some years to become the popular solution it is today, partly due to security and scalability concerns with the original iteration. In 2014, Docker Compose was released as a way to organize Docker containers into a more "project" like layout, which further helped with organizing containers alongside the layered image functionality. This was improved further with Docker Swarm, released as part of Docker 1.12 in 2016, which allowed a more scalable cluster setup for Docker containers.
Introducing Kubernetes
Kubernetes was first released on September 9, 2014. This release timeline is part of what helped it gain a foothold over Docker Swarm. It was an open source version of an internal Google project. It presented container orchestration features in a more modular fashion along with scaling functionality: you can choose how your networking stack works, your load balancing, your container runtime, and your filesystem interfaces. The availability of an API allowed for more programmatic interaction with orchestration, making it tie in very well with CI/CD solutions. However, its big issue is the complexity of setup. Putting together a Kubernetes cluster with basic functionality is certainly no easy feat.
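As a small example of that programmatic access, here's a sketch using the official client-go library. It assumes a kubeconfig at the default location and a reachable cluster, and simply lists pods in the (arbitrarily chosen) default namespace:

```go
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Load ~/.kube/config, the same credentials kubectl uses.
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	clientset, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	// List pods in the default namespace, equivalent to `kubectl get pods`.
	pods, err := clientset.CoreV1().Pods("default").List(context.TODO(), metav1.ListOptions{})
	if err != nil {
		panic(err)
	}
	for _, p := range pods.Items {
		fmt.Println(p.Name, p.Status.Phase)
	}
}
```

CI/CD tooling builds on this same API to apply manifests, watch rollouts, and scale workloads.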
Kubernetes And The Cloud
Due to the amount of effort required not just to set up Kubernetes but also to migrate to it, early enterprise adopters relied on Google Kubernetes Engine (originally Google Container Engine) to abstract that effort away. The product was released shortly after Kubernetes itself, with the first release on November 4th, 2014. Unfortunately, due to Google Cloud's roughly two-year gap behind AWS's public release, this still made wider Kubernetes adoption more difficult. Interestingly, AWS's first step into the container space was Elastic Container Service (ECS) in 2015. While not tied to Kubernetes specifically, its ease of deployment made it an interesting competitor for simpler container workloads. Google's closest alternative to it, Google Cloud Run, would not see a release until 2019.
AWS would then release Elastic Kubernetes Service (EKS) in 2018, and Azure followed shortly after with its own Azure Kubernetes Service (AKS) in the same month. With the three major cloud providers all offering a managed service, Kubernetes saw a large spike in adoption over the following years, so much so that a majority of DevOps job descriptions now list it as part of the technology stack.
Component Separation
On the backend, Kubernetes used to rely on Docker for its container runtime. One of Kubernetes' modular features is the ability to plug in any runtime that implements the Container Runtime Interface, or CRI. The problem was that Docker didn't implement the CRI itself, so the Kubernetes project had to maintain a shim (dockershim) to translate between the two. Instead, users can run the popular containerd or CRI-O runtimes, which follow the Open Container Initiative (OCI) guidelines on container formats.
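To give a feel for what talking to a runtime directly looks like, here's a rough sketch using containerd's Go client. The socket path is containerd's default, and the namespace and image are arbitrary examples:

```go
package main

import (
	"context"
	"fmt"

	"github.com/containerd/containerd"
	"github.com/containerd/containerd/namespaces"
)

func main() {
	// Connect to the containerd daemon over its default unix socket.
	client, err := containerd.New("/run/containerd/containerd.sock")
	if err != nil {
		panic(err)
	}
	defer client.Close()

	// containerd scopes all resources to a namespace; "example" is arbitrary.
	ctx := namespaces.WithNamespace(context.Background(), "example")

	// Pull an OCI image, much like the kubelet asks its runtime to do via the CRI.
	image, err := client.Pull(ctx, "docker.io/library/alpine:latest", containerd.WithPullUnpack)
	if err != nil {
		panic(err)
	}
	fmt.Println("pulled", image.Name())
}
```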
Projects central to the container experience began to join the Cloud Native Computing Foundation. While it supports many popular open source DevOps related projects, much of its governing board and technical oversight committee is corporate oriented. Kubernetes itself also migrated its download hosting from Google-owned infrastructure to a more open setup that allows additional mirrors to be added.
The Future?
While Kubernetes is in what I consider to be its prime right now, like all technologies there's certainly room for something new to take its place. That's especially true given how complex it is, even when run in a managed environment. A nudge in the right direction would be managed services that offer purpose-built configurations for specific use cases. Centralizing monitoring, security, and performance data along with basic administration would also help substantially.
If you like what you see here, I'm currently available for remote full-time work. Check my profile for links.