Warnings:
1. Some concepts in this article are explained like it's meant for 5-year-olds. If you're already an advanced user, this blog post is not for you.
2. If you want to see some code, scroll to the How subheading of this blog post. If everything there seems foreign to you or you don't get the point, you can come back to the very beginning.
For many people, Docker and Kubernetes are tools they learn halfway and then drop, never to touch again after finding alternatives. Others learn them but never get to use them at work because a DevOps/platform team handles that side of things. Then, once in a while, these tools pop up again and they're expected to pick them back up. Usually, that means relearning everything from scratch.
Why Do Docker and Kubernetes Never Stick For Many People?
Why are Docker and Kubernetes (especially Kubernetes) so tricky for many people to become proficient in? Why is it so easy to forget everything once you stop using them?
As humans, we learn best through association. It is easy to remember new things when you can associate them with a simpler but related concept. Your entire knowledge base is a tree, and new learnings need to be attached to existing branches of that tree. The reason people forget Docker and Kubernetes (k8s) quickly is that they encapsulate the major pillars of computer systems: operating systems, programs (running in userland), and computer networking. My aim in this series is not to gloss over these fundamental topics but to go as in-depth as necessary. Understanding those basic concepts will increase your chances of not having to relearn everything from scratch each time you return to these tools.
The Whats
What is
A Container
A container is a basic unit of deployment. That's vague, so let's rephrase it. A container is your program coupled with a virtualized operating system alongside every other dependency needed to run it. The virtualized operating system part of that definition is essential.
A Container Image:
A container image is "an executable" that a container engine can run.
A Container Engine/Container Runtime:
A container engine is a program that knows how to build and execute a container image.
Docker
Docker is a container technology company. The container engine they created is called the Docker Engine; a container image built with the Docker Engine is called a Docker image, and a running Docker image is called a Docker container.
A Dockerfile
A Dockerfile contains the instructions describing what a Docker image should contain.
Let's Relearn All The Concepts Above By Associating Them With What We Already Know, From The Bottom Back To The Top
I assume anyone reading this article has written a program at some point, so let's compare the process of creating a program to creating a container.
- To write a C program, you must first write some instructions in a .c file.
- A Dockerfile is the source code that contains instructions about what a Docker image should have, which is usually a description of the OS your program is built to run on, the dependencies of your program, your program itself, and an instruction on how to start it.
- To transform your C source file into something your machine can understand, you must build the .c file with a C compiler.
- To transform your Dockerfile into a Docker image, you must "compile" it with the "docker image builder." The builder comes with the Docker Engine, and you have access to it through the Docker CLI.
- To run the C executable, you can either open a command-line interface and call the executable directly or, if you have access to a GUI, double-click on the executable.
- To run the Docker image, you run it through the Docker CLI. The commands right after this list sketch the two workflows side by side.
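To make the analogy concrete, here is a rough side-by-side sketch of the two workflows. The file names and the node-echo image name are placeholders of my own; the actual Dockerfile and commands used in this post appear under the How subheading below.

# C workflow: source file -> compiler -> executable -> run
gcc main.c -o hello           # build the .c file with a C compiler
./hello                       # run the resulting executable

# Docker workflow: Dockerfile -> image builder -> image -> container
docker build -t node-echo .   # "compile" the Dockerfile into a Docker image
docker run node-echo          # run the image as a container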
The Why
I already know how to run a program on a computer; why do I also need to run it in a container?
Yes, you could rent a physical machine and run your web app on it. Well, these days, cloud companies will only give you that deal if you're Facebook or Netflix. If you're a regular Joe like me who wants to run a portfolio website, you won't get a contract to run it on a physical server.
Yes, I know this already, but these businesses can still run multiple programs on a single machine.
Yes, they can. There is already a business model for this. It's called shared hosting. In this hosting model, the provider runs multiple customers' programs on a single machine. This model is more cost-effective for both you (the customer) and the service provider, but it is not suitable for fast-growing startups. One downside to this approach is that websites with heavy traffic might eat up the shared server's resources (CPU and memory), leaving your website with little to nothing.
Another issue that arises in shared hosting is security. A customer running a malicious program on the same server as yours can infect your website too. Of course, these businesses take measures against worst-case scenarios like this, but it requires the expertise of specialized system admins.
Programs running in containers, on the other hand, don't have this problem by default, because a container is an isolated environment, and resource limits can easily be assigned to a container as part of its startup instructions, as the example below shows.
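For example, with Docker you can cap a container's CPU and memory right in its startup command. The flags below are standard docker run options; node-echo is just the example image we build later in this post.

# Cap the container at half a CPU core and 256 MB of RAM
docker run --cpus="0.5" --memory="256m" node-echo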
Talking About Isolation and Resource Allocation, Isn't That Achievable Through Virtual Machines? Why Do I Still Need Containers?
Yes, there is also already a business model for that. Today, you can rent a virtual machine on any cloud platform. They give these things different marketing names, like EC2 on AWS or Compute Engine on Google Cloud. Irrespective of what cloud platforms call it, you get a virtual machine running some open-source or proprietary distribution of the Linux operating system.
So, if virtual machines solve the problem with shared hosting, why do you still need containers?
Well, businesses are always trying to do two things: make more money and spend less money. Virtual machines virtualize the hardware, so for every virtual machine running on a physical machine, at the very minimum, a virtual CPU, virtual memory, a virtual hard drive, and a virtual ROM containing a virtual bootloader have to be created. The virtual machine also boots up a full-fledged operating system just to run a Hello World program. And for every client hosting software on a physical machine, a new virtual machine has to be provisioned.
All of this leaves a huge resource footprint on the host machine. So, in 2006, Google added more features to an existing Linux kernel feature called namespaces.
Additionally, they implemented something called cgroups. Those two features enable system administrators to run programs in isolation without hardware virtualization, and that work serves as the foundation for container technologies. I will dive deeper into cgroups and namespaces later in this series, but the snippet below gives a small taste of namespaces in action.
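As a tiny illustration (my own sketch, assuming a Linux machine with the unshare tool from util-linux and root privileges), you can drop a shell into new PID and mount namespaces and watch it lose sight of every other process on the machine:

# Start a shell in new PID and mount namespaces; inside it, the shell believes it is PID 1
sudo unshare --pid --fork --mount-proc /bin/sh
# ps aux    <- run this inside the new shell: only a couple of processes are visible
# exit      <- leave the namespaced shell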
So, with containers, cloud service providers don't need to provision a virtual machine for each customer. With the help of container technologies taking advantage of namespaces and cgroups, you can run your programs in isolated environments with a much lower resource footprint. Additionally, since containers are just programs with limited access to system resources, they have a faster startup time than virtual machines. This kind of efficiency translates to substantial cost savings on hardware for cloud service providers and, at the same time, allows them to offer cheaper products to you, the customer.
Follow the money, my friends; it will lead you to the answers to most modern tech questions. 🙊
Do Container Technologies Solve the "But It Works on My Machine" Problem?
From my experience, yes, they can, but it takes some proficiency to containerize your application such that it works the same way everywhere outside your machine. In short, dockerizing your application does not automatically solve that problem.
The How
Talk is cheap; show me some code.
Prerequisite: Make sure you have Docker Desktop installed.
Step 1: Let's create a simple echo server in Node.js
// index.js
const http = require('http')

http.createServer((req, res) => {
  const { method } = req;

  // Respond to GET requests with an empty 200 response
  if (method.toLowerCase() === 'get') {
    res.statusCode = 200
    return res.end()
  }

  // For any other method (e.g. POST), buffer the request body and echo it back
  const chunks = []
  req.on('data', (chunk) => {
    chunks.push(chunk)
  })
  .on('end', () => {
    res.statusCode = 200
    return res.end(Buffer.concat(chunks).toString());
  })
})
.listen(5001, () => {
  console.log('started server at localhost:5001')
});
To test that our echo server works, run node index.js and then, in another terminal, run curl -d 'hello world' localhost:5001. The server should echo "hello world" back to you.
Step 2: Write the requirements of your Docker image in a Dockerfile
1 FROM node:20.9.0-alpine3.18
2
3 WORKDIR /web
4 COPY ./index.js ./index.js
5
6 ENTRYPOINT ["node," "./index.js"]
Starting from line one:
- We instruct Docker to pull Node.js 20.9 from Docker Hub. If "Docker Hub" sounds foreign, think of it as GitHub for Docker images. That Node.js image is the base (or foundation) on which we build our application image. A Dockerfile always starts with the FROM instruction.
- Next, we define the default working directory of the container. The working directory is the directory your container starts in, similar to the $HOME directory in the Linux operating system or the ~ directory on Mac (there is a quick check of this right after this list).
- Next, we copy our source code into the Docker image.
- And finally, on line six, we tell Docker the command to run when our application container starts up.
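If you want to see the WORKDIR instruction in effect, here is an optional check of my own (it only works once you've built the image in Step 3): override the image's entrypoint with pwd and the container prints its working directory.

# Prints /web, the working directory set by WORKDIR
docker run --rm --entrypoint pwd node-echo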
Step 3: Build the image using the Dockerfile we just created by running the following command:
docker build -f Dockerfile -t node-echo .
The -f argument specifies where the Dockerfile is located, in this case the Dockerfile in the current directory.
The . specifies the execution context (the build context) of the build process, i.e., the folder where the source code (or executable) of your application lives. The command above assumes you're running "docker build" in the same folder as your source code.
The -t argument assigns a name to the resulting image.
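A small aside of my own (standard Docker CLI behavior, not something specific to this project): when the file is literally named Dockerfile and sits in the build context, the -f flag can be omitted, so this shorter form builds the same image.

docker build -t node-echo .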
To see information about the resulting image, run docker images
The output should be similar to what is in the screenshot below
Step 4: Create a running container with the image
docker run -t -i -p 5001:5001 node-echo
Breaking that command down,
- The -t flag allocates a pseudo-tty to the Docker container. Assigning a pseudo-tty is a "Linuxy" way of saying, "We allow our program, in this case our container, to accept input from the keyboard."
- The -i flag instructs Docker to keep the container's stdin open. For beginners, stdin is where your keyboard input goes.
- The -p 5001:5001 flag publishes port 5001 on the container to port 5001 on the host's operating system. That's a fancy way of saying we want users to be able to send HTTP requests to port 5001 in the container via port 5001 on the host operating system. This part is essential because, by default, everything running in a container, including the container's network configuration, is isolated from the host machine. To expose anything inside the container to the host machine, we need to let the Docker Engine know we intend to expose it.
- node-echo is the Docker image we are executing.
After running that command, our echo server should start up quickly. Sending a request to our server like so, curl -d "hello" localhost:5001, should send the word hello back to us.
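If you'd rather not keep a terminal attached to the server, a common variation (standard Docker CLI usage, not from the walkthrough above) is to run the container in detached mode with a name and stop it when you're done:

# Run the container in the background and give it a name
docker run -d --name node-echo-test -p 5001:5001 node-echo
# List running containers, then stop and remove the test container
docker ps
docker stop node-echo-test
docker rm node-echo-test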
Edit 1/10/2023
You can play with the full source code here
Kubernetes
Kubernetes is a container orchestration tool for automating the deployment and management of containers. In the modern web, you usually don't deploy just one container; you deploy tens to hundreds of them, and they have to work in a coordinated manner. That coordination is the job of Kubernetes. I know those words may not mean much if you're a beginner, but this blog post is already too long, so I'll pick up from here in the next part of this series.
Summary
In this first part of the series, we learned what a container is, why we need containers, the problems container technologies solve, and how to create a simple Node.js server container using Docker. We also introduced Kubernetes. We have just scratched the surface of this subject, so stay tuned for more parts.