One day you get to work.
All your buddies are talking about this weird magical tech called "Docker".
You literally have no clue what the hell it is.
But you're too scared to ask dumb questions.
There are 2 problems here:
- You're scared to ask questions
- You don't know what the hell they are talking about.
Well, you'll need some sort of inner realization to fix the first one, but the second one is on me.
In this post, I will elegantly condense 30 hours of research into one blog post. No need to worry; it turns out it's a lot simpler than I thought.
- A little bit about history
- Coming Back To Tech
- What can I put into containers?
- What is docker?
- Why do you need it?
- Why it works? (Separation of concerns)
- A Little Bit About Operating Systems
- But what about VM's? (Virtual Machines)
- Docker architecture
- Additional Resources
Imagine this: you recently opened a coffee bean business, and as it turns out, your beans are the best on the market. You get your shop set up and start selling those beans. Sales are good, and eventually you start receiving orders from different cities. You start thinking about how to deliver those beans to your customers, and then it hits you.
Step 1: Get a bunch of trucks 🚚🚚
Step 2: Deliver ☕☕
Step 3: Profit 💰💰
A year passes by and your beans become more and more popular. You get your first order from a different country, but this time trucks won't work. They take too long and the road is bumpy, which may hurt the quality of your beans. You think once again... Then it hits you.
Step 1: Get a bunch of planes ✈️✈️
Step 2: Deliver ☕☕
Step 3: Profit 💰💰
A couple of years pass by, and your beans are now the most popular in the world. They are exported to every continent and country, using all sorts of transportation methods: trucks, planes, ships, cars, helicopters, etc.
You might think all is nice and dandy, but with so many methods of transportation it becomes very hard to guarantee that your coffee beans won't be ruined in all these different environments. It's a huge expense and headache for your business.
Great, now you have a new problem. So, as always, you think, and think, and think some more, until one day the "AHAAAA" moment hits and you figure it out.
Put a bunch of bean bags in a shipping container. Something like this:
Using this method, you can simply store your coffee beans in a container that guarantees their safety no matter what. Be it a ship, plane, car, or truck, it doesn't matter: they will always be safe.
Don't worry this won't take long.
The problem we just had with our coffee beans was a very common problem in the pre-60s. People worried about bad interactions between different types of cargo (e.g. if a shipment of anvils fell on a sack of bananas). Similarly, transitions between different modes of transport were painful. Up to half the time to ship something could be taken up as ships were unloaded and reloaded in ports, and in waiting for the same shipment to get reloaded onto trains, trucks, etc. Along the way, losses due to damage and theft were large. And there was a huge N×M matrix between the multiplicity of different goods and the multiplicity of different transport mechanisms.
Fortunately, an answer was found in the form of a standard shipping container. Any type of goods, from pistachios to Porsches, can be packaged inside a standard shipping container. The container can then be sealed, and not re-opened until it reaches its final destination. In between, the containers can be loaded and unloaded, stacked, transported, and efficiently moved over long distances. The transfer from ship to gantry crane to train to truck can be automated, without requiring a modification of the container. This piece of technology revolutionized transportation and world trade. Today, 18 million standard containers carry 90% of world trade.
It can be said that in tech we had the same kind of problem, because in the past the usual tech stack was very simple.
Deploying it didn't involve going to hell and back. But as time progressed, our apps became a lot more complicated. Nowadays an app is rarely a monolith: there may be a frontend and a separate backend, and you might even have microservices, with each service running on a separate server. Essentially, what I'm saying is that:
Shipping code is so damn hard
There are a million things that can go wrong:
- Your customer's OS is not compatible with the software.
- The server has missing dependencies.
- There are some bugs that can only be found on a specific OS.
- Two libraries have conflicting dependencies.
I could go on but for the sake of your time I will stop here.
Our earlier problem with the coffee beans translates directly to tech.
This problem is so common, both in tech and outside it, that it has its own special name:
The Matrix of Hell
In this example we have a huge product with many different components:
- Static Website
- Web Frontend
- Web Backend
- User Database
- Analytics DB
- Background Workers
You have to make sure they all work on different platforms and are all compatible with each other. So every time you make a small change to the dependencies, you have to re-test everything to make sure you didn't cause any bad side effects.
The QA engineer will have to test each block separately and make sure they all work, and after any dependency change they will have to retest everything. As you may assume, this comes with many problems, including:
- Expensive: it's very expensive to maintain this kind of model, because both the development and operations teams have to spend a lot of time fixing conflicts, and more time equals more money wasted.
- Development hell: imagine one simple change breaking something you never meant to touch, on some platform you don't even use. As a developer, you develop a fear of making changes, and with that comes very slow progress on the product.
- Slow development: Expensive + Development Hell = Never Finishing
You get it, it's very bad.
So once again the council of software developers came together to find a solution, with the slogan:
Write Once, Run Anywhere
Then finally, in 2001, a guy named Jacques Gélinas created the Linux-VServer project, one of the earliest versions of the so-called container.
Containers are essentially an entity that contains your application and all of its necessary dependencies.
So containers are already quite old technology. Where does Docker come in?
Well, the thing is that containers are very hard to create by hand, and this is where Docker comes in: Docker is essentially an abstraction that helps us create and manage containers.
Basically, we have solved our problem (matrix of hell) by using containers.
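To make "an application and all its dependencies" concrete, here is a minimal sketch of a Dockerfile, the file Docker uses to describe a container image. The file names, base image tag, and port are assumptions for illustration, not something from a real project:

```dockerfile
# Base image that already contains the Node.js runtime (tag is illustrative)
FROM node:18

# Copy the app into the image and install its dependencies there
WORKDIR /app
COPY package.json ./
RUN npm install
COPY . .

# Document the port the app listens on, and define the start command
EXPOSE 3000
CMD ["node", "server.js"]
```

Everything the app needs (runtime, libraries, code) now travels together inside the image, which is exactly the shipping-container idea.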
Short Answer: Almost Everything
To run any application, you generally need:
- A server or physical machine on which we can run our application.
- An environment setup on that server/machine. For example, if you want to run a PHP application, you might use LAMP (Linux, Apache, MySQL, PHP).
- Finally, you need the source code of your application to run in that environment, which you can also store on GitHub.
Different tech stacks require different environments: a Java application needs the JRE, a JavaScript application needs Node.js, and so on.
You may also need a specific version of that environment, depending on what your application is compatible with.
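In Docker, choosing the environment and its version usually comes down to choosing a base image and tag. These are alternative first lines of a Dockerfile, one per tech stack (the tags are illustrative; check Docker Hub for current ones):

```dockerfile
FROM node:18          # JavaScript app: Node.js runtime
FROM openjdk:17       # Java app: the JRE/JDK
FROM php:8.2-apache   # PHP app: PHP bundled with Apache
FROM node:16          # or pin an older version if that's what your app supports
```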
As stated above, Docker is simply a tool that makes it easier for us to create and manage containers. But let's go over some more interesting details.
If you google docker you might get something like this.
Docker is a DevOps tool that automates the development, deployment, and running of applications inside isolated containers.
Docker was created by two guys named Solomon Hykes and Sebastien Pahl; it actually started as an internal project at dotCloud.
Work on it began around 2010, and it was released publicly in 2013.
It is written in Go (Golang).
Nowadays docker comes with the Docker Platform which consists of multiple products:
- Docker Engine
- Docker Images
- Docker Containers
- Docker Hub
- Docker Machine
- Docker Compose
- Docker Swarm
- Docker for Windows/Mac
Docker helps us create and manage containers, which lets us deploy our applications much faster, but there are other use cases where Docker can benefit us.
- Faster onboarding: usually when a new developer joins a company, it takes them some time to set up their development environment. This can take up to a day and often involves bugs that have to be fixed along the way. Some developers prefer a specific OS that may be incompatible with the company's product. Instead, you can keep a Docker template for your company's product that new developers simply download, and they are up and running within 5 to 20 minutes.
- Environments: you can run different environments as containers on one server. For example, you can have one local container where you play around, one test container where your QA plays around, and finally one prod container where your customers play around.
- But it works on my machine???: if you've ever worked professionally, I'm sure you've experienced the dilemma of "it works on my machine???". This usually happens because the environment on your local machine and the environment on your product's server are different. Docker fixes this problem by putting your application in a container that runs the same everywhere, exactly as it does on your local machine.
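The "works on my machine" fix can be sketched as a command sequence (the image name is a placeholder, and this assumes Docker is installed on both machines):

```shell
# On your machine: build the image once
docker build -t mycompany/myapp:1.0 .

# On any other machine with Docker: run the exact same image,
# with the exact same dependencies baked in
docker run -d -p 8080:8080 mycompany/myapp:1.0
```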
Let's just accept this fact.
Humans are terrible at multi-tasking, and keeping a lot of information in their brain.
In the pre-Docker days, the development team would create a project and then send it (as a git repo) to the operations team, who would handle the deployment. But unfortunately, the operations team doesn't know the product that well, nor the technologies behind it. So the operations team goes back to the development team and asks for instructions on how to run the damn thing. The development team tells them they have to install some dependencies, then run some commands to install some stuff.
It's very frustrating.
This should not happen.
The development team should only worry about the product.
The operations team should only worry about deployment and uptime.
Well, this is all in the past now: developers put their application in a container, and the operations team simply takes the container, not giving a single shit about what's in there as long as it works.
Now both teams are happy.
Developers do only development work; they focus more on development, which in turn leads to higher-quality software, and the operations team only worries about maintaining the container.
This is a prime example of separation of concerns and how it benefits us.
But before we move on with Docker, we must first talk about operating systems.
Quick Disclaimer: I will mainly be talking about Linux operating systems.
Linux operating systems come in two parts:
- A set of software
- OS Kernel (Linux Kernel)
Operating systems such as Ubuntu, Fedora, Arch, and Red Hat are essentially all the same. They all share the same OS kernel (the Linux kernel); what makes each one unique is its different set of software. Ubuntu may have some special UI or file system that Fedora doesn't.
The key takeaways (they will help in the next section):
- All Linux OS's share the same Linux kernel.
- What makes them each unique is their different set of software.
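You can see this kernel sharing for yourself. On a Linux host with Docker installed, a container based on a different distro still reports the host's kernel version:

```shell
uname -r                          # kernel version on the host
docker run --rm ubuntu uname -r   # same version, printed from inside an Ubuntu container
```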
You might say "Why don't we simply use virtual machines instead of containers?"
This is a very valid question, but the short answer is that it's not very practical to use virtual machines for this.
So let's explore the differences between containers and virtual machines.
Containers are by far much smaller than virtual machines. The main reason is that containers share the host's operating system, while each virtual machine has its own standalone OS.
Containers are also much faster than virtual machines, because there's no OS to boot.
I don't want you to think it's only one or the other, the majority of the time in big corporations virtual machines and containers are used together.
Docker comes in three parts:
- Docker Client
- Docker Host
- Docker Registry
Let's break these down one by one.
The Docker client is a command-line interface (CLI) that allows the user to interact with the Docker daemon. It greatly simplifies how you manage container instances and is one of the key reasons developers love using Docker.
The client can reside on the same host as the daemon or connect to a daemon on a remote host, and a single client can communicate with more than one daemon. Through the CLI, you issue build, run, and stop commands to a Docker daemon.
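A few everyday client commands look like this (a sketch; `myapp` is a placeholder image name):

```shell
docker build -t myapp .       # ask the daemon to build an image from a Dockerfile
docker run -d myapp           # ask the daemon to start a container from that image
docker ps                     # list running containers
docker stop <container-id>    # stop a running container
```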
The Docker host is actually a group of things:
- Docker Daemon
- Docker Images
- Docker Containers
- Docker Networks
- Docker Storage
Let's once again break these down:
The background service running on the host that manages building, running, and distributing Docker containers.
Images are basically templates for containers; an image has instructions describing what kind of container it's going to be. A good analogy: think of an image as a recipe and a container as a cake. You can make as many cakes as you want from one recipe; likewise, you can create as many containers as you want from a single image.
A container is a runtime object, or representation, of an image; in other words, an instance of an image.
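The recipe/cake analogy maps directly onto the CLI: build the image (recipe) once, then run as many containers (cakes) as you like from it. The names below are placeholders:

```shell
docker build -t cake-recipe .            # build one image
docker run -d --name cake1 cake-recipe   # first container from it
docker run -d --name cake2 cake-recipe   # second container, same image
```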
Docker implements networking in an application-driven manner and provides various options while maintaining abstractions for non-network engineers.
There are two types of networks available:
- Default Docker Network
- User-Defined Network
By default, you get three different networks on the installation of Docker – none, bridge, and host.
The none and host networks are part of the network stack in Docker.
The bridge network automatically creates a gateway and IP subnet, and all containers that belong to this network can talk to each other via IP addresses. Basically, it allows multiple containers on the same host to talk to each other over a network. This default network is not commonly used in production, as it does not scale well and has constraints in terms of network usability and service discovery.
The other type is the user-defined network. Administrators can configure multiple user-defined networks. There are three types:
- Bridge network: Similar to the default bridge network, a user-defined Bridge network differs in that there is no need for port forwarding for containers within the network to communicate with each other. The other difference is that it has full support for automatic network discovery.
- Overlay network: An Overlay network is used when you need containers on separate hosts to be able to communicate with each other, as in the case of a distributed network. However, a caveat is that swarm mode must be enabled for a cluster of Docker engines, known as a swarm, to be able to join the same group.
- Macvlan network: When using Bridge and Overlay networks a bridge resides between the container and the host. A Macvlan network removes this bridge, providing the benefit of exposing container resources to external networks without dealing with port forwarding. This is realized by using MAC addresses instead of IP addresses.
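Creating and using a user-defined bridge network is a short command sequence (the container and network names here are made up for illustration):

```shell
docker network create my-bridge                        # user-defined bridge network
docker run -d --name db  --network my-bridge postgres  # attach a database container
docker run -d --name app --network my-bridge myapp     # attach the app container
# "app" can now reach "db" by name, thanks to automatic service discovery
```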
You can store data within the writable layer of a container, but it requires a storage driver. This data is non-persistent: it perishes when the container is removed, and it is not easy to transfer. For persistent storage, Docker offers four options:
- Data Volumes: storage that persists independently of containers and can be shared among multiple containers.
- Data Volume Containers: a dedicated container that hosts a volume, which other containers can then mount.
- Directory Mounts: simply share a local directory from the host inside a container.
- Storage Plugins: plugins that provide the ability to connect to external storage platforms (Google Drive, Microsoft Azure, etc.). They map storage from the host to an external source like a storage array or an appliance. A list of storage plugins can be found on Docker's Plugins page.
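The first three storage options roughly correspond to these commands (the volume, image, and path names are illustrative):

```shell
docker volume create app-data               # a named, persistent data volume
docker run -v app-data:/var/lib/data myapp  # mount the volume into a container
docker run -v "$(pwd)/src:/app/src" myapp   # directory mount straight from the host
```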
Docker registries are services that provide locations where you can store and download images. In other words, a Docker registry contains Docker repositories, which host one or more Docker images. Public registries include Docker Hub and Docker Cloud, and private registries can also be used.
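Working with a registry is mostly pull, tag, and push (the image and account names are placeholders; pushing requires an account on the registry):

```shell
docker pull nginx                   # download an image from Docker Hub
docker tag myapp myuser/myapp:1.0   # name your image under your repository
docker push myuser/myapp:1.0        # upload it to the registry
```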
In conclusion, today we have learned what Docker is, why it's important, and how it works at a high level. If you have any questions, feel free to leave them in the comments section below.
- And so much more because I wasn't keeping track.