Emmanuel Gatwech

Originally published at eman.hashnode.dev

Understanding Docker and Containers

Introduction

Docker is an open-source platform for managing application deployments. It's increasingly popular in organizations that need to deploy applications efficiently and securely in the cloud.

If you're new to Docker, it can feel daunting: figuring out where to start, which concepts are essential, and how they all fit together. This article will help you gain a practical understanding of Docker's most commonly used features. A solid grasp of these concepts will help you avoid common mistakes and use Docker more efficiently. Let's get started!

Containerization

Containerization is the process of packaging an application and its dependencies (such as system libraries, config files, etc.) into a single unit known as a container.

The main benefit of containerizing your applications is that you can run them anywhere (on a server, laptop, etc.), regardless of the underlying operating system.

The other benefit of containerizing your applications is speed of development: because a container brings its dependencies with it, you can build and ship applications quickly without worrying about configuring the operating system for each environment.

Containers

A container is a small independent program that runs in its own environment. Containers are lightweight and easy to create, modify, and run. With containers, you can move applications between environments more quickly and flexibly than traditional virtual machines allow.

Containers are self-contained environments. You can think of a container as a sandbox separate from the host operating system: it has its own filesystem, processes, and network interfaces, and it can run on any machine with a container runtime installed.

At their core, containers are nothing more than a combination of a few Linux kernel features: chroot, namespaces, and control groups (cgroups). We'll look at each of these below.

Why Containers?

Containers are a great way to package and deploy your software. In this blog post, I'll discuss the merits of containers for deployment and best practices for using them on modern Linux infrastructure.

Okay, so now at least you have a pool of servers responding to web traffic. Now you just have to worry about keeping the operating system up to date. Oh, and all the drivers connecting to the hardware. And all the software running on the server. And replacing the components of your server as new ones come out. Or maybe the whole server. And fixing failed components. And network issues. And running cables. And your power bill. And who has physical access to your server room. And the actual temperature of the data center. And paying a ridiculous Internet bill. You get the point. Managing your own servers is hard and requires a whole team to do it.

Isolation

The first of these kernel features is chroot. I've heard people pronounce it "cha-root" and "change root"; I'm going to stick with "change root" because I feel less ridiculous saying that. It's a Linux command that allows you to set the root directory of a new process. In our container use case, we just set the root directory to be where the new container's root directory should be, and the new group of processes can't see anything outside of it. That eliminates part of our security problem, because the new processes have no visibility outside of their new root.
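
Here's a minimal sketch of that on a Linux machine (the /my-new-root path is made up for illustration, and you'll need root privileges):

```bash
# Create a folder to act as the new root, with just enough inside to run bash.
mkdir -p /my-new-root/bin /my-new-root/lib /my-new-root/lib64
cp /bin/bash /my-new-root/bin/

# bash needs its shared libraries too; ldd lists them so you can copy each
# one into the matching lib/lib64 paths under /my-new-root.
ldd /bin/bash

# With the libraries in place, change root into the new environment.
chroot /my-new-root bash
```

Inside that shell, `ls /` shows only what you copied in; the rest of the host filesystem is invisible.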

Portability

Suppose you're migrating an application from one environment to another, or need to run multiple versions of the same app side by side on different servers. In that case, using containers makes sense because you can run each app inside its own container without requiring code changes or worrying about configuring the environment.

Security

Let's say you're running a big server in your home and you're selling space to other people (whom you don't know) to run their code on your server. What sort of concerns would you have? Say you have Alice and Bob, who are running e-commerce services dealing with lots of money. They're good citizens of the server and mind their own business. But then Eve joins the server with other intentions: she wants to steal money, source code, and whatever else she can get her hands on from your other tenants. If you just gave all three of them root access to the server, what's to stop Eve from taking everything? Or what if she just wants to disrupt their businesses, even if she's not stealing anything?

Your first line of defense is that you could log them into chroot'd environments and limit them to only those. Great! Now they can't see each other's files. Problem solved? Well, no, not quite yet. Even though Eve can't see the files, she can still see all the processes running on the computer. She can kill processes, unmount filesystems, and potentially even hijack processes.

Enter namespaces. Namespaces allow you to hide processes from other processes. If we give each chroot'd environment a different set of namespaces, now Alice, Bob, and Eve can't see each other's processes (they even get different PIDs, or process IDs, so they can't guess what the others have), and you can't steal or hijack what you can't see!

There's a lot more depth to namespaces beyond what I've outlined here. The above describes just the PID namespace; there are several others (mount, network, UTS, IPC, user, and so on), and together they help these containers stay isolated from each other.

Namespaces

So let's create a chroot'd environment that's isolated using namespaces, with a new command: unshare. unshare launches a process in a new set of namespaces, isolated from its parent (so you, the server provider, can't spy on Alice or Bob either) and from all other future tenants.
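
Run something like the following, a sketch that reuses the /my-new-root environment from the chroot section (again as root):

```bash
# Start a shell in fresh mount, UTS, IPC, network, and PID namespaces,
# chroot'd into the isolated root we prepared earlier.
unshare --mount --uts --ipc --net --pid --fork chroot /my-new-root bash
```

(Inside, you may need to remount /proc before tools like ps reflect the new PID namespace.)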

Resource Management
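
Namespaces control what a group of processes can see; control groups (cgroups), the third kernel feature, limit how much they can use. With cgroups you can cap a container at a certain share of CPU, memory, and other resources, so one noisy tenant can't starve everyone else on the server. Docker exposes this through flags on docker run; here's a quick sketch (the image and the exact limits are just examples):

```bash
# Cap the container at half a CPU core and 256 MB of memory via cgroups.
docker run -it --cpus="0.5" --memory="256m" alpine sh
```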

What is Docker?

Docker is an open-source project that provides tools for creating and managing application containers. It lets users package their applications into images, from which containers are run, and it provides an API for interacting with those images and containers. (On macOS and Windows, Docker runs containers inside a lightweight Linux virtual machine, since the kernel features above are Linux-specific.)

Docker is a tool for building, running, and deploying containers.

tl;dr: Docker does a lot more for you than just this (networking, volumes, and other things), but suffice it to say, this is the core of what Docker is doing for you: creating a new environment that's isolated by namespaces, limited by cgroups, and chroot'd into its own filesystem.

Images

These pre-made containers are called images. An image basically dumps out the state of a container, packages it up, and stores it so you can use it later. So let's go grab one of these images and run it! We're going to do it first without Docker, to show you that you actually already know what's going on.
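
Here's a rough sketch of that idea on a Linux host (we use Docker only to download and unpack the image, never to run it; the names are illustrative):

```bash
# Create (but don't start) a container from the Alpine image, then dump
# its filesystem out as a tarball.
docker create --name throwaway alpine:3.10
docker export -o alpine.tar throwaway

# Unpack the filesystem and chroot into it: no Docker involved in running it.
mkdir alpine-root
tar xf alpine.tar -C alpine-root
chroot alpine-root /bin/sh
```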

Building Images

A Quick Note on COPY vs ADD
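
In short: both copy files from the build context into the image, but ADD can also fetch remote URLs and automatically extract local tar archives. Because that extra magic is easy to trigger by accident, the usual advice is to prefer COPY unless you specifically need ADD's behavior. With that out of the way, here's our Dockerfile: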

```dockerfile
FROM node:12-stretch
USER node
WORKDIR /home/node/code
COPY --chown=node:node index.js .
CMD ["node", "index.js"]
```

This is something very powerful about Docker: you can use images to build other images and build on the work of others.

Layers
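
Each instruction in a Dockerfile produces a layer, and Docker caches layers between builds: if nothing an instruction depends on has changed, the cached layer is reused instead of being rebuilt. That's why Dockerfiles typically copy the dependency manifest and install dependencies before copying the rest of the source, so everyday code changes don't invalidate the expensive install layer. A sketch of that ordering (assuming an app with a package.json, which our one-file example above doesn't have):

```dockerfile
FROM node:12-stretch
WORKDIR /app
# Copy only the manifest first so the install layer stays cached
# across source-code changes.
COPY package*.json ./
RUN npm ci
# Now copy the rest of the source; edits here don't invalidate the
# dependency layer above.
COPY . .
CMD ["node", "index.js"]
```
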
Multi-Stage Builds

Hey, we're already halfway to ridiculous, so let's make our image EVEN SMALLER. Technically we only need npm to build our app, right? We don't actually need it to run our app. Docker allows you to have what it calls multi-stage builds, where it uses one container to build your app and another to run it. This can be useful if you have big dependencies to build your app but don't need those dependencies to actually run it. A C++ or Rust app might be a good example: they need big toolchains to compile, but the resulting binaries are smaller and don't need those tools to run. Or, perhaps more applicable to you: you don't need the TypeScript or Sass compiler in production, just the compiled files. We'll actually do that here in a sec, but let's start with eliminating npm.
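
Here's a hedged sketch of what that can look like, extending the Dockerfile above (the stage name and the package*.json lines are assumptions for an app that has dependencies):

```dockerfile
# Stage 1: build with the full node image, which includes npm.
FROM node:12-stretch AS build
WORKDIR /build
COPY package*.json ./
RUN npm ci
COPY index.js .

# Stage 2: run with a much smaller image; npm never comes along.
FROM node:12-alpine
USER node
WORKDIR /home/node/code
COPY --from=build --chown=node:node /build .
CMD ["node", "index.js"]
```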

Docker Hub
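
Docker Hub is the default public registry: when you docker pull an image (or reference one in a FROM line) without naming a registry, Docker Hub is where it comes from, and docker push is how you publish your own images there. A quick sketch (username/my-app is a placeholder for your own account and image name, and assumes a local image called my-app):

```bash
docker pull node:12-stretch         # download an image from Docker Hub
docker tag my-app username/my-app   # rename the image under your account
docker push username/my-app         # publish it (requires 'docker login')
```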

Networking

If you've been using Docker for a while, you've probably noticed that containers rarely run alone: they need to talk to each other and to the outside world.

One of the reasons Docker containers and services are so powerful is that you can connect them or connect them to non-Docker workloads. Docker containers and services do not even need to be aware that they are deployed on Docker or whether their peers are also Docker workloads or not. Whether your Docker hosts run Linux, Windows, or a mix of the two, you can use Docker to manage them in a platform-agnostic way.

This section covers some basic Docker networking concepts and prepares you to design and deploy your applications to take full advantage of these capabilities.

Networking in Docker provides several benefits:

  • It enables a more straightforward way of provisioning applications on top of existing infrastructure, rather than creating new infrastructure from scratch.

  • It helps simplify your development workflows by removing the need to set up networks manually every time you deploy changes, whether through CI/CD pipelines or manual deployment processes.
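
Docker's building block for this is the user-defined network. A minimal sketch of connecting two containers (the network and container names are made up):

```bash
# Create a user-defined bridge network; containers attached to it can
# reach each other by container name through Docker's built-in DNS.
docker network create app-net

# Run a web server and a throwaway client on the same network.
docker run -d --name my-server --network app-net nginx
docker run -it --rm --network app-net alpine ping -c 2 my-server
```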

Data Persistence

So far, we have learned that containers are ephemeral, temporary, and disposable. Docker has a few features that allow us to make containers stateful: bind mounts and volumes.

Bind Mounts
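
A bind mount maps a file or directory on the host directly into the container, so both sides see the same files in real time, which is especially handy during development. A sketch (the paths and image are examples):

```bash
# Mount the current directory into the container; edits on the host are
# immediately visible inside, and vice versa.
docker run -it --mount type=bind,source="$(pwd)",target=/home/node/code node:12-stretch bash
```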

Volumes

Bind mounts are great for when you need to share data between your host and your container, as we just learned. Volumes, on the other hand, are for letting your containers maintain state between runs. If a container needs the results from its previous run the next time it starts, volumes are going to be helpful. Volumes can be shared not only by the same kind of container between runs, but also between different containers. If you have two containers and you want to consolidate their logs in one place, for example, a shared volume could help with that.

Volumes live on the host system, in an area managed by Docker rather than inside any one container's filesystem. Suppose an application inside a container has access to some directory via a mounted volume: that same volume can also be mounted elsewhere without changing any files inside or outside the container itself. This makes it possible for many applications to share a single set of files without conflicting paths between containers compromising security or correctness.

  • Depending on the storage driver backing them, volumes can be resized as needed.

  • Volumes support snapshotting (again, driver-dependent) to create backups of your data at any time.

  • Containers can mount an existing volume when they start, allowing for easy debugging and testing against existing data without rebuilding that state from scratch.

Volumes also avoid coupling your containers to the host's directory layout. A bind mount ties each container to a specific path outside the container, which adds host-specific assumptions to your application's setup and can introduce bugs or other undesirable behavior due to issues with file permissions.

Volumes are the preferred mechanism for persisting data generated by and used by Docker containers. While bind mounts depend on the host machine's directory structure and OS, volumes are completely managed by Docker. Volumes have several advantages over bind mounts:

  1. Volumes live in a Docker-managed part of the host filesystem, independent of any container's writable layer, which means the data persists even when the containers using it are stopped or removed.

  2. Volumes provide easy access to files from within a container. You can mount a volume directly into your container, and with volume driver plugins you can even back it with remote storage. This makes it easier to build robust applications that scale based on their runtime requirements.

  3. Volumes allow for quick removal of data once it's no longer needed (for example, with docker volume rm or docker volume prune), so resources aren't wasted when they aren't being used anymore.
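
Here's a quick sketch of a named volume surviving between runs (the volume name and paths are illustrative):

```bash
# Create a named volume and let one container write into it.
docker volume create app-data
docker run --rm --mount type=volume,source=app-data,target=/data alpine \
  sh -c 'echo hello > /data/greeting'

# A brand-new container sees the data the previous one wrote.
docker run --rm --mount type=volume,source=app-data,target=/data alpine \
  cat /data/greeting
```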

Dockerize a Simple Node.js application

Let's walk through the Dockerfile from earlier, line by line. FROM node:12-stretch picks the base image we build on. USER node switches from root to the unprivileged node user that the official image provides, so the app doesn't run with more power than it needs. WORKDIR /home/node/code sets the directory that all later instructions (and the running container) operate in. COPY --chown=node:node index.js . copies the application into the image and makes the node user its owner, so it's readable at runtime. Finally, CMD ["node", "index.js"] declares the command the container runs when it starts.
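
To build and run it (the tag is arbitrary, and the port mapping assumes the app listens on port 3000; adjust it to whatever yours uses):

```bash
docker build -t my-node-app .         # build the image from the Dockerfile
docker run -p 3000:3000 my-node-app   # map host port 3000 to the container
```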

Development Tips
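
One tip that ties the pieces above together: during development, use a bind mount for your source code so your edits show up in the running container immediately, and only bake the code in with COPY when building images for production. That way you get fast feedback locally and reproducible artifacts when you ship.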

Conclusion

Whether you're considering using containers for your next project or simply want to get a deeper understanding of them, having this information at your disposal will help you make the right decisions regarding containerization. Thank you for reading this article, and good luck with your project!
