Docker Swarm is a container orchestrator, meaning it allows you to manage a cluster of Docker Engines. The typical way to manage a single container on one host system is to use the Docker Command Line Interface (CLI).
However, managing one container on one machine is a pretty limited use of containerized applications. When you want to manage multiple containers deployed on multiple hosts, the CLI falls short, and it’s necessary to use a dedicated orchestration tool like Docker Swarm.
Docker Swarm facilitates multi-container workloads deployed on a cluster of machines, thus extending the capabilities of Docker containers. More specifically, Docker Swarm’s use cases include:
- Allocating tasks to container groups
- Managing the lifecycle of individual containers
- Scaling a cluster of containers up or down depending on workload
- Providing failover in the event a node goes offline
This article outlines some of the main Docker Swarm concepts in addition to detailing some best practices for this orchestration tool. For a more detailed guide, check out this Docker Swarm 101 wiki page.
Swarm mode is the functionality built into Docker Engine for running a cluster and distributing tasks across it. Docker Swarm mode uses SwarmKit libraries and functionality to simplify and secure container management over multiple hosts.
A swarm is a group of Docker hosts that run in swarm mode, operating as a single virtual host.
A node is an individual Docker Engine that is part of a swarm. You can run many nodes on a single computer, but more typically nodes are distributed across multiple machines. Nodes can act as managers, leaders, or workers.
A task is the smallest scheduling unit in a swarm. Each task carries a container and the commands used to run it.
A manager node assigns tasks to worker nodes and helps manage the swarm.
One node is elected as a leader node using the Raft consensus algorithm. The leader node conducts orchestration tasks in the swarm.
Worker nodes simply receive and execute tasks designated by the manager nodes. Note that manager nodes function as both managers and workers by default.
Like any good tool, Docker Swarm is most useful when combined with a set of best practices that make the most of its capabilities and keep your cluster running smoothly.
It’s important to establish a staging platform that runs as a replica of the production configuration you will use, with a cluster of Docker containers running in swarm mode. This way, you can prove that the container cluster runs in a stable manner and avoid potential production issues.
It’s imperative to closely monitor your Docker swarms for issues that may lead to failures, such as excessive memory usage or network overload. Good monitoring tools for container orchestration should let you apply custom rules, such as automatically shutting down nodes in the event of failures.
You start a swarm by creating a single-node swarm with the docker swarm init command. That single node automatically becomes the manager node for the swarm. Initializing the swarm generates two types of tokens for adding more nodes: join tokens for workers and join tokens for managers. You can display either one again later with the docker swarm join-token command. It’s important to safeguard the manager token because once a node becomes a manager, it can control the entire swarm.
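As a rough sketch of that workflow, the commands can be wrapped in small shell functions. The function names and the example address 192.0.2.10 are placeholders chosen for illustration; the docker subcommands themselves are the standard swarm ones. Rotating the manager token is included on the assumption that you want a way to invalidate a token that may have leaked.

```shell
# Create a single-node swarm; this node becomes the first manager.
# "$1" is the address other nodes will use to reach this manager.
init_swarm() {
  docker swarm init --advertise-addr "$1"
}

# Print the full join command for a role: "worker" or "manager".
show_join_command() {
  docker swarm join-token "$1"
}

# Invalidate the current manager token and generate a new one,
# for example after the old token may have been exposed.
rotate_manager_token() {
  docker swarm join-token --rotate manager
}
```

For example, `init_swarm 192.0.2.10` followed by `show_join_command worker` prints the command to run on each machine you want to add as a worker.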
Sometimes a manager node becomes unavailable, meaning it can no longer communicate with the other manager nodes. It’s important to replace these nodes either by promoting an existing worker node to a manager, or by adding a new manager node.
Either way, regularly run the docker node ls command from a manager node and look closely at the MANAGER STATUS column for nodes listed with an Unavailable value (the CLI output typically shows this state as Unreachable). You can identify worker nodes because their value in this column is blank.
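One way to automate this check is sketched below. It assumes that any manager status other than Leader or Reachable deserves attention, which sidesteps the exact wording of the unhealthy state. The function names and the `|` separator are illustrative choices; `.Hostname` and `.ManagerStatus` are format fields of docker node ls.

```shell
# Reads "hostname|manager-status" lines on stdin and prints the hostname of
# every manager whose status is neither Leader nor Reachable.
# Workers have an empty manager status and are skipped.
unhealthy_managers() {
  awk -F '|' '$2 != "" && $2 != "Leader" && $2 != "Reachable" { print $1 }'
}

# Runs the check against the live swarm; call this on a manager node.
check_swarm_managers() {
  docker node ls --format '{{.Hostname}}|{{.ManagerStatus}}' | unhealthy_managers
}
```

If `check_swarm_managers` prints any hostnames, you could promote a healthy worker with docker node promote, or join a new manager, to restore the manager count.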
After you create a service that you want your containers to run, you can configure further settings for that service using the docker service update command. Specific options include setting environment variables for the service, reserving memory or CPU for it, and choosing how the service should roll out updates. Note that in Docker Swarm you have either global services or replicated services. A global service runs one task on every node in the swarm, such as an anti-virus tool. A replicated service runs a specified number of identical tasks, such as three replicas of an Apache web server.
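The sketch below shows what these options might look like on the command line. The service names (web, agent), the APP_ENV variable, and the resource values are made up for illustration, and the image for the global service is passed in as an argument because the article does not name one. httpd is the official Apache image on Docker Hub.

```shell
# Replicated service: three identical tasks running the Apache image.
create_replicated_web() {
  docker service create --name web --replicas 3 httpd
}

# Global service: one task on every node. "$1" is the image to run.
create_global_agent() {
  docker service create --name agent --mode global "$1"
}

# Adjust an existing service: add an environment variable, reserve
# resources, and roll out changes one task at a time with a 10s pause.
tune_web() {
  docker service update \
    --env-add APP_ENV=production \
    --reserve-memory 256M \
    --reserve-cpu 0.5 \
    --update-parallelism 1 \
    --update-delay 10s \
    web
}
```

Treat the numbers as starting points only; suitable reservations and update delays depend on your workload.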
To better leverage the Raft consensus algorithm’s fault tolerance for your swarm, try to maintain an odd number of manager nodes at all times. An odd quantity of manager nodes helps to ensure a quorum remains for processing requests in the event of manager node failures.
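The arithmetic behind this advice is easy to check. Raft needs a majority of managers to agree, so a swarm of N managers needs N/2 + 1 of them (rounded down) and can lose (N - 1)/2 (rounded down). The short script below, with helper names chosen for illustration, prints both values for a few swarm sizes.

```shell
# Managers that must agree (a majority) in a swarm of N managers.
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# Manager failures the swarm can absorb while keeping that majority.
tolerated_failures() {
  echo $(( ($1 - 1) / 2 ))
}

for n in 3 4 5 6 7; do
  echo "managers=$n quorum=$(quorum "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

The output shows that four managers tolerate one failure, the same as three, and six tolerate two, the same as five. Adding an even-numbered manager raises the quorum without improving fault tolerance.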
If a node becomes compromised from a cyber attack, it’s useful to be able to remove it from the swarm without hesitation. In Docker swarm mode you can forcibly remove compromised nodes by using the following command:
$ docker node rm --force <NODE>
Replace <NODE> with the ID or hostname of the node in question, as listed by docker node ls, to forcibly remove that compromised node from the swarm.
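A fuller response might also rotate the join tokens, on the assumption that a compromised node could have exposed them. The sketch below does both; the function name evict_node is hypothetical. Note that a manager node generally has to be demoted with docker node demote before it can be removed.

```shell
# Force-remove a node, then rotate both join tokens so that any token the
# node may have seen can no longer be used to join the swarm.
# "$1" is the node's ID or hostname as shown by `docker node ls`.
evict_node() {
  docker node rm --force "$1" &&
    docker swarm join-token --rotate worker &&
    docker swarm join-token --rotate manager
}
```

Nodes that have already joined are not affected by the rotation; only future joins need the new tokens.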
Docker Swarm is a useful container orchestration tool that simplifies scaling container clusters and lets you manage groups of containers deployed on multiple hosts as a single virtual host.
By following the best practices outlined in this article, you can get the most from Docker Swarm mode and avoid potential performance and security issues.