Whether you’re aware of it or not, you use distributed systems every day. You’re using one right now, assuming that you’re reading this article on the internet. A distributed system is a system in which multiple devices are networked together without a global clock, and with concurrent components that can fail independently.
Over time distributed systems have grown more complex and demands for real-time concurrence have increased. These demands have spawned the development of new tools for storing data, such as etcd.
What Is etcd?
etcd is an open-source, distributed, persistent key-value store; a type of NoSQL database. It is used as the basis of many distributed systems. etcd was designed to fully replicate data and to be highly-available, consistent, secure and fast. You can use it on a variety of operating systems, including Linux, BSD, and OS X.
A key-value store enables you to store unique “individuals” and their associated data in individual documents. This is similar to a relational database. However, it enables you to more easily scale data since changes to one document do not affect the whole. For example, if one document requires a key-value pair that no other documents required.
In a traditional database, you would have to create a new column or row for this one value. In a key-value store, you can just add the key-value pair to the single document that needs it.
You deploy etcd as a cluster of nodes with communication between nodes managed by the Raft algorithm. This algorithm enables you to implement distributed consensus to ensure that all nodes maintain consistent information.
In an etcd cluster, nodes can exist in a follower, candidate, or leader state. Followers take information and report to the leader. Candidates ask to be elected as the leader when no leader is present. Leaders distribute information to followers and ensure consistency.
There are two stages that occur in ectd clusters, as explained below.
Leader Election
When an etcd cluster is initiated, all nodes start in a follower state. Each follower node is assigned a randomized time that it will wait to hear from a leader. If no leader is present, the first follower node to time out becomes a candidate and sends a request to the other nodes to be elected leader.
As long as a quorum (n/2+1) of nodes vote yes, the candidate is elected leader. Followers vote yes for the first leader request received. If a quorum can’t be reached, all node timers are reset and the process starts again. Once a leader is established, it sends a heartbeat signal to all follower nodes to indicate its presence.
Leader election ensures that there is only ever one leader node. It also enables the cluster to self-heal if a node fails. To ensure that a quorum can be reached and to provide a measure of resilience, it is recommended to use five nodes per cluster.
Log Replication
With etcd, all system changes must go through the leader. Changes can be submitted to a follower node but won’t be committed without leader approval. When a change is provided directly to the leader, the change is logged by the leader but not committed. This change is sent out to all follower nodes, which log the change and send confirmation to the leader. After a quorum of nodes send confirmation, the leader commits the change and notifies the followers to commit.
The first part of the process differs if changes are submitted to a follower or a follower regains connection to a leader. In the first case, the follower submits logged but uncommitted changes to the leader. The leader then follows the standard process. If a follower is newly connected or reconnected to a leader, it submits its current committed and uncommitted states to the leader. The leader then notifies the follower if the states match or what needs to be changed.
Log replication ensures that data remains consistent across nodes. It also enables the cluster to regain consistency after a loss of communications between nodes.
Using etcd in Kubernetes
The best-known instance of etcd use might be in Kubernetes. Kubernetes is an open-source container orchestration platform. It is a distributed system you can use to configure, deploy, and orchestrate containers. Kubernetes uses a master and nodes process, which operates within a cluster. Each node has a number of pods, which contain a number of containers.
You use etcd in Kubernetes as a backend for service delivery. It is one of the master components used for cluster management. You can run etcd in an external cluster or as a pod on your Kubernetes master. If you run it as an external cluster you can benefit from an extra layer of security and resiliency due to its isolation from the master.
Using etcd, you can store and replicate the state of your Kubernetes clusters. The state information you store includes cluster configuration, status, and cluster specifications. During deployment, nodes access etcd to learn how you want your cluster configurations maintained.
In production environments, you typically operate etcd using etcadm. You can also control it using etcdctl, a command-line tool.
Conclusion
When it comes to distributed systems, etcd provides scalability and flexibility that are not possible with traditional databases. While etcd may not always suit your needs, it is an excellent tool to be aware of. If, however, you are using Kubernetes, understanding etcd is a basic requirement.
Hopefully, this article helped you understand what etcd is and how it works. If you’re a Kubernetes user or considering becoming a user, this understanding should make learning to manage your system a bit easier.
Top comments (0)