DEV Community

Cover image for Reaching Consensus in a Distributed Network: The Byzantine Generals Problem.
Mitchyugan
Mitchyugan

Posted on

Reaching Consensus in a Distributed Network: The Byzantine Generals Problem.

Blockchain technology is ever evolving and many are adapting it in various fields and industries, but a known challenge in distributed networks which is provably unsolvable but can nonetheless be reduced, is how to reach consensus or an agreement, In a distributed network without authorities, we need a process to reach consensus about what is to be considered the truth. This is referred to as distributed consensus.

This problem is commonly known as the Byzantine Generals Problem. An overview of well-known blockchain solutions to this provides perspective and insight into the options the interchain offers because it allows a network designer to choose the consensus algorithm that is best suited to the developer's purpose.

The Byzantine Generals Problem

In the traditional description of this problem, generals whose armies are spread around a target city need to reach a consensus on a time to attack. The generals can only rely on unsecured communication channels to achieve this.
A lack of acknowledgment can either be caused by a failure to deliver a message, by a dead general, or by a failure to deliver the acknowledgment. While there are variations of the problem description to adapt to varying real-world fault-tolerance situations, most descriptions include an element of catastrophe if the generals fail to coordinate their actions.

ByZantine Fault tolerance Image

The practical challenge of blockchain reflects this hypothetical. Similar to the generals who must decide when to attack, the agreed transaction list in a distributed ledger has to be identified, and consensus on the correct order of transactions has to be reached.

Network failures in a Byzantine sense are:

  • Fail-stop failures: a node fails and stops operating.
  • Arbitrary node failures: a node fails to return a result, responds with an incorrect result deliberately or not, or responds with a result different from other results in the network.

When a distributed network reaches consensus even with nodes failing to respond or responding with wrong information (i.e. faulty nodes), it has Byzantine Fault Tolerance (BFT). BFT ensures that a distributed network can continue to work even with faulty nodes, as the consensus is a result of collective decision-making.

Reaching consensus

In addition to the chain of blocks, one of the innovations introduced by Bitcoin was to use Proof-of-Work (PoW) to reach consensus. Since then, many other consensus algorithms have been proposed and employed. note that there were many consensus algorithms before Bitcoin.

Many teams have experimented with distributed consensus using many algorithms deployed to different chains. The algorithms that exist can be compared in terms of the distribution of authority, expected performance, and transaction finality.

Transaction finality can be categorized as:

  • Probabilistic: It is exponentially more improbable that a seen transaction will be reversed.
  • Deterministic: The protocol prevents alterations in a definite way.

Recall that in blockchain individual transactions are sent to the network from individual nodes, Each node must pass transactions to other nodes. Because of the time delay required for data to travel across the network, not all nodes will see the same transactions at the same time.

Each node must therefore build its own order of transactions. Since all nodes participate equally, there is no authoritative order of transactions. Nevertheless, the network must decide which node's version (or any version) of the truth will become the authoritative truth, also known as the canonical chain.

There are several ways different blockchain networks are approaching this and we will go over some of the most popular consensus algorithms:

Proof-of-Work (PoW)

In a Prof-of-work (POW) consensus mechanism, nodes complete a task of arbitrary difficulty which when combined with ordered transactions in a block yields a hash function result that matches certain criteria. Finding such a solution is taken as evidence of considerable effort, meaning proof that considerable work must have been invested in the search, hence the name.

The network's nodes, aka "miners", conduct their searches independently. The successful node that announces a solution first receives an economic reward to encourage participation in the process.

If a node wishes to attack the network it must first have an authoritative position, This known attack vector is called the 51%-attack. As the name suggests, if a node acquires more than 50% of the total problem-solving capacity of the network, that node is theoretically able to alter the consensus.

The amount of work required to mine a block is called difficulty. PoW networks set a target average time for a solution to be found by any node on the network. The difficulty adjusts to compensate for increasing/decreasing total network problem-solving capacity. PoW networks do not get faster as more computing capacity is added. They become more resilient by increasing difficulty, which raises the threshold a 51%-attacker needs to overcome.

Proof-of-Stake (PoS)

Proof-of-Stake (PoS) is another method of selecting the authoritative node or validator for a given block, based on the assumption that those with the most to lose are the most incentivized to safeguard network integrity.

Validators must face a disincentive for bad behavior as well as a high probability that bad behavior will be detected. The burden of detection usually falls on the rest of the network validators, who can either accept or reject the validator's opinion. Validators place their staked funds at risk and face economic penalties for emitting opinions that are ultimately rejected by sizable numbers of other nodes, this also includes slashing off a percentage of its staked token and suspending it for a while. This makes a validator to generate blocks that are likely to be accepted by the network.

PoS validators are selected in a pseudo-random fashion, with a larger stake producing a higher probability of being selected to generate a block. While PoS systems generally reward validators with new coins for honest behavior, so-called block rewards, validators also receive transaction fees in return for generating blocks that the rest of the network accepts. A criticism of PoS is that it tends to centralize economic rewards in a small number of already wealthy nodes, those validators with sufficiently high stakes to win selection.

Delegated-Proof-of-Stake (DPoS)

An extension of Proof-of-Stake algorithms is Delegated-Proof-of-Stake (DPoS). It is used for example in BitShares, Steem, Lisk, EOS, and Tron.

In this type of consensus mechanism, so-called witnesses/validators are elected by the stakeholders of the network to secure the network, who "vote" by staking their own tokens upon a given witness/validator. Afterward, several witnesses are chosen for the block creation so that they represent at least 50% of the stakeholders' votes, thereby ensuring a majority consensus regarding new blocks.

Image of a Witnesses and Votes given to them

Witnesses/validators are paid for their services, receiving fees for creating and validating blocks; those who voted for an active witness typically receive a share in these rewards proportional to the value of the tokens they staked. As a result, the operation of DPoS networks tends to be more economically balanced than pure PoS.

Because the number of witnesses is limited, If a witness misbehaves, the network's community can withdraw their votes. Witnesses that no longer hold enough votes lose their income basis.

Alongside ascribing the role of witnesses to some participants, DPoS networks also elect delegates. Delegates are a group of participants that supervise network governance and performance, and propose changes that are then voted on by the entire network.

Many consider DPoS algorithms superior to PoW and PoS because of their fast block creation, high degree of security, energy efficiency, level of integrity, and democratic structure.

However, DPoS systems are less decentralized than PoW and PoS systems, because they have fixed validator sets and higher barriers to entry. In pure PoS systems, there is no fixed validator set and the number of potential validators that can participate in the consensus mechanism is dependent on the total supply of tokens in circulation.

Proof-of-Authority (PoA)

Proof-of-Authority (PoA) is a simple mechanism that solves faster block times by relying on validators that are pre-selected based on the belief that they are trustworthy. In other words, validators are allowed to produce blocks based solely on their pre-existing authority - they do not need to engage in the practice of competitively searching for a nonce, which makes PoW networks consume high energy.

This benefit of PoA networks is balanced despite the criticism that everyone needs to trust the PoA validators in order to trust the overall network and blockchain. Such centralization of authority might be considered contrary to the motivations originally driving the creation of a blockchain network.

Practical Byzantine Fault Tolerance (pBFT)

Practical Byzantine Fault Tolerance (pBFT) was first introduced in 1999 and arose from academia.

You can access the 1999 paper by Miguel Castro and Barbara Liskov on pBFT Here There is also an intuitive presentation you can look at.

In pBFT, a replica is a network node that maintains a full copy of the ledger state. pBFT is a three-phase protocol. Periodically, a "primary" replica is selected to create a new block, and it broadcasts a message containing an order of transactions to the other replicas for approval. When a threshold of agreement is reached, the message is verified. The broader network is informed about transaction blocks when they are finalized (i.e. "signed by sufficient replicas").

pBFT brings two main advantages to consensus on blockchains:

  • Energy efficiency: consensus is achieved without having to conduct expensive computations, such as in PoW algorithms.
  • Transaction finality: once a block is finalized it is immutable, as are all the included transactions.

This of course is a very simplified presentation!

CometBFT and consensus in the interchain

While most consensus implementations are tightly coupled with a particular blockchain project, the interchain focuses on simplifying the process of creating blockchains (among others) by offering a decentralized consensus mechanism called CometBFT.

CometBFT's engine runs its own public blockchain. It differs from Ethereum on its consensus protocol, as it uses the concept of validators who need to bind funds to validate, and who then validate blocks over the course of a certain number of rounds.

CometBFT offers the most mature Byzantine Fault Tolerance, transaction finality, and a great deal of flexibility - flexibility that is passed to the developers of custom blockchains built with the Cosmos SDK.

Further Reading

Castro, M. & Liskov, B. (1999): Practical Byzantine Fault Tolerance.

Castro, M. & Liskov, B.: Practical Byzantine Fault Tolerance presentation.

Curran, B. (2018): What is Pracictcal Byzantine Fault Tolerance? Complete Beginner's Guide

Kwon, Jae (2014): Tendermint: Consensus without Mining

Vasa (2018): ConsensusPedia: An Encyclopedia of 30 Consensus Algorithms. A complete list of all consensus algorithms

Vitalik Buterin (2016): On Settlement Finality

Witherspoon, Z. (2013): A Hitchhiker’s Guide to Consensus Algorithms. A quick classification of cryptocurrency consensus types.

Top comments (0)