Caveat Emptor - If you are concerned about the technical complexity of the article. Do not worry. I have tried to keep it jargon-free. So go ahead, enjoy and share your feedback.
Credits: Thanks to Prateek Reddy Yammanuru for insights and reviews.
In this article, we deconstruct some of the similarities and differences between two of the most popular Blockchain Platforms - Hyperledger Fabric v2 and Ethereum v1 (as of 10th Oct 2020). The motive is simple. We have much material explaining the workings of each of these ledger platforms. However, there still does not exist a deep-dive comparison of these ledger platforms. 🧐
Most ledger platform comparisons that I have stumbled discuss core features of the platforms like Consensus, Permissioning, smart contract language, and the likes. Personally, as a technology geek and a person who likes to dive deep and ask a lot of questions, these articles do not quench my thirst. 😅🍺
So, as an aspiring Software Architect, I decided to make an attempt to understand these platforms deep and explore the fundamentals that make these Platforms tick and what claims the fame of these ledger platforms.
I attempt to explore the architecture in four distinct areas. The overall functions that these ledger platforms provide, the architectural characteristics (also known as the non-functional requirements (NFRs)/ Quality Attributes / "ilities") that these functions are expected to exhibit, the architectural topology or simplified view, and finally some of the critical architectural decisions that were considered.
This will be a six-part series of articles where I will attempt to take each of the fundamental concepts of a Blockchain Platform and attempt to expound them as it is architected in each of the platforms with the rationale behind it. I will also try to provide an essence of the key architectural considerations made while designing the platforms.
I will try to explore the following areas, and in each, I try to cover three fundamental aspects. First is the need, second is key architectural considerations and finally the comparison. My goal is to be objective and non-partisan. For this, I will share the references to the sources from where I have obtained and compiled the information.
|Distributed Systems 101||Part 1|
|Overall Architecture||Part 1|
|Block Architecture||Part 2|
|Ledger & Storage||Part 2|
|Identity Management||Part 3|
|Transaction Lifecycle||Part 4|
|Smart Contracts & Ledger State||Part 4|
|Consensus Process||Part 5|
In this section, I attempt to break down the fundamental cogs that drive the Fabric Blockchain Operating system from the Ethereum Blockchain Operating system. Although mouthful it may sound, both of these platforms are irrefutably distributed in nature first - then decentralized. Hence all principles of distributed systems apply here. I believe it is fundamental to reason from the first principles of these.
If you are not too familiar with distributed system principles, I cannot recommend this article enough for a quick reckoner on Distributed systems.
But let me take a quick plunge into the fundamentals of distributed systems:
- Distributed systems are composed of nodes or computers that are capable of computing (calculating), storing, and transmitting data.
- These computers are Unreliable and so are the communication channel(s) that connect them. Any or both can crash or malfunction at any point in time.
- Distributed systems do not have a global clock. Ever tried setting two electronic clocks precisely to the second (do it with two computers to the millisecond).
- Distributed systems are complicated. One does not design a distributed system if there was a monolith computer with infinite compute and storage.
- All systems are eventually intended to cater to Functional and Non-Functional needs. Non-Functional needs a.k.a. Quality Needs. Basic quality needs of any distributed system include Scalability, Performance, Availability, and Fault Tolerance.
- For reliability and availability, one needs to replicate data. Where there is replication involved in distributed systems, the famous CAP theorem applies. It says that you cannot have all three properties in a distributed system (which replicates data) at the same time, i.e. Consistency, Availability and Partition Tolerance. Try achieving any two, and you have to trade-off the other.
Let us start with the needs of a Blockchain Platform, discuss the rationale behind the architectural characteristics.
Any blockchain system is fundamentally a distributed database in the most simple terms. It needs to be consistent, that means all data in all the computer nodes has to be the same exactly. Note here; I am not emphasizing on time here. Consistency can be immediate or eventual. To replicate the state of the database, State Machine Replication is used. There are two ways to do replication. Active Replication and Passive Replication. Active replication is where every database server processes the request in a deterministic way (i.e. everyone produces the same output on a single input). To guarantee that all servers produce the same output, it has to be ensured that all servers receive all the inputs in the same order. This is ensured by Atomic Broadcast. Atomic Broadcast says that the input should be received either by all servers or none.
Passive replication is more of a dictatorial in nature, which means that there is one master server executing the request and updates other servers with the execution result without blocking the client request.
Second-generation blockchain platforms work on the outcome of Smart Contracts, which is the custom business logic for determining the outcome of the transaction. However, there are subtleties involved here, and they are:
- Who needs to execute the smart contracts and why is there a need to agree on the outcome?
- What goes inside the ledger?
- Is the agreement on the smart contract outcome different from the State Replication itself?
Let us try to answer these fundamental questions subjective to each ledger platform.
Users who do not usually trust each other on the outcome of a transaction executes the smart contract. This helps one to verify the outcome is genuinely computed and has not been tampered. This dates back to the fundamental underpinnings of Bitcoin and cryptocurrency. Bank serves as the trusted arbiter to hold account balances for everyone. All banks universally operate on the same set of rules to maintain account balances. But if a nation or government or bank goes rogue, then what? This where secure computation comes into the picture. Smart contract execution ensures that every bank (hypothetically) executes the same logic and matches the outcome.
Note the subtlety in the last line? Every bank needs to execute the contract to agree on the outcome. So, does this mean that a civilian like you or me should compute the outcome for every other transaction? Probably no. Why should you be concerned about a transaction happening between other parties? But Bitcoin and Ethereum enforce this. It mandates that all participating nodes must agree on the outcome of the transaction, even if one is not involved in the transaction.
Moreover, there is not any need to bring store the transaction outcomes as well. This is the inception of the concept of Permissioned and private Blockchain platforms. So, if we delve into this a bit deeper, we would realize that:
- Not everyone needs to store all transactions, and others would not be comfortable with sharing the data.
- Even a lesser proportion of the population would require actually to execute a transaction and agree on the outcome.
- Comparing 1 & 2, we can realize that there would be people who would be observers who are interested in storing the agreed outcome but not executing the transaction itself.
So we can deduce that the state machine replication should be separated from the agreement on the outcome of the transaction by Smart Contract execution.
The ledger should be composed of all the outcomes of transactions checkpointed using a specific mechanism called blocks, unlike databases, where checkpoints are granular to the level of transactions itself. Checkpoints in Blockchain are Blocks which comprises the transactions.
The ledger then should contain all the delta updates (as transactions) that happened to the database checkpointed as blocks. Nodes in the network can either choose to store an irrefutable record of data and consent on the outcome of transactions (by executing smart contracts on the ledger data) or choose to store data as an observer, like a bank computer vs a bank vault. The ledger is merely the bank vault.
This is a question related to the previous one. And the answer is yes. Everyone interested to agree on the outcome of the transaction needs to have an irrefutable copy but so do people who are just interested in a copy of the ledger. So, the State Machine Replication is what guarantees that all interested parties would get the replica of the ledger.
Now, that we have established some fundamentals, we must realize that when we perform a transaction, we want both the steps to happen atomically (all or none). That means that relevant parties should agree on the outcome of the smart contract execution, and the appropriate parties should get a replica of the outcome of the transaction.
As we know, Ethereum is famous for both dApps (i.e. it's the capability to host decentralized apps, leveraged by smart contracts) as well as it's acclaimed cryptocurrency. The backbone of trust in Ethereum is its backed cryptocurrency that allows to codify game-theoretic security and innoculate trust in the dApps. Likewise, the backbone of trust in Hyperledger platform is the consortial identity binding.
In other words, the idea of trust is backed by some stake (like money, reputation, something that is dear to someone). In a public blockchain network which aims to be secure and conserve the privacy of the individual, money is at stake. In contrast, in a permissioned (i.e. known environment), the identity or the reputation is at stake.
So, in a public network, to maintain the cryptocurrency or to enable cryptocurrency backed trust, all participants must host the ledger and run the base cryptocurrency codebase. And since the cryptocurrency is not bounded, its scope mandates every participant to run all the smart contracts ever hosted on the ledger. This essentially means that if you want to do transactions on the ledger, backed by the crypto trust; you have to vet not only who owns how much cryptocurrency but the transactions that alter the balances of the cryptocurrency as well - in this case, smart contract transactions.
But in a permissioned network, participants are known to each other (i.e. the identity and reputation are at stake). So a rogue actor would think twice before ruining its own reputation.
This gives a differentiating (edge) factor to permissioned chains. In the early days, communities used to serve the function of law and order. This is because the individuals of a community were tightly knit within that community, and the transactions were strictly within the community. As Globalization became more prevalent and transactional boundaries widened, central law and order became the need of the hour. A similar concept applies to consortial and public blockchain platforms.
Let us explore how these distinct requirements shaped the architecture of both the ledger platforms.
This title "Order-execute vs Execute-Order-Validate" is borrowed from the official Hyperledger Fabric paper. Although I do not fully agree with the semantics, I still choose to borrow it for better relatability.
Let us break down each of these steps agnostic to any platform, and understand what these mean (not necessarily in that order).
This is the fundamental step that tells which transactions should come first and what comes later. As a matter of fact, this is a really difficult (maybe even impossible) problem to solve. Remember distributed system fundamentals? If we have to find an order, we have to find a way to measure order. Time is out of the equation first because there is no global clock. In Distributed systems (especially Decentralised systems) it is not possible to decide on the order of N transactions taking consent of all the stakeholders (or at least the majority). It would be extremely inefficient to get everyone to agree on the order of transactions. So who decides the order? Well, someone dictates, and everyone agrees. But who dictates and the legitimacy of the ordering comes at a cost (at least, and incentive in some instances) which we would discuss more in our Consensus discussion.
To decide on the next state of the ledger, one has to execute the transactions and derive on the next state unanimously. So, we saw that the order step has to order the transactions in a block. After executing the set of transactions in the next proposed block, everyone arrives at the next state (or the next block). Note that this transaction is sequential, which impacts the overall throughput. However, this can hardly be cited as a disadvantage because of the universality of the transactions and the ledger state.
This step encompasses the order and the execution step. When the transaction first is constructed, it has to be verified of the semantics, account balance, transaction fee paid (if applicable). Likewise, when the transaction is executed on the current state after the block is propagated, it has to be checked for proper state validations like dirty or phantom reads, race conditions, and header validations.
Now let us contextualize each of these steps for Ethereum and Fabric by examining the transaction flow of each of the platforms and earmarking each of the above steps in the overall flow.
Ethereum - Transaction Flow
The forward transaction flow involves the flow of the transaction starting from the client (here a mobile dApp). The immediate full node validates the transactions for necessary checks like checking the transaction nonce, checking sufficiency of the gas price (transaction fees), transaction semantics, etc. The transactions continue to move through the network in the transaction pool (with its brothers) from full node to full node until it stumbles across a miner. Keep a note that the transaction has not been executed.
Like this, the transaction reaches other competing miners too who also want to compete to mine the block. So one lucky miner who solves that PoW puzzle first gets to propose the block. I will discuss this in detail in the Consensus architecture post. However, three things happen at the miner and in order:
- Order - The miner orders the block based on a predefined strategy. Mostly this strategy is one that yields him the maximum revenue per block.
- Execute - The miner executes the transactions to ensure the transactions correctly apply against the current state (as perceived by the miner) of the ledger, and it generates logically correct the next state for each account (this includes Externally Owned Accounts and Contract Accounts). If some transactions do not generate logically correct next state, the transaction is dropped.
- PoW and Broadcasting - The final set of transactions obtained in the previous step is put in a block, the PoW nonce is computed, and the block is constructed with all relevant details and broadcasted to the nodes.
- Validate - Each transaction is again vetted against the current perceived state in each full node before it accepts the block. Conceptually, the full node has the liberty to reject the block, but this does not happen in practice, because the miner spends stake (compute) to mine a block, it is not rational for him to mine a block which won't be accepted by other nodes.
Why do I say perceived state? Because each full node can run independently on a different blockchain. But ultimately the chain merges using a protocol called GHOST (Greedy Heaviest Observed Subtree).
So if you observe the overall flow is not just "Order > Execute" but is "Validate > Order > Execute > Validate" considering all the nuances.
Hyperledger Fabric - Transaction Flow
Like we did for Ethereum's transaction flow, the transaction flow for Fabric can be decomposed into two parts. Before ordering (mining) and after ordering. In the first half, the client selectively broadcasts the transaction to organizations for endorsement (endorsing peers).
Few things to take note here. First, unlike the ocean of full nodes in Ethereum, the nodes who execute the transaction is known upfront. This means the rule to determine a transaction as valid, i.e. who is needed to endorse or approve a transaction is decided upfront. Parties not interested in the endorsement but just the transaction can only receive the block. So postponing the execute till the last is not an option. But the validation can be delayed because it is the ultimate step.
Each of the endorsing peer validates the Transaction Execution Proposal with authenticity and authorization checks. Then the transaction is executed on the current state of the ledger. Each endorsing peer generates signed "Read-Write Sets" as a response. The endorsing peers have the liberty not to respond too. The "Read-Write set" is that delta that adds on the current state of the ledger to determine the next state of the block. The onus of executing the actual transaction is only to the endorsing peers who want to execute and compute the next state of the ledger is actually on who is interested. So this sort-of-voluntary mechanism eliminates the need for any gas or transaction fees which was otherwise introduced in Ethereum to cater to the Halting Problem. However, the halting problem is still not fully solved. We will discuss more how the halting problem is further catered in the Consensus article. (sneak-peek: Fabric encourages each organization to host its own source code for the smart contract, thereby eliminating the need to trust smart contracts written by others. So, the more complex or bad code your organization develops, the more compute you spend).
In the end, the client collects these endorsement responses and sends them to the orderer along with the original transaction proposal to the orderer.
The client sends the transaction to a predetermined orderer or miner. Why is this the case? This is because the ordering is a crucial task. There is more accountability than reward here. Because of such high responsibility, Fabric introduces a separate concept (and status) of Orderer Organization who is identified as the organization authorized to order transactions of a consortium. Although we may argue about the incentive and stake in such a private blockchain network, we park that discussion for a different thread. When the transaction and other transactions reach the orderer, the transactions are ordered (based on the orderer consensus algorithm, like Raft) block can be formed based on some set parameters like Block size, block time and the number of transactions in a block.
Once again, the temporality relation of these transactions is entirely subjective to the orderer.
Next, these transactions are broadcasted to all peers, including ones who did not execute the transaction (committing peers). An endorsing peer is a committing peer too.
The transaction is then validated to check for sufficient endorsements (if the required stakeholder organizations have responded with the "Read-Write Set"), version checks against the current state (done through Multiversion Concurrency check) and most importantly, if all the signed endorsement responses (Read-Write Sets) are the same. This is the consensus on the next state of the ledger. There is no scope for inconsistent of ledger states here.
This is because, in a private environment, where you have already predetermined who would endorse the transaction and prevalidated the transactions, there is no second scope of perceiving different states of the ledger.
Once validated, the block is either appended if correct or rejected in full if the validation fails.
Moreover, since this entire process happens in a microcosm of the ledger called the channels, the peers, orderers and data can be scoped out based on the privacy needs of the consortium. Channels being the microcosm greatly facilitate the confidentiality and the performance of the ledger. This means every channel hosts its own smart contract, requires a subset of the endorsing peers which means unrelated parallel operations is possible.
Summarising, the overall flow is "Validate > Execute > Order > Validate" considering all the nuances.
Well, if you made it this far, I thank you and congratulate you for reading this article.
Thanks. Stay tuned for Part 2 where we dive deep into Block Architecture and Ledger Storage.