DEV Community

CoinMonks

Posted on • Originally published at Medium on

Ethereum under the hood: Part 4 (The Trie)

Updates: Grammar fix

This episode assumes that you have gone through Part 1, Part 2, and Part 3 of this series; if you have not, please read them before continuing with this article. The title of this episode is “The Trie” and not “Patricia Merkle Trie” (PMT) because I feel it’s essential to understand Tries first instead of jumping right into PMT. Let’s dive right in:

We are going to talk about:

  1. Trie and its applications
  2. A more in-depth look into Tries
  3. How Ethereum relates to Trie
  4. A sample implementation
  5. What next?

Tip: I marked some words in bold to emphasize important points or get your attention.

Trie and its applications:

We start with Tries. There are numerous articles on the Trie, so feel free to look it up on Wikipedia. A Trie is a data structure meant for fast retrieval, and the name “trie” comes from the middle of the word “retrieval”. A real-life use case for a trie is the auto-complete or auto-correct feature in a chat session, though not exactly like the one below :-)

Source:AutocorrectEpicFail.com

Auto-fill and auto-correct need a quick lookup that returns multiple possibilities. Storing and retrieving values for a given prefix must be fast, and this is why a Trie is a useful data structure for these use cases.
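To make the prefix lookup concrete, here is a minimal Python sketch of an auto-complete built on a trie. The names `insert`, `complete`, and the `END` marker are my own choices for illustration, and nested dictionaries stand in for the nodes, so treat it as one possible shape rather than a reference implementation.

```python
END = "$"  # hypothetical marker meaning "a word ends here"


def insert(trie, word):
    """Add a word to the trie, one character per level."""
    node = trie
    for ch in word:
        node = node.setdefault(ch, {})
    node[END] = True


def complete(trie, prefix):
    """Return every stored word that starts with the given prefix."""
    node = trie
    for ch in prefix:
        if ch not in node:
            return []
        node = node[ch]

    # Depth-first walk below the prefix node, collecting complete words.
    results = []
    stack = [(node, prefix)]
    while stack:
        current, text = stack.pop()
        for key, child in current.items():
            if key == END:
                results.append(text)
            else:
                stack.append((child, text + key))
    return sorted(results)


trie = {}
for w in ["abba", "abs", "able", "dog"]:
    insert(trie, w)

print(complete(trie, "ab"))  # every stored word sharing the prefix "ab"
```

The cost of reaching the prefix node depends only on the length of the prefix, not on how many words are stored, which is what makes the structure attractive for suggestions-as-you-type.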

A more in-depth look into Tries:

Have a look at the picture below, which maps a given set of words into a simplified Trie data structure.

Basic Trie

In our simplified Trie, each of those circles can be considered a node with some properties. There are three kinds of nodes: 1) the Root node (green), 2) Character nodes (blue), and 3) Nil nodes (orange). Now let’s circle back and add some properties we are familiar with from the world of Ethereum.

Every node type (green, blue, orange) may have value(s) for any of the three fundamental properties:

  1. A unique id.
  2. An individual character and pointer(s) to child and parent node(s).
  3. State id(s).

The first two properties are self-evident, but property #3, the State Id, is an exciting and essential one. Think of it as a marker in time or, simply put, an ordering system that maintains a global state id. For example, State 0 is the initial state and State 1 is the next.

So let’s say the first word we want to add is “abba” and the second word is “abs”. The data structure might look something like this:

{ word = “abba”, state_id = 1 }

{ word = “abs”, state_id = 2 }

Who decides what goes first and what goes next is a story for another day; we will come to that later in the series, but hold on to that thought. With this in mind, we now have a simple data structure for our simple trie, something like the one below. We are going to send three parameters to our simplified trie function: the trie, the word, and the state id.

Basic Trie
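As a rough sketch of how the three properties could sit on a node, here is a small Python dataclass. The class name `Node` and its field names are assumptions of mine, not something taken from Ethereum itself, and the example wires up only a root, one character node, and one nil node by hand.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    node_id: int                     # property 1: unique id
    char: str                        # property 2: the character held by this node
    parent_id: Optional[int] = None  # property 2: pointer to the parent node
    children: Dict[str, int] = field(default_factory=dict)  # char -> child node_id
    state_ids: List[int] = field(default_factory=list)      # property 3: state id(s)


# The root ("/") has no parent; every other node points back to its parent.
root = Node(node_id=0, char="/")
a = Node(node_id=1, char="a", parent_id=root.node_id)
root.children["a"] = a.node_id

# A nil (terminator) node records the state in which a word was completed.
nil = Node(node_id=2, char="nil", parent_id=a.node_id, state_ids=[1])
a.children["nil"] = nil.node_id

print(root)
print(nil)
```

Storing ids instead of direct object references is a deliberate choice here: it keeps each node a flat record, which is closer to how a key-value store would hold it.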

Note: If you want to know more about tries, check out this excellent article.

How does Ethereum relate to Trie?:

If you recall RLP from the last two episodes, Ethereum uses it to encode and decode values. For example, the string “dog” is encoded as [83 64 6f 67] and “dig” as [83 64 69 67]. These encoded values need to be looked up quickly.

Note: “dog” converted to hex is 64 6f 67.
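To check the two encodings by hand, here is a minimal Python sketch that covers only the RLP rules for short byte strings (0 to 55 bytes). The function name `rlp_encode_short_string` is my own; lists and longer strings are left out on purpose.

```python
def rlp_encode_short_string(data: bytes) -> bytes:
    """RLP-encode a byte string of length 0 to 55 (short-string rules only)."""
    if len(data) == 1 and data[0] < 0x80:
        return data  # a single byte below 0x80 is its own encoding
    if len(data) <= 55:
        return bytes([0x80 + len(data)]) + data  # prefix = 0x80 + length
    raise ValueError("longer strings need the long-string rule, not covered here")


print(rlp_encode_short_string(b"dog").hex(" "))  # 83 64 6f 67
print(rlp_encode_short_string(b"dig").hex(" "))  # 83 64 69 67
```

Both words are three bytes long, so both get the same prefix byte 0x83, and the payloads differ only in the middle byte (6f versus 69).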

“dog” and “dig” depicted as our Simplified Trie

Looking at the trie above, we can determine the data types and how many values there are by reading its left side: the byte “83” is the RLP prefix for a three-byte string (0x80 plus the length 3), as laid out in the RLP encoding specification, and there are two such strings.

There are three primary operations applied to our Simplified Trie: 1) Add, 2) Delete, and 3) Search for a node. Check out the skeleton code for the Trie:
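Reading the type off the first byte can be sketched in a few lines of Python. The function name `classify_rlp_prefix` and the labels it returns are hypothetical; the byte ranges follow the RLP prefix rules, and the long forms are only labelled, not decoded.

```python
def classify_rlp_prefix(first_byte: int):
    """Return (kind, payload_length) based on the first byte of an RLP item."""
    if first_byte < 0x80:
        return ("byte", 1)                       # the byte is its own value
    if first_byte <= 0xB7:
        return ("string", first_byte - 0x80)     # short string
    if first_byte <= 0xBF:
        return ("long string", None)             # length follows in later bytes
    if first_byte <= 0xF7:
        return ("list", first_byte - 0xC0)       # short list
    return ("long list", None)                   # length follows in later bytes


encoded = [bytes.fromhex("83646f67"), bytes.fromhex("83646967")]  # "dog", "dig"
kinds = [classify_rlp_prefix(item[0]) for item in encoded]
print(kinds)  # [('string', 3), ('string', 3)]
```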

Trie Add function skeleton code:

function trie_add(trie, word, state_id) {

     // traverse through the trie
     // add any character node not already available in the trie
     // give each new child node its parent node's id
     // add a 'nil' node if applicable and map it to state_id

     return status;

}
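One way to fill in the add skeleton, sketched in Python. I am assuming the trie is a dictionary of node id to node record with the root “/” at id 0, and that the status is simply True or False; none of these choices come from Ethereum, they are only there to make the steps runnable.

```python
def trie_add(trie, word, state_id):
    """Add `word` to `trie` and tag its terminating nil node with `state_id`.

    Returns True if a new word was recorded, False if it was already present.
    """
    current = 0  # start at the root "/"
    for ch in word:
        children = trie[current]["children"]
        if ch not in children:
            new_id = len(trie)  # next unused id
            trie[new_id] = {"char": ch, "parent": current, "children": {}}
            children[ch] = new_id
        current = children[ch]

    children = trie[current]["children"]
    if "nil" in children:
        return False  # word already present; keep its original state_id
    new_id = len(trie)
    trie[new_id] = {"char": "nil", "parent": current,
                    "children": {}, "state_id": state_id}
    children["nil"] = new_id
    return True


trie = {0: {"char": "/", "parent": None, "children": {}}}
print(trie_add(trie, "abba", 1))  # True
print(trie_add(trie, "abs", 2))   # True
print(len(trie))                  # 8 nodes: root, a, b, b, a, nil, s, nil
```

Note how “abs” reuses the existing “a” and “b” nodes and only adds an “s” node plus its nil terminator.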

Trie search function skeleton code:

function trie_search(trie, word) {

     // start from the root "/"
     // retrieve all child nodes of the current node
     // check whether one of them matches the next character
     // repeat until the whole word is matched

     return status;

}
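The search steps can be sketched the same way. For brevity this Python version assumes the trie is a plain nested dictionary with a "nil" key holding the state id at the end of each word, and it returns that state id (or None) instead of a generic status; both are my own simplifications.

```python
def trie_search(trie, word):
    """Return the state_id stored for `word`, or None if the word is absent."""
    node = trie  # start from the root "/"
    for ch in word:
        if ch not in node:   # no child matches the next character
            return None
        node = node[ch]      # match found, move one level down
    return node.get("nil")   # only a nil node marks a complete word


# Hand-built trie holding "abba" (state 1) and "abs" (state 2).
trie = {"a": {"b": {"b": {"a": {"nil": 1}}, "s": {"nil": 2}}}}

print(trie_search(trie, "abba"))  # 1
print(trie_search(trie, "abs"))   # 2
print(trie_search(trie, "ab"))    # None: a prefix, but not a stored word
```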

Coming back to Ethereum: if you recall, we talked about the Ethereum World State, which is essentially a hashed mapping of key-value pairs (addresses to account states). Something like what you see below:

Source: Stack exchange

Imagine that those nil nodes, which act as terminators, each hold a value like “45”. I am oversimplifying here, but you get the picture. When we do a lookup on the Ethereum Global State σ for the key prefix “a77d”, we get the values {1.00 WEI, 0.12 ETH}. We can express this as follows:

[{ “a77d337”: “1.00 WEI” }, { “a77d397”: “0.12 ETH” }]
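The same lookup can be mimicked in Python with a plain dictionary standing in for the simplified state. The helper `lookup_prefix` is a hypothetical name, and the third key is made up purely to show that non-matching entries are filtered out; a real trie would reach the same answer by walking the prefix instead of scanning every key.

```python
# Simplified world state: key -> balance. The first two entries come from the
# example above; the third is a hypothetical key added only to show filtering.
world_state = {
    "a77d337": "1.00 WEI",
    "a77d397": "0.12 ETH",
    "b1c0ffe": "9.99 ETH",  # hypothetical, not from the article
}


def lookup_prefix(state, prefix):
    """Return every key-value pair whose key starts with `prefix`."""
    return {key: value for key, value in state.items() if key.startswith(prefix)}


matches = lookup_prefix(world_state, "a77d")
print(matches)
```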

The Yellow Paper defines this access as:

Section 4.1 of the Yellow Paper, where a is the address and b is the balance

A sample implementation:

Below is a code snippet of a Trie in Python (thanks to Vijay). I would encourage you to build a simple application on your own.

Source: GitHub: @coolcalbeans

Output, GitHub: @coolcalbeans

Let’s stop here and take a breather.

What’s Next?:

I would recommend going through the references to get a sense of how Tries work. In the next episode, we will talk about an upgraded version of the Trie called the Patricia Merkle Trie and how it relates to Ethereum. Till then, keep on learning.

References:

Source: https://www.youtube.com/watch?v=sAErv97lfIM&t=276s

