Afri

Finally Understanding Ethereum Accounts

Ethereum is a public blockchain network that can be accessed with various types of accounts. Similar to Bitcoin, the underlying cryptography uses the SECP256K1 elliptic curve. But what does this mean? What is an account? What is a key? What is an address, and why is it checksummed? A crash course in Ruby and Crystal (using the eth gem and the secp256k1 shard).

Private-Public Keypairs

An Ethereum account is a SECP256K1 keypair. "SECP256K1" is just the name of the specific elliptic curve we are using. The name or specification of that curve is not essential to understanding how keypair-cryptography works, and there are many more curves with many different names and parameters.

A keypair contains a private and a public part. The private part, also called the secret or private key, gives you access to the account. A private key is simply a number: 1 is a private key, 137 is a private key, and 29863245 is another. You should not use any of them, as they are easily guessed, but as you can see, there is no magic behind a key at all.

# A private key is really just a random number.
# To generate a random private key that nobody 
# can guess in Ruby, get 32 random bytes.
require "securerandom"
secret = SecureRandom.hex 32
# => "bec52dffb33ec1f4d629f88232796e898f4294079c5894a6645e8a4f7261fabe"

Understanding the public part, or public key, is more involved. It is a point on the elliptic curve with an x- and a y-coordinate; think (0, 1), (42, 138), or (34876, 4893). None of these are public keys on our curve, though. But you get the idea.

To derive the public point from the private number, you perform an elliptic curve scalar multiplication of the curve's base point G by the scalar (your private key). The result is another point, your public key.

The asymmetry of the Ethereum account cryptography lies in the ability to prove with the private number that you control the public point. In contrast, knowledge of the public key never allows anyone to recover the private key.

Therefore, keeping the private key secret gives only you and nobody else control over the public key of the Ethereum account.
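The scalar multiplication described above can be sketched in plain Ruby without any gems. This is a minimal educational sketch, not what the eth gem does internally: the method names (`mod_inverse`, `point_add`, `point_multiply`) are made up for illustration, the curve constants are the published SECP256K1 parameters rather than anything from this article, and the loop is not constant-time, so it should not be used to handle real keys.

```ruby
# Published SECP256K1 domain parameters (curve: y^2 = x^3 + 7 over a prime field).
FIELD_PRIME = 2**256 - 2**32 - 977
GROUP_ORDER = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
BASE_POINT = [
  0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
  0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8
].freeze

# Modular inverse via Fermat's little theorem (FIELD_PRIME is prime).
def mod_inverse(value)
  (value % FIELD_PRIME).pow(FIELD_PRIME - 2, FIELD_PRIME)
end

# Add two curve points; nil stands for the point at infinity.
def point_add(a, b)
  return b if a.nil?
  return a if b.nil?
  x1, y1 = a
  x2, y2 = b
  if x1 == x2
    # A point plus its mirror image is the point at infinity.
    return nil if (y1 + y2) % FIELD_PRIME == 0
    # Doubling: slope of the tangent line.
    slope = 3 * x1 * x1 * mod_inverse(2 * y1) % FIELD_PRIME
  else
    # Addition: slope of the line through both points.
    slope = (y2 - y1) * mod_inverse(x2 - x1) % FIELD_PRIME
  end
  x3 = (slope * slope - x1 - x2) % FIELD_PRIME
  y3 = (slope * (x1 - x3) - y1) % FIELD_PRIME
  [x3, y3]
end

# Double-and-add: returns scalar * point, walking the scalar bit by bit.
def point_multiply(scalar, point = BASE_POINT)
  result = nil
  addend = point
  while scalar > 0
    result = point_add(result, addend) if scalar.odd?
    addend = point_add(addend, addend)
    scalar >>= 1
  end
  result
end

# The (insecure) private key 137 from above maps to this public point.
x, y = point_multiply(137)
puts x.to_s(16).rjust(64, "0")
puts y.to_s(16).rjust(64, "0")
```

Going from 137 to the point takes a few hundred additions; going from the point back to 137 has no comparable shortcut, which is the asymmetry the text refers to.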

# Generate a secure random private-public
# keypair using the `eth` gem.
require "eth"
require "securerandom"
secret = SecureRandom.hex 32
# => "bec52dffb33ec1f4d629f88232796e898f4294079c5894a6645e8a4f7261fabe"
key = Eth::Key.new priv: secret
# => #<Eth::Key:0x000055ae60f86d58
#  @private_key=
#   #<Secp256k1::PrivateKey:0x000055ae60f86a38
#    @data=
#     "\xBE\xC5-\xFF\xB3>\xC1\xF4\xD6)\xF8\x822yn\x89\x8FB\x94\a\x9CX\x94\xA6d^\x8AOra\xFA\xBE">,
# @public_key=#<Secp256k1::PublicKey:0x000055ae60f869e8>>

# The private key is just a number.
key.private_key
# => #<Secp256k1::PrivateKey:0x000055ae60e81458
#  @data=
#   "\xBE\xC5-\xFF\xB3>\xC1\xF4\xD6)\xF8\x822yn\x89\x8FB\x94\a\x9CX\x94\xA6d^\x8AOra\xFA\xBE">
key.private_hex
# => "bec52dffb33ec1f4d629f88232796e898f4294079c5894a6645e8a4f7261fabe"

# The public key is a ... point?
key.public_key
# => #<Secp256k1::PublicKey:0x000055ae60e813e0>
key.public_hex
# => "040f9802cc197adf104916a6f94f6c93374647db7a3b774586ede221f1eea92b11e02a4be750aa0fe9cf975cec1b69a222841648d4c2ced7b1d108a2c9723e89b8"

Now, we used the eth gem to generate a keypair containing a private and a public key, and we can easily see that bec52dffb33ec1f4d629f88232796e898f4294079c5894a6645e8a4f7261fabe is a number. It's the hexadecimal representation of the decimal number 86287827574830678407859947509786169732412250582090939460672560997304142789310. As I said, no magic. Notably, we choose such a huge number to prevent anyone from guessing our key.
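To make the "it's just a number" point concrete, here is a small sketch that converts the hex string to an integer and back. The helper names are hypothetical, and the range check relies on a fact not stated in this article: a usable SECP256K1 private key is any integer from 1 up to the curve's group order minus one.

```ruby
# The private key from the example above, as a hex string.
PRIVATE_HEX = "bec52dffb33ec1f4d629f88232796e898f4294079c5894a6645e8a4f7261fabe"

# Published order of the SECP256K1 group (an assumption brought in from the
# curve specification, not from this article).
CURVE_ORDER = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141

# Interpret a hex string as a plain integer.
def hex_to_number(hex)
  hex.to_i(16)
end

# Turn the integer back into 64 zero-padded hex digits (32 bytes).
def number_to_hex(number)
  number.to_s(16).rjust(64, "0")
end

# A usable private key lies between 1 and CURVE_ORDER - 1.
def valid_private_key?(number)
  number.between?(1, CURVE_ORDER - 1)
end

number = hex_to_number(PRIVATE_HEX)
puts number                                # the decimal form of the key
puts number.bit_length                     # => 256
puts number_to_hex(number) == PRIVATE_HEX  # => true
puts valid_private_key?(number)            # => true
puts valid_private_key?(137)               # => true (valid, but trivially guessable)
puts valid_private_key?(0)                 # => false
```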

But what is the public key? It does not look like a point, right? That's because 040f9802cc197adf104916a6f94f6c93374647db7a3b774586ede221f1eea92b11e02a4be750aa0fe9cf975cec1b69a222841648d4c2ced7b1d108a2c9723e89b8 is a serialization of three fields: a prefix byte indicating the type of public key (04), the x-coordinate of the point (0f9802cc197adf104916a6f94f6c93374647db7a3b774586ede221f1eea92b11), and the y-coordinate (e02a4be750aa0fe9cf975cec1b69a222841648d4c2ced7b1d108a2c9723e89b8).

That gives us the point (7053272788600477553676465022741516421197397404297740301327606102773230807825, 101392809526590995390445177899351250341338058145722255694397773992111072250296) on the elliptic curve. The prefix byte is not really used by Ethereum and only relevant for (un-)compressing keys in Bitcoin, so let's ignore it for the time being and assume it's always 04.
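Splitting the serialized key into its three fields takes only a few lines. The following is a minimal sketch; `parse_public_key` is a hypothetical helper written for this example, not part of the eth gem.

```ruby
# The serialized public key from the example above.
PUBLIC_HEX = "040f9802cc197adf104916a6f94f6c93374647db7a3b774586ede221f1eea92b11" \
             "e02a4be750aa0fe9cf975cec1b69a222841648d4c2ced7b1d108a2c9723e89b8"

# Split an uncompressed public key (65 bytes, hex-encoded) into its fields:
# one prefix byte, 32 bytes for x, and 32 bytes for y.
def parse_public_key(hex)
  raise ArgumentError, "expected 130 hex characters" unless hex.match?(/\A\h{130}\z/)
  prefix = hex[0, 2]
  raise ArgumentError, "not an uncompressed key" unless prefix == "04"
  { prefix: prefix, x: hex[2, 64].to_i(16), y: hex[66, 64].to_i(16) }
end

fields = parse_public_key(PUBLIC_HEX)
puts fields[:prefix]  # => 04
puts fields[:x]       # x-coordinate as a decimal integer
puts fields[:y]       # y-coordinate as a decimal integer
```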

Now that we have a secret number 86287827574830678407859947509786169732412250582090939460672560997304142789310 that maps to the public point (7053272788600477553676465022741516421197397404297740301327606102773230807825, 101392809526590995390445177899351250341338058145722255694397773992111072250296), what's next? What can we do with this?

Addresses and Checksums

To recap, an account is a secret number giving access to a public point on an elliptic curve, easy as that! But now, we want to use some Ethereum blockchain functionality, such as transferring tokens or interacting with decentralized exchanges.

To receive Ether or any other asset on the Ethereum blockchain, you don't want to tell your friends or family every time to send it to the coordinates (7053272788600477553676465022741516421197397404297740301327606102773230807825, 101392809526590995390445177899351250341338058145722255694397773992111072250296), which point to the spot on the curve that only you have access to. That would be slightly inconvenient.

Instead, we use addresses.

# The address of the previously 
# generated keypair.
key.address
# => #<Eth::Address:0x000055ae60fb8178
#  @address="0xc16fd2b4d06bcc9407b4b000b3085832f180f557">
key.address.to_s
# => "0xc16Fd2B4d06BCc9407b4B000b3085832F180F557"

An address is directly derived from your public key: you remove the prefix byte, do one round of Keccak-256 hashing, and take the last 20 bytes. That's it, that's the address, the last 20 bytes of the hash in hexadecimal notation: 0xc16Fd2B4d06BCc9407b4B000b3085832F180F557.

require "digest/keccak"

# Pack the public key nicely into a byte string.
public_key = [key.public_hex].pack "H*"
# => "\x04\x0F\x98\x02\xCC\x19z\xDF\x10I\x16\xA6\xF9Ol\x937FG\xDBz;wE\x86\xED\xE2!\xF1\xEE\xA9+\x11\xE0*K\xE7P\xAA\x0F\xE9\xCF\x97\\\xEC\ei\xA2\"\x84\x16H\xD4\xC2\xCE\xD7\xB1\xD1\b\xA2\xC9r>\x89\xB8"

# Cut off the first prefix byte.
public_coordinates = public_key[1..-1]
# => "\x0F\x98\x02\xCC\x19z\xDF\x10I\x16\xA6\xF9Ol\x937FG\xDBz;wE\x86\xED\xE2!\xF1\xEE\xA9+\x11\xE0*K\xE7P\xAA\x0F\xE9\xCF\x97\\\xEC\ei\xA2\"\x84\x16H\xD4\xC2\xCE\xD7\xB1\xD1\b\xA2\xC9r>\x89\xB8"

# Hash the coordinate bytes of the public point.
address_hash = Digest::Keccak.new(256).digest public_coordinates
# => "_s\xB6RA\xD0r\xF8x\xE2\xD1\xC2\xC1o\xD2\xB4\xD0k\xCC\x94\a\xB4\xB0\x00\xB3\bX2\xF1\x80\xF5W"

# Only grab the last 20 bytes.
address_bin = address_hash[-20..-1]
# => "\xC1o\xD2\xB4\xD0k\xCC\x94\a\xB4\xB0\x00\xB3\bX2\xF1\x80\xF5W"

# Unpack the address and prefix it with `0x`.
address = "0x#{address_bin.unpack("H*").first}"
# => "0xc16fd2b4d06bcc9407b4b000b3085832f180f557"

Note that by hashing the public key and cutting off the first 12 bytes of the hash, there is no way to go back and recover the public key from an address. The address is simply a placeholder on the blockchain for an account.

Only if you hold a private key that maps to a public point that hashes to this exact address will the blockchain grant you access to any asset stored under that placeholder on the ledger. How you prove this cryptographically is beyond the scope of this article; we might revisit the topic of signatures and transactions later.

One last little detail is left to investigate: Did you notice something unusual about the address? Right, key.address.to_s returned 0xc16Fd2B4d06BCc9407b4B000b3085832F180F557, which contains mixed-case letters. It's not just a 20-byte hexadecimal string; it also seems to feature random upper- and lower-case letters. Why is that?

This variance in the casing is the address checksum as per EIP-55: Mixed-case Checksum Address Encoding. To prevent typos or other mistakes while copying, pasting, or transmitting addresses, the hex-string has a checksum encoded on top of the address!

require "digest/keccak"

# Remove the address' hex-prefix.
unprefixed_address = address[2..-1]
# => "c16fd2b4d06bcc9407b4b000b3085832f180f557"

# Get the Keccak-256 hash of the 
# unprefixed address.
checksum = Digest::Keccak.new(256).digest(unprefixed_address.downcase).unpack("H*").first
# => "4bd92ec1770ff46b882ff0297df0ab4ee199a0b1947d3a378089e7127ca58d60"

# Map the checksum to address chars and 
# determine capitalization.
checksummed_chars = unprefixed_address.chars.zip(checksum.chars).map do |addr, chck|
  chck.match(/[0-7]/) ? addr.downcase : addr.upcase
end
# => ["c", "1", "6", "F", "d", "2", "B", "4", "d", "0", "6", "B", "C", "c", "9", "4", "0", "7", "b", "4", "B", "0", "0", "0", "b", "3", "0", "8", "5", "8", "3", "2", "F", "1", "8", "0", "F", "5", "5", "7"]

# Et voilà, the checksummed address.
checksummed_address = "0x#{checksummed_chars.join}"
# => "0xc16Fd2B4d06BCc9407b4B000b3085832F180F557"

The checksum algorithm hashes the address again with one round of Keccak-256 and treats this hash as the checksum. Then, for each character in the address, we check if the corresponding hex digit in the checksum is either 0..7 or 8..f. If it's less than 8, we encode a lower-case letter; otherwise, we use an upper-case letter. That's it.
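The same rule also works in the other direction, to verify an address someone handed you. The sketch below is an illustration under one loud assumption: to stay free of gems, the Keccak-256 hash of the lower-case address is passed in as a precomputed hex string (the value shown in the example above) instead of being computed. In practice you would recompute the hash from the lower-cased address, so a mistyped digit changes the hash as well. The method names are hypothetical.

```ruby
# Apply EIP-55 casing to an unprefixed hex address, given the hex-encoded
# Keccak-256 hash of its lower-case form (assumed to be computed elsewhere).
def apply_checksum(unprefixed_address, hash_hex)
  unprefixed_address.downcase.chars.zip(hash_hex.chars).map do |char, nibble|
    # Hash digits 8..f select upper case; digits 0..7 keep lower case.
    nibble.to_i(16) >= 8 ? char.upcase : char
  end.join
end

# An address passes if re-applying the casing reproduces it exactly.
def checksum_valid?(address, hash_hex)
  unprefixed = address.delete_prefix("0x")
  apply_checksum(unprefixed, hash_hex) == unprefixed
end

# Keccak-256 of "c16fd2b4d06bcc9407b4b000b3085832f180f557", taken from above.
ADDRESS_HASH = "4bd92ec1770ff46b882ff0297df0ab4ee199a0b1947d3a378089e7127ca58d60"

puts checksum_valid?("0xc16Fd2B4d06BCc9407b4B000b3085832F180F557", ADDRESS_HASH)
# => true
puts checksum_valid?("0xc16fd2B4d06BCc9407b4B000b3085832F180F557", ADDRESS_HASH)
# => false (the fourth character lost its upper case)
```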

Smart-Contract Accounts

We understand that an address is just a placeholder on the blockchain for storing tokens or any other arbitrary asset that can be unlocked only by a particular private number.

But who controls the accounts of smart contracts? What is a smart-contract account in the first place?

A smart contract is, plainly speaking, just executable code. This code gets deployed to a smart-contract account, an address, a placeholder on the blockchain, where the private key is unknown.

This is intentional. If the secret giving access to a smart-contract account were known, you could never guarantee that the contract cannot be tampered with.

To determine a smart-contract address, you need two things: the address of the sending account and that account's nonce at the time the contract is deployed.

require "eth"
require "digest/keccak"

# Our sender address, an externally 
# owned account.
sender = "0xc16Fd2B4d06BCc9407b4B000b3085832F180F557"
# => "0xc16Fd2B4d06BCc9407b4B000b3085832F180F557"

# Our sender's nonce is `0` because we never 
# used this account before.
nonce = 0
# => 0

# Convert the sender to its raw 20 bytes; the
# address is RLP-encoded as bytes, not as text.
sender_bin = [sender[2..-1]].pack "H*"
# => "\xC1o\xD2\xB4\xD0k\xCC\x94\a\xB4\xB0\x00\xB3\bX2\xF1\x80\xF5W"

# RLP-encode both sender and nonce.
encoded = Eth::Rlp.encode [sender_bin, nonce]
# => "\xD6\x94\xC1o\xD2\xB4\xD0k\xCC\x94\a\xB4\xB0\x00\xB3\bX2\xF1\x80\xF5W\x80"

# Apply one round of Keccak-256 hashing.
hashed = Digest::Keccak.new(256).digest encoded

# Again, only get the last 20 bytes 
# of the hash.
address = hashed[-20..-1].unpack("H*").first

# And that's the address, with its
# EIP-55 checksum applied.
address = Eth::Address.new("0x#{address}").to_s

The sender address, as its raw 20 bytes, and the nonce are put into an array and RLP-encoded. RLP is Ethereum's Recursive Length Prefix serializer. The encoded RLP is then again hashed with Keccak-256, and the last 20 bytes of the hash are the expected contract account address.

Note that because we don't hash a public key but instead an RLP-encoded object, nobody chose a private key for the resulting smart-contract account, and finding one that happens to match the address is practically impossible.
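To see what the RLP step produces, here is a minimal pure-Ruby sketch of an RLP encoder for byte strings, non-negative integers, and nested arrays. It is an illustration, not the implementation in the eth gem, and the method names are hypothetical. Note that the sender goes in as its raw 20 bytes, which is what the contract-address rule expects.

```ruby
# Big-endian bytes of a non-negative integer without leading zeros.
# Zero becomes the empty byte string, as RLP requires.
def rlp_integer_bytes(number)
  bytes = []
  while number > 0
    bytes.unshift(number & 0xff)
    number >>= 8
  end
  bytes.pack("C*")
end

# Length prefix: payloads under 56 bytes get a single prefix byte,
# longer ones get a prefix byte followed by the length itself.
def rlp_prefix(length, offset)
  return [offset + length].pack("C") if length < 56
  length_bytes = rlp_integer_bytes(length)
  [offset + 55 + length_bytes.bytesize].pack("C") + length_bytes
end

# Recursively encode byte strings, integers, and arrays of those.
def rlp_encode(item)
  case item
  when Array
    payload = item.map { |element| rlp_encode(element) }.join.b
    rlp_prefix(payload.bytesize, 0xc0) + payload
  when Integer
    rlp_encode(rlp_integer_bytes(item))
  when String
    data = item.b
    # A single byte below 0x80 is its own encoding.
    return data if data.bytesize == 1 && data.getbyte(0) < 0x80
    rlp_prefix(data.bytesize, 0x80) + data
  else
    raise ArgumentError, "cannot encode #{item.class}"
  end
end

# Sender as raw 20 bytes plus nonce 0, as in the deployment example.
sender_bytes = ["c16fd2b4d06bcc9407b4b000b3085832f180f557"].pack("H*")
encoded = rlp_encode([sender_bytes, 0])
puts encoded.unpack1("H*")
# => d694c16fd2b4d06bcc9407b4b000b3085832f180f55780
```

In the output, `d6` announces a list with a 22-byte payload, `94` a 20-byte string (the address), and the trailing `80` is the empty string that represents nonce 0.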

What did I just read?

To recap, an Ethereum account is, just like many other cryptographic accounts (OTR, PGP, SSH), a private-public keypair:

  1. the private key is just a huge number, with 32 bytes of entropy.
  2. the public key is a corresponding point with x- and y-coordinates on the underlying elliptic curve (here: SECP256K1, same as Bitcoin uses).
  3. an Ethereum address is a truncated Keccak-256 hash of the serialized public key.
  4. the Ethereum address checksum is encoded by mixed-case letters in the address (EIP-55).
  5. no one owns smart-contract accounts, and their private keys are unknown on purpose.

Let me know if this makes sense in the comments below!

Further Resources

  • q9f/eth.rb: A Ruby library to build, sign, and broadcast Ethereum transactions.
  • q9f/keccak.rb: Ruby bindings for the Keccak hash (non-final SHA-3) used by Ethereum.
  • q9f/secp256k1.cr: A readable, educational implementation of Secp256k1 written purely in the Crystal language.

Top comments (4)

Luke Schoen

have any Ethereum forks or other blockchain got away with doing anything sneaky to the source code for their accounts, such as omitting the RLP-encoding step?

btw typo: "Lenght" -> "Length"

Piotr Brudny

Great article. Code samples are really helpful. Thanks

Kartikey Tanna • Edited

Hey @q9
How did you learn this? ETH docs? Nice article. Thanks

Nihar Bhagat

Masterful article! Knowledge++