DEV Community

librehash


Monero Wallet Backdoor Attempt

This was originally composed and published circa April/May 2021 on the Librehash blog and shared widely across several social media platforms. It served as the entry point to a series of deep-dives and research into the Monero protocol and its shortcomings as a privacy coin (here, 'shortcomings' refers to vectors of compromise, and 'compromise' refers to transaction traceability).

URL of the Git repo that gets torn to pieces in this write-up: https://github.com/tevador/monero-seed

Before I go into the reasons why I strongly believe that this is a backdoor, I'm going to start off by pointing out a few things wrong with this wallet implementation.

I'm doing this to establish that I did not arrive at my conclusion about this user's potentially nefarious intentions by mistaking incompetence, or a lack of qualifications for the project they're attempting to undertake, for malice.

Also, since we're here, we might as well point out all of the facets of the wallet implementation that are inexcusably insecure:

  1. "Embedded wallet birthday to optimize restoring from the seed": this makes the wallet markedly more insecure than it needs to be, because it embeds unnecessary information and leaks data that should have been left private.

  2. "Advanced checksum based on Reed-Solomon linear code": this is another baffling break from convention, and in my view it is inferior to the usual choice, CRC32. As one description puts it: "Some file formats, particularly archive formats, include a checksum (most often CRC32) to detect corruption and truncation and can employ redundancy or parity files to recover portions of corrupted data." CRC32 is also typically used in "digital networks and storage devices to detect accidental changes to raw data". Accidental corruption is the most likely failure for users in this context, since an attacker gains nothing by corrupting the underlying data.
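As a rough illustration of the CRC32 behaviour described in point 2, here is a minimal Python sketch using the standard library's `zlib.crc32`. The helper name `crc32_hex` and the sample inputs are made up for this example; nothing here comes from the repo in question.

```python
import zlib


def crc32_hex(data: bytes) -> str:
    """Return the CRC32 of `data` as 8 lowercase hex characters."""
    return f"{zlib.crc32(data) & 0xFFFFFFFF:08x}"


original = b"123456789"
corrupted = b"123456780"  # last byte changed, simulating accidental corruption

print(crc32_hex(original))
print(crc32_hex(corrupted))
print("corruption detected:", crc32_hex(original) != crc32_hex(corrupted))
```

The checksum only tells you that something changed; it does not tell you what changed or how to repair it.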

Unexplained Departure From BIP32/39 Convention

The URL for the BIP39 construction: https://github.com/bitcoin/bips/blob/master/bip-0039.mediawiki

The link above has the original specification for BIP39 (mnemonics), which is the construction that this individual claims to be adhering to; evidence of this claim from the user's repo is below:

Image description

Incorrect Mnemonic Word Count Selection

I'm not sure if this individual thought the mnemonic phrase word count was an arbitrary choice, but it isn't.

Entropy is used to generate a binary string of 128 bits (the length depends on the user's specific implementation). After appending the checksum (4 additional bits), the words are generated from the binary string, with strict mappings to the BIP39 dictionary.

This is mandated by the BIP39 specification (as seen below):

Image description

They use 14 mnemonic words (which deviates entirely from the normal convention).

They state that the phrase contains "154 bits of data", which are used for "future use", "wallet birthday", "128-bits for the private key seed", and "11 bits for checksum".
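The bit counts are easy to sanity-check. The sketch below assumes 11 bits per word (a 2048-word list) and takes the 5 reserved bits from the repo text quoted later in this post; the wallet birthday width is not quoted anywhere here, so it is inferred by subtraction.

```python
BITS_PER_WORD = 11  # each word indexes a 2048-word list (2**11 == 2048)

# Standard 12-word BIP39 phrase: 128 bits of entropy + 4 checksum bits.
bip39_total = 128 + 4
bip39_words = bip39_total // BITS_PER_WORD

# The 14-word phrase described in the repo.
scheme_total = 14 * BITS_PER_WORD

# 5 reserved bits, 128 seed bits and 11 checksum bits are quoted in this post;
# the birthday width is an inference from the 154-bit total.
reserved, seed, checksum = 5, 128, 11
birthday = scheme_total - reserved - seed - checksum

print(bip39_words, scheme_total, birthday)
```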

This is extraordinarily incorrect. Normally a BIP32/39 key is derived by:

  • Generating 128 bits of entropy (for example)

  • Hashing the entire 128 bits of entropy with SHA-256, then extracting the first 4 bits of the digest and appending them to the end of the 128 bits of entropy (that's the checksum; this person didn't even specify how they were going to derive the checksum)

  • Those 132 bits are then divided into 12 equal groups of 11 bits (resulting in 12 different "words"). Every 32 bits of entropy contributes 3 words and 1 checksum bit; there are 4 groups of 32, which adds back up to 128 (can't forget the additional 4 bits for the checksum).
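The three steps above can be sketched in a few lines of Python. This is a minimal illustration rather than a wallet implementation: the function name `bip39_word_indices` is made up for this example, and it returns indices into the 2048-word list instead of loading the actual BIP39 dictionary.

```python
import hashlib


def bip39_word_indices(entropy: bytes) -> list:
    """Map BIP39 entropy to a list of 11-bit word indices (0..2047)."""
    ent_bits = len(entropy) * 8
    if ent_bits not in (128, 160, 192, 224, 256):
        raise ValueError("entropy must be 128, 160, 192, 224 or 256 bits")

    # Checksum: the first ENT/32 bits of SHA-256(entropy).
    cs_bits = ent_bits // 32
    checksum = hashlib.sha256(entropy).digest()[0] >> (8 - cs_bits)

    # Append the checksum bits to the end of the entropy.
    combined = (int.from_bytes(entropy, "big") << cs_bits) | checksum
    total_bits = ent_bits + cs_bits

    # Split into 11-bit groups, most significant group first.
    word_count = total_bits // 11
    return [
        (combined >> (total_bits - 11 * (i + 1))) & 0x7FF
        for i in range(word_count)
    ]


indices = bip39_word_indices(bytes(16))  # 128 bits of all-zero entropy
print(len(indices), indices)
```

Each index would then be looked up in the BIP39 word list to produce the 12-word phrase.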

Explanations For Decisions Make Zero Sense

Not only has this individual deviated significantly from the standard (which I had qualms with as is, but that's beside the point); under the "reserved bits" section, they also make statements that diverge from any logical cryptographic sense entirely.

I'm not sure why they're under the impression that bits can be "reserved" for some other purpose (what the fuck does that mean?). This entropy is part of the normal BIP39 construction, in which the UTF-8 NFKD-normalized mnemonic is piped into the PBKDF2-HMAC-SHA512 construction to derive a key.

There is no such thing as "reserving bits". Either bits are being used or they aren't.
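For reference, the BIP39 step mentioned above (NFKD-normalized mnemonic fed into PBKDF2-HMAC-SHA512) can be sketched with Python's standard library. BIP39 specifies 2048 iterations, a salt of "mnemonic" plus the optional passphrase, and a 64-byte output; the function name `bip39_seed` and the sample phrase are just for illustration.

```python
import hashlib
import unicodedata


def bip39_seed(mnemonic: str, passphrase: str = "") -> bytes:
    """Derive the 64-byte BIP39 seed from a mnemonic sentence."""
    # Both inputs are NFKD-normalized and UTF-8 encoded.
    password = unicodedata.normalize("NFKD", mnemonic).encode("utf-8")
    salt = ("mnemonic" + unicodedata.normalize("NFKD", passphrase)).encode("utf-8")
    return hashlib.pbkdf2_hmac("sha512", password, salt, 2048, dklen=64)


phrase = " ".join(["abandon"] * 11 + ["about"])
seed = bip39_seed(phrase)
print(len(seed), seed.hex()[:16])
```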

Statements Made About the 'Bits' Reflect a Fundamental Failure to Understand EdDSA

The subheading for this section may seem a bit harsh, but it's honestly an understatement at this point in time.

Also, what is meant by a "flag to differentiate between normal and 'short' address format"?

The public Monero address is a concatenation of the public spend key + public view key. They're both derived as two different valid points on the Edwards' Curve.

The publication, 'Zero to Monero' corroborates this as well:

Image description

There are two different sets of coordinates used because ed25519 keys can be used for both encryption and signing (unlike ecdsa / secp256k1). Monero takes advantage of this fact by granting each user a public key that others can encrypt to (the public view key); the public spend key is how they derive a subaddress based on what they know. The mission-critical nature of these addresses can't be overstated, since this all ties into the construction that Monero uses to protect against double-spend attempts.

So I'm puzzled as to why this user proposed having the "view key equal to the spend key"; not only does this make zero fucking sense, it would also make Monero less secure by orders of magnitude.

In fact, at a glance, this would entirely erode nearly all of the privacy on the protocol.

Where This User's Actions Make Me Believe They Are Setting Up a Backdoor

If you scroll down that GitHub page link that I provided, you'll see a header that says, 'Private key seed'.

This is the first spot where I observed that this user had reduced the security of the key derivation function (for no apparent reason).

See below:

Image description

Specifically they state:

"The private key is derived from the 128-bit seed using PBKDF2-HMAC-SHA256 using 4096 iterations. The wallet birthday and the 5 reserved/feature bits are used as a salt. 128-bit seed provides the same level of security as the elliptic curve used by Monero"

Ah! No! All of this is wrong & almost maliciously so.

Let's start from the top though.

This User Weakened the Key Stretching Function From BIP39

There is no conceivable reason for them taking this action (and this is actually something that would uniquely compromise users on the chain, and I'll explain how).

First, let me establish that this individual did indeed weaken the strength of this key stretching function (PBKDF2-HMAC construction).

Below is a specification from BIP39 (Bitcoin):

Image description
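To make the comparison concrete, here is a rough Python sketch of the derivation as the quoted repo text describes it: PBKDF2-HMAC-SHA256, 4096 iterations, a 128-bit seed as input, and the wallet birthday plus reserved bits as the salt. The quoted text does not say how the salt is serialized or how long the output is, so the salt encoding and the 32-byte output below are assumptions for illustration only, as is the function name `quoted_scheme_key`.

```python
import hashlib


def quoted_scheme_key(seed: bytes, birthday: int, reserved: int) -> bytes:
    """PBKDF2-HMAC-SHA256 with 4096 iterations, as quoted from the repo."""
    if len(seed) != 16:
        raise ValueError("seed must be 128 bits (16 bytes)")
    # ASSUMPTION: hypothetical salt layout, 2 bytes of birthday + 1 byte of
    # reserved bits. The real encoding is not given in the quoted text.
    salt = birthday.to_bytes(2, "big") + reserved.to_bytes(1, "big")
    # ASSUMPTION: a 32-byte output, matching the SHA-256 digest size.
    return hashlib.pbkdf2_hmac("sha256", seed, salt, 4096, dklen=32)


key = quoted_scheme_key(bytes(16), birthday=100, reserved=0)
print(len(key), hashlib.sha256().digest_size, hashlib.sha512().digest_size)
```

BIP39, by contrast, uses the SHA-512 variant with 2048 iterations and produces a 64-byte seed.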

Going further, if we look at the key length / bit / strength information provided by a matrix table from Wikipedia (these values were cross-referenced with the NIST specifications; this can be done independently by anyone reading along as well):

Image description

Please keep in mind that security in the sense of cryptography refers to the strength in 'n' bits (hence the security against collision attacks).

Therefore, the difference between 256-bit strength and 128-bit strength is a hell of a lot more than 2x; even "several hundred million times" would be low-balling it hard as fuck, because 2^256 / 2^128 = 2^128, which is roughly 3.4 x 10^38.
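Python's arbitrary-precision integers make the size of that gap easy to check. This only computes the ratio between the two search spaces; the variable names are arbitrary.

```python
space_128 = 2 ** 128
space_256 = 2 ** 256

# How many times larger the 256-bit space is than the 128-bit space.
ratio = space_256 // space_128

print(ratio == 2 ** 128)
print(f"{ratio:.3e}")
```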

Weird Nuance in the HMAC Convention Would Cause Collisions With Their Proposal

Since they elected to go with PBKDF2-HMAC-SHA256 (vs. the SHA512 variant), we need to take care to consider the length of the input being piped into this hash function.

Specifically, according to RFC2104:

"Keys longer than B bytes are first hashed using H" [source = https://tools.ietf.org/html/rfc2104]

This notably creates a seeming collision: when the input is longer than the hash's block size (64 bytes for SHA-256), the input and its SHA-256 digest produce the same HMAC (in essence), and therefore the same PBKDF2 output; this essentially nulls the purpose of the HMAC in the first place.

This is detailed in this post here = https://mathiasbynens.be/notes/pbkdf2-hmac (the author provides Node.js and Python code that readers can run on their own machines if they want to; very benign article)
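The long-key behaviour is straightforward to reproduce with Python's standard library. This is a minimal sketch in the spirit of the linked article rather than a copy of its code; the password, salt and iteration count are arbitrary example values.

```python
import hashlib

BLOCK_SIZE = hashlib.sha256().block_size  # 64 bytes for SHA-256

long_password = b"x" * (BLOCK_SIZE + 1)   # one byte longer than the block size
hashed_password = hashlib.sha256(long_password).digest()
salt = b"example-salt"

# HMAC replaces an over-long key with its hash, so these two inputs collide.
from_long = hashlib.pbkdf2_hmac("sha256", long_password, salt, 4096)
from_hash = hashlib.pbkdf2_hmac("sha256", hashed_password, salt, 4096)

print(from_long == from_hash)
```

Inputs at or below the block size are not pre-hashed, so this particular collision only applies to inputs longer than 64 bytes when SHA-256 is the underlying hash.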

It Appears That This User Neutered the HMAC Portion Entirely

This is a real problem at this point. And the omission of HMAC in the scheme this user is designing cannot be chalked up to ignorance or "not knowing".

Again, as shown above in the prior section, this user's reference to BIP39 shows that they have had exposure to the specification.

So there's no conceivable reason for removing the HMAC (key stretching) function from this scheme they're crafting.

Individual's Claim About the Strength of the Entropy vs. Strength of Curve Are 100% False

Under the same section as the dubious private key entry, the user states:

"128-bit seed provides the same level of security as the elliptic curve used by Monero."

  1. Monero doesn't use a conventional (Weierstrass-form) elliptic curve; it uses Ed25519, a twisted Edwards curve
  2. No it fucking doesn't

Entropy is not "security", nor is it ever factored into the bit-strength of the "elliptic curve" or the 'Edwards Curve' in this case.

These curves are geometric functions that depend on the assumed hardness of the discrete logarithm problem as their security assurance.

This is well known information...

Downgrading of Argon2 to PBKDF2

This one is inexcusable in any universe.

The commit was made on June 14th, 2020:

Image description

https://github.com/tevador/monero-seed/commit/f1c7829f043322849c0323f58aa0a46352de4e02

Visiting the commit directly, we can see this individual adding PBKDF2 to the project while simultaneously removing Argon2.

Image description

Image description

Image description

Never mind the fact that this is oddly coded in 'C', which is a curious language choice for a simple library (there are many much lighter-weight and easier ways to implement this).

One particular qualm about 'C' is that it's not memory safe, which means that we aren't afforded protection from buffer overflows (which are more than realistic, and perhaps plausible, considering the input block length).

Explaining Why the Argon2 Swap Was Ludicrous

If you were to ask any security professional on planet earth what their opinion was on the best hash algorithm (for ensuring your information remains behind an impenetrable fortress), the answer would more than likely be Argon2.

For those familiar with mining, you may remember that hash algorithm; after all, it's used in Monero.

Password Hashing Competition

Recently, there was an international competition that sought to find the latest and greatest in the world as it pertains to KDFs (key-derivation functions).

While said competition may seem a bit preposterous, it did indeed exist. And it was hosted, followed and adjudicated

Image description

source: https://www.password-hashing.net/

No Conceivable Reason For the Replacement of Argon2

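Argon2 is far stronger than PBKDF2: it is memory-hard, so each guess costs an attacker a configurable amount of RAM as well as CPU time, whereas PBKDF2 only costs CPU time.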

Multiple experts in the field of cryptography and elsewhere have vouched for it as being the strongest password-hashing algorithm out there.

And we just so happen to find ourselves in a situation where we're deriving a key that we need to keep secret.

Which means that there would be no better KDF to use.
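Python's standard library does not ship Argon2, so the sketch below uses `hashlib.scrypt`, a different memory-hard KDF, purely as a stand-in to show what "memory-hard" means next to PBKDF2. It is not Argon2 and not what the repo used; the password, salt and cost parameters are arbitrary example values.

```python
import hashlib

password = b"correct horse battery staple"
salt = b"example-salt-123"

# PBKDF2: cost is iterations only (CPU time), with negligible memory use.
cpu_only = hashlib.pbkdf2_hmac("sha256", password, salt, 4096, dklen=32)

# scrypt (stand-in for a memory-hard KDF): cost includes memory.
n, r, p = 2 ** 14, 8, 1
memory_hard = hashlib.scrypt(
    password, salt=salt, n=n, r=r, p=p,
    maxmem=64 * 1024 * 1024, dklen=32,
)

# scrypt needs roughly 128 * n * r bytes of working memory per evaluation.
scrypt_memory_bytes = 128 * n * r
print(len(cpu_only), len(memory_hard), scrypt_memory_bytes // (1024 * 1024), "MiB")
```

The point of the comparison is that every guess against the memory-hard function ties up megabytes of RAM, which makes massively parallel cracking hardware far more expensive to build.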

Example of How to Derive a Monero Address From an Argon2 KDF

There's live (open source) code on the internet that can be audited and/or compiled to test the veracity of this construction.

Any reader visiting the link will be taken to a cool module built from the WarpWallet principle utilized by Keybase.

Here's the URL = https://patcito.github.io/mindwallet

Image description

Image description

The address generation process is deterministic (just like ed25519), so that's good for this scheme. The entire code runs client side and is available for users to download at their leisure, and they can use Golang or Python as their preferred language to interact with it.

Reed-Solomon Code Weakens the Mnemonic Selection

See below:

Image description

This should be wholly impossible if the key is constructed properly.

Ian Coleman's Library

Ian is one of the most prolific PoC composers this space will ever know.

And lucky for us, he has one for BIP39 as well, which should allow us to get a general gist of the security of this assumption.

Below is a screen of the site:

Image description

source: https://iancoleman.io/bip39/

Notably, Ian's specifications for the mnemonic word options mirror the actual implementation too:

Image description

If we utilize the 'BIP39 Split Mnemonic' feature, we'll see a concise estimate of how long it would take to crack a wallet when some of the split mnemonic pieces ("cards") have been exposed.

Image description
