Dave Cridland

Posted on Jan 18, 2018

Crypto Show And Tell: The Wonders of Diffie-Hellman-Merkle

#security

I know next to nothing about cryptography. I use cryptography a lot, and I'm fairly confident in knowing the properties of RSA and so on, and I've even designed authentication mechanisms. But how these things actually work? Yeah, I'm clueless.

So I embarked on an interesting journey of discovery. And I'd like you to join in. Today's lesson is in Diffie-Hellman (sometimes, and more properly, known as Diffie-Hellman-Merkle). This will be strewn with errors, because - as we've established - I have no real clue about any of this.

Step One: Think of a Number

Diffie-Hellman is a key agreement protocol - a way of getting two sides to agree on a particular value over an insecure channel without anyone else figuring out what it is. One you've done that, you can use the key to encrypt messages with AES, or whatever.

It's built around a Commutative Group, which is apparently a Thing in maths, consisting of a set where any two members of the set can be combined using some operation and result in another member of the set. In Diffie-Hellman, you can in principle use any Commutative Group that has an Operation that's hard to reverse.

Now, rather than describe this all with maths notation, which would be a nightmare to type into Markdown, I'm going to use Python. Python helpfully supports "big" numbers, and so this code will do production-strength DH - but you really don't want to try it at that level, so we're going to use much smaller numbers.

For classic DH, we first need some constants. These are the Diffie Hellman Parameters. We need a Generator (which is always a really small number, like 2 or 5) and a Prime (which is, well, a prime number - this should be really big, like 2048 or 4096 bits). These numbers needn't be kept secret, by the way - in fact they're normally shared on the wire in the first step.


G = 2  # This is fine.

P = 13  # Should be MUCH bigger.

Good start. Now, we need a way of getting a random number within the set. You'll get a pretty good idea of the set definition from this:

def gen_key():
    '''Generate a random integral number from 1 to P-1 inclusive.'''
    global P
    import random
    return random.randrange(1, P)

The fact we've called this gen_key might give you a hint - we use this to generate a DH private key.

Now, we just need the operation. This is really simple.

def op(a):
   global P
   global G
   return (G ** a) % P

Well, that's awesome. Raise G to the power of a, and then modulo the result by P (ie, divide by P and keep the remainder). Now you can make a private key with Ka = gen_key(), and then create the corresponding public key with Pa = op(Ka). Simples.

So what can we do with this? Well, not a lot, because we've got to make that op function a bit better - sometimes, we want to use something other than G.

def op(a, b=None):
   global P
   global G
   e = b if b is not None else G
   return (e ** a) % P

Now, watch this magic. Let's be Alice for a moment, and say we want to agree a key with Bob. Alice needs a private and public key.

Ka = gen_key()
Pa = op(Ka)

Bob has the same kind of pair:

Kb = gen_key()
Pb = op(Kb)

Alice sends Bob her public key, and vice versa. Now, the magic. Alice does this:

result = op(Ka, Pb)

And Bob does the reverse:

result = op(Kb, Pa)

Now result is the same on both sides.

The maths is really simple here, it's not really magic at all. No animals were sacrificed, no chalk markings are needed. You can light candles if you want, but they needn't be the black drippy kind.

Alice calculated this:

((G ** Kb) % P) ** Ka) % P

Which is the same as:

((G ** Kb) ** Ka) % P

Which is just:

(G ** (Ka * Kb)) % P

Because that's how indices work. Ka * Kb is, of course, the same as Kb * Ka, so Bob's calculation is the same. It's just that the inner term is the public key of the peer in both cases.

Neato!

Step Two: We're all going on a Summer Holiday

OK, so Alice and Bob can figure out a key together. (I mean, if they're really talking to each other and not to a MITM, but that's a whole different problem).

But supposing Alice is going away on holiday, and wants to redirect her messages to Carol? She could, of course, give Carol her private key. But that's rubbish, so instead she only gives Carol part of her key, and gives the other part to her server.

OK, this is going to get really weird.

So we have Alice's Public key, which is what Bob sees. Then Alice is going to have a Private key. Then Carol is also going to have a private key for Alice, and the server is going to have a Recryption key for Carol-as-Alice.

Ka = gen_key()  # Private, only Alice has this.
Pa = op(Ka)  # Public, everyone has this.

Kca = gen_key() % Ka  # Clamp it to Ka. Carol gets this from Alice.
Rca = Ka - Kca  # Yes, we're really just splitting it. The server gets this from Alice

Kb = gen_key()  # Private, only Bob has this.
Pb = op(Ka)  # Public.

OK, all setup. So now Bob wants to send Alice a message, and so agrees a key as before:

result = op(Kb, Pa)

Alice can easily enough get the same result. But her server does this before sending onto Carol:

half_result = op(Rca, Pb)

Carol then does the rest:

result = (op(Kca, Pb) * half_result) % P

And, amazingly, result is the same in both cases. Yet neither Carol, not the server, have Alice's private key. Eagle eyes will spot that it's trivial for them to collude to obtain it, of course - this is not a great Proxy Recryption scheme.

Unless it is.

Step Three: Desktop, Laptop, Tablet, Phone, Watch.

These days, we have an every increasing number of devices. I really do have all the above, all doing client-ish stuff.

The traditional method of key exchange involves either Alice and Bob exchanging public keys for all those devices (ick!) or else sharing the same keypair across all those devices (ick!), or else letting the server have the private key (ICK!).

Proxy Recryption is rather useful even in the simple form from above.

So Alice has a single keypair for all her devices, but only keeps the Private key on her most trusted device, maybe even on a smartcard or something. The Public key she uploads to the server.

For each device, Alice generates a Recryption key and corresponding Private key, uploads the Recryption key to the server, and transfers the Private key to the device.

Now a message Bob sends to the server can be fanned out to each device, with no shared keys, and the server doesn't have the means to decrypt. Any device could collude with the server to get Alice's private key, of course. But that's acceptable because she owns the devices and already has the private key.

Step Four: Profit

None of this is covered by patents, as far as I can tell. Not that you should take my word for this. If you want to do DH in production, though, remember:

You want to use HUGE PRIMES. OpenSSL has some tools to find suitable parameters for you.
You probably want to look at ECDH, the Elliptic Curve variant of DH. But I have no idea how to do Proxy Recryption there - it is likely to be just as simple though.