Pankaj Tanwar

Posted on Apr 30, 2021 • Edited on Sep 3, 2021 • Originally published at pankajtanwar.in

How to implement WhatsApp-like end-to-end encryption?

#systems #webdev #programming #javascript

This article was originally published by me at https://www.pankajtanwar.in/blog/how-to-implement-whatsapp-like-end-to-end-encryption

One of WhatsApp's key features is end-to-end encryption of messages. In this article, we will walk through a simple implementation of WhatsApp-like end-to-end message encryption.

What is end-to-end encryption? 🔒

Taking WhatsApp as an example, end-to-end encryption means that a message sent from user A to user B over WhatsApp can only be read by user B. No one in between can read it, not even the WhatsApp servers.

The e2e encryption implementation we are going to discuss here is not exactly how WhatsApp or Signal do it. It is just meant to give you a brief idea of how we could do it!

First, we will go through typical chat application architectures to understand their drawbacks.

Implementation 1

Raw text is sent to the server in an HTTP request, saved in the DB, and then sent to user B over HTTP.

Pros -

Super simple to implement

Cons -

No end-to-end encryption.
Content is sent in plain text, so it is highly vulnerable to a man-in-the-middle attack. An attacker can sniff the network and see the content or, worse, alter it.
Content is saved in plain text on the servers. If the servers are hacked, your data can be compromised.

Implementation 2

Raw text is transferred over HTTPS to the server.

Pros -

An attacker on the network can no longer read or alter the content, because with HTTPS the data is encrypted by TLS while in transit.

Cons -

TLS termination typically happens at the load balancer, so the data is decrypted before it reaches the backend servers. The servers can read your messages, and the data is still saved in plain text.

Implementation 3

To implement end-to-end encryption, we use a method called public-key cryptography.

In public-key cryptography, every user has two keys: a public key and a private key. As the names suggest, a user's public key is visible to all other users, while the private key is private to that user. It is saved locally on the device and can be accessed only by its owner; not even the backend servers can access it.

Public-key cryptography rests on two properties -

Data encrypted using a user's public key can only be decrypted by the same user's private key.
Data encrypted (that is, signed) using a user's private key can be verified by the same user's public key.

For example, if a message is encrypted using the public key of user A, it can ONLY be decrypted by the private key of user A. And if a message is signed with the private key of user A, it can be verified with the public key of user A.

The important thing to note here is that a message encrypted using the public key of user A cannot be decrypted by that same public key. It can only be decrypted by the private key of the same user.

We will use this technique for our end-to-end encryption implementation.
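To make the first property concrete, here is a minimal sketch using Node.js's built-in crypto module. The choice of RSA with OAEP padding and the helper names (generateUserKeys, encryptFor, decryptWith) are assumptions made for illustration; they are not taken from WhatsApp or from any particular chat app.

```javascript
const crypto = require("node:crypto");

// Each user generates a key pair on their own device (assumption: 2048-bit RSA).
function generateUserKeys() {
  return crypto.generateKeyPairSync("rsa", { modulusLength: 2048 });
}

// OAEP padding options, shared by the encrypt and decrypt helpers.
function oaep(key) {
  return {
    key,
    padding: crypto.constants.RSA_PKCS1_OAEP_PADDING,
    oaepHash: "sha256",
  };
}

// Anyone who knows the recipient's public key can encrypt for them.
function encryptFor(recipientPublicKey, plaintext) {
  return crypto.publicEncrypt(oaep(recipientPublicKey), Buffer.from(plaintext, "utf8"));
}

// Only the holder of the matching private key can decrypt.
function decryptWith(ownPrivateKey, ciphertext) {
  return crypto.privateDecrypt(oaep(ownPrivateKey), ciphertext).toString("utf8");
}

const userA = generateUserKeys();
const userB = generateUserKeys();

// Encrypt with A's public key, decrypt with A's private key.
const ciphertext = encryptFor(userA.publicKey, "Hello A");
const decrypted = decryptWith(userA.privateKey, ciphertext);
console.log(decrypted);

// Any other private key (here, B's) fails to decrypt the same ciphertext.
let wrongKeyFailed = false;
try {
  decryptWith(userB.privateKey, ciphertext);
} catch {
  wrongKeyFailed = true;
}
console.log("decryption with the wrong key failed:", wrongKeyFailed);
```

The public key alone is not enough to reverse the operation, which is why it is safe to share it with everyone.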

Let's say A wants to send a message to B.

A has access to the public key of user B.
So, A encrypts the message with the public key of user B.
Now, we know that it can only be decrypted with the private key of user B, and only B has access to that private key.

This is how we can ensure that not even the backend server can decrypt and read the message.
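The flow above can be sketched end to end as follows. The in-memory serverInbox array and the serverRelay, sendMessage, and readInbox functions are hypothetical stand-ins for a real backend and client, and RSA-OAEP is again an assumed choice.

```javascript
const crypto = require("node:crypto");

// OAEP padding options, shared by encryption and decryption.
const oaep = (key) => ({
  key,
  padding: crypto.constants.RSA_PKCS1_OAEP_PADDING,
  oaepHash: "sha256",
});

// B's key pair lives on B's device; only the public half is published.
const userBKeys = crypto.generateKeyPairSync("rsa", { modulusLength: 2048 });

// Hypothetical server: it only stores and forwards opaque payloads.
const serverInbox = [];
function serverRelay(from, to, payloadBase64) {
  serverInbox.push({ from, to, payloadBase64 });
}

// Runs on A's device: encrypt with B's public key, then hand over to the server.
function sendMessage(from, to, recipientPublicKey, text) {
  const ciphertext = crypto.publicEncrypt(oaep(recipientPublicKey), Buffer.from(text, "utf8"));
  serverRelay(from, to, ciphertext.toString("base64"));
}

// Runs on B's device: fetch stored payloads and decrypt with B's private key.
function readInbox(user, ownPrivateKey) {
  return serverInbox
    .filter((m) => m.to === user)
    .map((m) =>
      crypto
        .privateDecrypt(oaep(ownPrivateKey), Buffer.from(m.payloadBase64, "base64"))
        .toString("utf8")
    );
}

sendMessage("A", "B", userBKeys.publicKey, "Lunch at noon?");

// What the server has stored is ciphertext, not the original text.
console.log("stored on server:", serverInbox[0].payloadBase64.slice(0, 32) + "...");

const received = readInbox("B", userBKeys.privateKey);
console.log("read by B:", received);
```

One caveat with this sketch: RSA can only encrypt short payloads directly (about 190 bytes with these settings), so real systems typically encrypt the message with a symmetric key and use the public key only to protect that key.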

One more important thing: when user B receives a message from user A, there must be a way for user B to verify that the message was actually sent by user A. Everyone in the network has access to the public key of user B, so anyone can send B a message.

To ensure this, we use a "digital signature".

Here we use the second property of public-key cryptography.

When user A wants to send a message to user B, A attaches a small digital signature created with A's private key.
When user B receives the message, B can verify the digital signature with the public key of user A to make sure the message was actually sent by user A.
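Signing and verifying can be sketched with Node.js's built-in crypto.sign and crypto.verify. The helper names signMessage and verifyMessage, and the use of RSA keys with SHA-256, are assumptions for illustration only.

```javascript
const crypto = require("node:crypto");

const userAKeys = crypto.generateKeyPairSync("rsa", { modulusLength: 2048 });
const userBKeys = crypto.generateKeyPairSync("rsa", { modulusLength: 2048 });

// Runs on the sender's device: sign the message with the sender's private key.
function signMessage(senderPrivateKey, text) {
  return crypto.sign("sha256", Buffer.from(text, "utf8"), senderPrivateKey);
}

// Runs on the receiver's device: check the signature with the sender's public key.
function verifyMessage(senderPublicKey, text, signature) {
  return crypto.verify("sha256", Buffer.from(text, "utf8"), senderPublicKey, signature);
}

const message = "Lunch at noon?";
const signature = signMessage(userAKeys.privateKey, message);

// The untouched message verifies against A's public key.
const genuine = verifyMessage(userAKeys.publicKey, message, signature);

// A modified message no longer matches the signature.
const tampered = verifyMessage(userAKeys.publicKey, "Lunch at one?", signature);

// A signature made with someone else's private key does not verify as A's.
const forgedSignature = signMessage(userBKeys.privateKey, message);
const forged = verifyMessage(userAKeys.publicKey, message, forgedSignature);

console.log({ genuine, tampered, forged });
```

In a full message flow, A would send the signature along with the encrypted message, and B would run the verification after decrypting.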

I hope this gave you a fair idea of how we can implement end-to-end encryption.

✍️ Take Home Assignment : How can we implement End to End Group Messaging encryption using public key cryptography? Let me know in the comments!

References : Public Key Cryptography, Public and Private Keys

Thanks for reading.

If you liked the article, give me a cheeky follow on Twitter.

Top comments (10)

Andrei Gatej • Apr 30 '21

Thanks for sharing, I found it very helpful!

I have a few questions.

It is saved locally on the device and can be accessed by self, not even by backend servers.

What can be done in case of browsers?

The second question, what would happen in case of a group chat? I’m thinking that a pair of keys would be created for the entire group, but then I’m not sure what would happen.

Thank you!

Pankaj Tanwar • May 1 '21

Hi Andrei, Glad that you found it helpful. Thank you.

Q 1 - What can be done in case of browsers?
In a browser, it can be saved in local storage. Cookies are a poor fit, since they are sent to the server with every request.

Q 2 - What would happen in case of a group chat?
Creating a pair of keys can be an option here. We can share these keys with every user of the same group. But then a new, interesting question arises: every time a user joins a group, how do we send these shared keys to the new user? If we send them via the server, the server has the keys in its logs and your messages can be read!

What are your thoughts?

Andrei Gatej • May 1 '21

Thanks for the reply.

Referring to the first question, so that's why when you want to use WhatsApp Web you have to scan that code using your phone? So that it can transfer the private key from the device into LocalStorage, right? I've always wondered why I had to always use my phone in order to use the web app :).

Regarding the second question, which is indeed interesting, I was thinking of this approach: when a user joins a group, the user will have a pair of keys. Then, knowing that the public key of the group can be accessed by anyone (including the server), we can have an existing member of the group, who has the private key of the group on their own device, encrypt that key with the public key of the newly registered user. That encrypted message will then be received by the new user, who will be able to decrypt it, since it was encrypted with their public key. So, now the new user will have the private key of the group as well.
What do you think of this approach?

Pankaj Tanwar • May 1 '21 • Edited

Thank you, Andrei. I am not sure if WhatsApp uses this technique for transferring keys to the browser. I have read somewhere that, for every new message, WhatsApp generates a new pair of keys and a lot of other fancy things happen in between (like Diffie-Hellman key exchange).

On another note, Telegram does not require a phone for its web app. How does it generate keys then? There must be something interesting behind it!

Your approach for the second question is brilliant. Highly efficient with minimal data transfer. But one more question here: let's say there are 5 people in a group, I click on "Join Group", and all 5 people are offline. Who will transfer the keys? What do you think? Am I missing something here?

Andrei Gatej • May 1 '21

Never heard of Diffie–Hellman before, thanks for mentioning it. This discussion made me want to explore cryptography more in depth in the future.

Yeah, there must be a lot of interesting details behind the scenes. One of my career goals is to work on projects of such scale, imagine how many cool things one could learn!

Regarding the last paragraph, that is a very good question. I'm not sure about this approach, but since there is no other active connection from an existing member (which basically means that the new user is alone there), I suppose we could still encrypt messages with the public key of the group and temporarily store these encrypted messages on the server. Then, when any of the existing members (apart from this new one) comes back online, we can now:

send the temporarily stored messages to the already existing members which are online, and they can decrypt them since they have the private key of the group
apply the same logic as if there was at least one active connection when the new user joined, so now they will have the initial private key of the group

I guess this explains why as a new member of a group, you can't see any of the group history: messages, photos etc, because if you're a new user and there is no other existing member online, you can't get the private key of the group immediately, so you can't see the history of that group. What would you say?

Pankaj Tanwar • May 3 '21

That's a really smart approach, Andrei. We can store the messages temporarily on the server, encrypted with the public key, and everything works smoothly. That's also why we are not able to see the previous messages/history of the group.

But let's say WhatsApp wants to add this feature of showing the history of the group when a new user joins. How would you go about this? I could not think of a workaround for it. Would you like to add your thoughts here?

Andrei Gatej • May 3 '21

I don't think there is a way to solve this with the current approach. That's because if a new member joins and none of the other members is active, then it's impossible to get that private key of the group, so you can't decrypt the messages.

Moreover, I read that Discord does not use E2EE, so this might be a reason why you can see previous messages when you join a group there.

Pankaj Tanwar • May 18 '21

Yes, Discord & Telegram have developed their own smart algorithms to deal with such use cases. Do you have any documentation or article related to Discord's implementation for this?

Andrei Gatej • May 20 '21

Sorry for the late reply. No, I just did a quick search to see whether Discord is using such a feature or not. But I'd be glad to read more about it too.

Pankaj Tanwar • May 22 '21

I am also searching a bit. I will let you know if I find something. Thank you for such a useful conversation. Hope to learn more from you.