Sun-Li Beatteay

Posted on Jan 6, 2021 • Edited on Jan 9, 2021

How Prime Numbers Keep the Internet Secure

#programming #javascript #computerscience #algorithms

Note: I spoke about this topic at RubyConf 2020. If you would like to checkout the video, you can see it here:

Whether you know it or not, you use prime numbers every day. Do you see that lock symbol in the address bar of your web browser? The one that looks like this:

That lock means you’re using prime numbers at this very moment. That’s because the internet uses prime numbers. In fact, prime numbers are so ingrained into the fabric of our daily lives that the world would be a drastically different place without them. We’d still be doing all our banking in person and buying everything in cash. And forget about texting, because we’d still all be pen pals.

So what is it about prime numbers that make them so special?

Firstly, they’re unique. There aren’t any combination of numbers that can be multiplied together to create a prime number.

Secondly, every number can be broken into it’s prime components. For example, 10 can be broken down into:

10 = 2 * 5

Lastly, while the average human might not be able to look at this number and immediately detect if it’s prime …

9307398526401816703683197763617206082269079617576835286211259044095385462270542532346398139788788003092515521098292832872130802035097419307557532476688659

… it’s relatively simple for computers. You might’ve written your own primality checker in the past, similar to this …

// shout out to Reddit for the correction
function isPrime(a) {
    const n = BigInt(a);
    const zero = BigInt(0);

    if (n <= BigInt(3)) {
        return n > BigInt(1);
    } else if (n % BigInt(2) === zero || n % BigInt(3) === zero) {
        return false;
    }

    let i = BigInt(5);
    while ((i*i) <= n) {
        if (n % i === zero || n % (i + BigInt(2)) === zero) {
            return false;
        }
        i += BigInt(6);
    }

    return true
}

… and know that many conventional methods for checking prime numbers is slow. But there are more advanced methods, such as the Miller-Rabin primality test, that make it very fast.

It’s for all of these reasons that prime numbers are the perfect tools for encryption!

Encryption

For those who don’t know, encryption is the act of turning information into an unreadable format called a cipher. Decryption is the opposite process of turning a cipher back into the original information.

In other words, encryption allows us to keep information private and out of the hands of people who might use it for malicious purposes. That’s why it has become a cornerstone of the modern internet.

Without encryption, I wouldn’t be able to do most of the things I do online, such as buy groceries, pay off debts, or message my friends — at least not securely. Encryption prevents hackers from stealing my banking information and spying on my private conversations.

It’s not just the internet that uses encryption but many modern devices, such as computers, smartphones, or even smart fridges. They all use encryption. Suffice it to say, encryption is important and everywhere.

But how does encryption work?

Encryption algorithms use keys to encrypt and decrypt data. How those keys are used depends on the type of encryption, of which there are two: symmetric and asymmetric. Both of which have different use cases.

Symmetric encryption

Symmetric encryption gets its name because it uses the same key for both encryption and decryption. Since it uses a single key for both encryption and decryption, symmetric encryption is very fast — but fragile. The key must always be kept private and only shared between trusted parties.

Because of this, one of the main uses for symmetric encryption is securing data at rest. This means encrypting devices like computers, databases, or IoT devices. If you remember the drama that occurred between Apple and the FBI — that was a battle over iPhone encryption.

While symmetric encryption works well, it has an inherent flaw. In order for multiple parties to have encoded communication via symmetric encryption, they must all agree on a key ahead of time. And in the context of the internet, where you’re communicating with hundreds of servers a day half-way across the world, that’s not possible.

That’s where asymmetric encryption comes in.

Asymmetric encryption

Asymmetric encryption uses two keys, one for encryption and one for decryption. This works because the keys are complements of one another. When they’re used together, they cancel each other out — similar to how complement colors cancel one another out into white.

Correction on the above image: Cipher should be orange

The key used for encryption is known as the public key. As you might guess, it’s safe to share this key with anyone.

The decryption key, on the other hand, is called the private key because it must be kept private. Only the holder of the private key can decrypt ciphers that were encrypted with the public key. Even if a malicious user were to intercept a ciphertext, they’d just see gibberish.

This makes asymmetric encryption an ideal tool for sharing sensitive data. Not only that, but since a private key should only be owned by a single entity, it works well for authentication as well. That’s exactly how it’s used in the TLS handshake.

The trapdoor

One of the reasons that asymmetric encryption is as important as it is is because it works as a trapdoor function.

This means it’s very simple to execute in one direction but very difficult to reverse — unless you have special information, otherwise known as the trapdoor or secret.

In the context of asymmetric encryption, it’s very simple to encrypt data but very difficult to decrypt it using only the public key. It becomes simple again with the private key.

But not all asymmetric-encryption algorithms are built the same. How laborious it is to reverse the trapdoor function determines an algorithm’s security. To see just how secure asymmetric encryption can be, let's explore one of the most popular algorithms in use today: RSA.

RSA encryption

RSA was invented in 1977 by three cryptographers: Ron Rivest, Adi Shamir, and Leonard Adleman — hence the name. Since its inception, it has spread to nearly every corner of the earth.

If you’ve ever used Secure Shell (SSH) …

… or GNU Privacy Guard (GPG) …

… you have RSA to thank for it. However, it’s most known for its use in TLS and HTTPS to prevent man-in-the-middle attacks.

While RSA is nearly half a century old, it’s one of the most commonly used asymmetric-encryption algorithms in the world. Its ubiquity is a testament to its security.

But why is it so secure? Short answer: prime numbers. Long answer? That’ll involve some math. But the best answer would be to try and break it ourselves.

Breaking RSA

Here’s the scenario: We’re hackers trying to impersonate Medium’s server. We want to intercept all traffic going to Medium’s website in order to steal user credentials and ransom their data.

Using Wireshark, we’re able to get a copy of Medium’s RSA public key and website certificate.

But in order to impersonate Medium and fool users into connecting to our phishing server, we need the private key. Luckily, all is not lost.

One thing I haven’t mentioned is that RSA keys are just numbers. An RSA private key is just a single number, which we’ll call d. The public key is made up of two numbers, e and N. And N is the product of two more numbers, p and q.

I know, that’s a lot of numbers to track of. But it’s just those last two numbers, p and q, that we need to focus on. Because according to RSA’s key-generation algorithm, if we know e, p, and q, we can recreate the private key.

“Well, perfect,” one might say. “Since we have the public key, we know e and N. And since we know N, we just need to split it apart to get p and q. How hard could that be?”

Not so fast, person I just made up to ask loaded questions — p and q are prime numbers. Gasp!

I mentioned before that detecting that generating prime numbers and checking if they’re prime are relatively simple for computers. However, what isn’t simple is prime factorization.

How hard, you might ask?

RSA typically uses numbers 1024, 2048, or 4096 bits long. As you can see in the graph above, it only takes seconds to minutes to create N, but it’d take millions to billions of years to factor it apart.

The reason for this is — for average, nonquantum computers — there isn’t a fast method for factoring a number into its prime components. One of the best methods we know is the Number Field Sieve, but even then, for a number like this, it will take a while:

12647218591793774062037539860814590913847656969568852342569985866826731647633698490555162899129013020883082990527279827064849704038819915244363097120031062841681483530795022535252488366169730386558454292994968234214045666016756933262308367238453012386845278265898125397947728757013541963782671274800429212175737617916738370351721854897974375037404102868790995317383226110430324268401945063200233204784127599950729869495397377610047121343931821194220803396259107891220452870079636709770538139479748696178546655932056530040495898965404702415803790560056325250086900175615221136804225865647753477561884491932551643726743

While it’s not impossible, the level of effort is astronomical and not worth it. We’d all be long dead by the time we could generate Medium’s private key.

So long story short, prime numbers are pretty darn hard to break. And that’s how they keep the internet secure.

Parting Thoughts

As a software developer, I’m often intimidated by all the different moving parts on the internet. It can feel like a magical and bewildering place. And as a result, I usually feel like I have no idea how any of it works or what I’m doing.

But any time I learn something new about the systems I use on a daily basis, the world becomes just a little less chaotic and magical. I hope this article has helped demystify some of the mysteries of the internet for you as well.

Top comments (3)

Dhairav Mehta • Jan 9 '21

Very well explained!

Dinesh Balaji • Jan 9 '21

Nice article. One suggestion would be while explaining RSA you could use examples instead of alphabets. It's very difficult for someone to grasp it first time.

Sun-Li Beatteay • Jan 9 '21

Yeah that's a good point. I chose to use the variables because if you look up RSA in any context, you will see those variables used. It's tough to explain math concepts and theory in a succinct way, so I wanted to keep it high level.

This article would've been 3 times longer if I dived into all the different math concept behind it haha. Although you can check out this video if you want to see a deeper dive into the math theory: How Prime Numbers Keep the Internet Secure