DEV Community


Posted on

Primality Testing

Number Theory Basics

Most algorithms in cryptography depend on large prime numbers, and the fact that certain algorithms are hard. Encryption is basically the act of applying a mathematical function for which it is very hard to determine the inverse function without a key, but easy with the key.

There are many number theoretic algorithms which are fast and easy. The mother of them all number theoretic algorithms is Euclid's algorithm for finding the greatest common divisor:

Given that we can do division with remainder, if a>ba > b and we want to compute GCD(a,b) we write

a=bq+r,a = bq + r, and notice GCD(a,b)=GCD(b,r).GCD(a,b) = GCD(b,r). Now we can iterate this. This is very fast; in fact its time complexity is O(log(a+b))O(log(a + b)) !

If aa and bb are really large and hard to factor, we can compute their GCD extremely quickly without knowing how to factor them! In fact there is no known way to efficiently determine the prime factorization of a number - if I multiply two random 300-digit primes pp and qq it's impractical to find pp and qq if you only know the value of pq,pq, although it is pretty easy to detect that pqpq is not prime. You could write code to do it, but your computer would not last long enough to execute it.

Powers modulo pp

If dd is an integer, we write ab (mod d)a \equiv b ~ (mod ~ d) if aa and bb have the same remainder when we divide both by p.p.

If pp is a prime number, we can define multiplication on

by taking the result mod p, that is, taking the remainder when we divide by p.p. We can also define exponents in a similar way: ana^n is just the remainder when ana^n is divided by p.p.

How do we actually compute ana^n efficiently when a,p,na,p,n are all large?

The answer is repeated squaring. We can write nn as a sum of powers of 22 , and calculate those powers of , i.e. if n=453,n = 453,

453=256+128+64+4+1=28+27+26+22+20. 453 = 256 + 128 + 64 + 4 + 1 = 2^8 + 2^7 + 2^6 + 2^2 + 2^0.

So if we want to compute 17453,17^{453}, we actually only have to calculate the powers of 1717 whose exponents belong to

All told, this is only 1212 multiplication operations to compute 1745317^{453} ( 88 to compute the powers with exponents in the list, and 44 more to get the correct one).

Logarithms modulo pp

Another well known fact in number theory: for any prime number p,p, you can find a remainder aa such that all of the different powers a,a2,a3,a4,...,ap1a, a^2, a^3, a^4, ... , a^{p-1} are different mod pp ; in this case we say that aa is a primitive root mod p.p.

We can get at most p1p-1 different remainders, since no power of aa will give a remainder 00 . Most values of aa will not be primitive roots, but in practice it usually not that hard to find one such value fairly quickly.) Values of aa with this property are called primitive roots. For example, the primitive roots mod 55 are 22 and 33 .

powers of 1: 1,1,1,1 (not all distinct mod 5)
powers of 2: 2,4,8,16 (all distinct mod 5)
powers of 3: 3,9,27,81 (all distinct mod 5)
powers of 4: 4,16,64,256 (not all distinct mod 5)

For a = 1,2,3,4 we have

a41 (mod 5)a^4 \equiv 1 ~ (mod ~ 5)

In fact, if pp is a large odd number and 1<a<p11 < a < p - 1 and
ap11 (mod p)a^{p-1} \equiv 1 ~ (mod ~ p) then pp is almost certainly prime, although there are a small number of "pseudoprimes" that pass this test which are not prime.

However, we can detect these pseudoprimes quickly with high probability! And we will want to do this, because even a failure rate of 1 in a million is enough to cause a lot of problems for a company like Google or American Express.

The Miller-Rabin Primality Test

However, if pp is a large pseudoprime that managed to sneak past the Fermat test, there are two cases that we might have to worry about here. The first is that the pseudoprime p could be a power of a prime, say p = p=vkp = v^k where vv is prime.

Then pp is already extremely unlikely to escape the Fermat test - a random guess will only pass with probability 1 in vk1.v^{k-1}. A relatively small number of Fermat tests with randomly chosen values of aa will detect that pp is not prime with an acceptably low failure rate.

For any true prime pp and primitive root aa we know ak(p1)/2p1 (mod p)a^{k(p-1)/2} \equiv p - 1 ~ (mod ~ p) if kk is odd, and 1 if kk is even, and 1 and p1p-1 are the only two square roots it has.

The other case is when p=vwp = vw where v,wv,w are odd and greater than 1 but and have no prime factors. We can take advantage of their existence without being able to find vv and w.w.

Anything that is 1 (mod vv ) and w1w-1 (mod ww ) will be another square root in of 1 in mod pp arithmetic, which will be different from 1 and p1.p-1.

Now, the key idea here is that if I pick a random integer from 0 to p1,p-1, the remainders mod v and mod w are independent and uniformly distributed. What we do here is decorate the Fermat test a little bit.

First, find the unique odd number d for which we can write

p1=d2e.p - 1 = d \cdot 2^e.

Now, we will do Fermat tests, with a random number aa as before, but keep track of sightly more information. In what follows, all calculations are to be reduced modulo p.p.

1.) Let m=ad.m = a^d.

2.) Calculate m,m2,...,m2e.m, m^2, ... , m^{2^e}.

If the last result isn't 1, you have failed the Fermat test and you know the result is composite. However, even if you did not, you can look at the last power of m in the list for which the value was different from 1. If it's not p1,p-1, then you have failed the Rabin-Miller primality test.

There's an entirely elementary proof that the Rabin-Miller test fails with probability at least 1/2 when pp is not prime.

Sharing Secrets using Primes

We can think about logarithms as the inverse of exponentiation, similarly to how we do for positive real numbers. log( bb ) ( with base aa ) is the power nn for which ana^n = b.b.

The key fact here is that when pp is a large prime number, the "logarithm" operation is computationally intractable.

Given aa and n,n, it's easy to find an.a^n. However, if we know the values of aa and b,b, in general it is quite hard to find nn such that

an  b (mod p). a^n ~ \equiv ~ b ~ (mod ~ p).

So here's a quick way for two parties to exploit this fact and pass information while maintaining their own privacy:

Alice and Bob have a shared prime number p,qp,q and primitive root a.a. (In fact aa doesn't need to be a primitive root, but needs to just have many distinct powers.)

Each has a private key, say xx for Alice and yy for Bob (they named them after their sex chromosomes).

Alice tells Bob axa^x and Bob tells Alice ay.a^y.

The values of xx and yy are secure (difficult for either to steal from the other or for an eavesdropper to intercept), but they can both access the value of axya^{xy} to decrypt any information that needs to be available to both of them.

In fact the standard cryptographic system is a small variation on this: Instead of a large prime, we choose a product qq of two large primes in place of pp and a number like aa in place of pp , and those two pieces of infomration combined are Alice's public key. This has the additional layer of security that the exponent we must raise a number to to get 1 (mod qq ) is hard to compute.

Top comments (0)