RivestShamirAdelman (RSA) Encryption
The RSA algorithm is another cryptosystem based on an underlying hard problem. This algorithm was introduced in 1978 by Rivest, Shamir, and Adelman [RIV78]. As with the MerkleHellman algorithm, RSA has been the subject of extensive cryptanalysis. No serious flaws have yet been foundnot a guarantee of its security but suggesting a high degree of confidence in its use.
In this section, we present the RSA algorithm in two parts. First, we outline RSA, to give you an idea of how it works relative to the other algorithms we have studied. Then, we delve more deeply into a detailed analysis of the steps involved.
Introduction to the RSA Algorithm
On the surface, the RSA algorithm is similar to the MerkleHellman method, in that solving the encryption amounts to finding terms that add to a particular sum or multiply to a particular product. The RSA encryption algorithm incorporates results from number theory, combined with the difficulty of determining the prime factors of a target. The RSA algorithm also operates with arithmetic mod n.
Two keys, d and e, are used for decryption and encryption. They are actually interchangeable. (The keys for MerkleHellman were not interchangeable.) The plaintext block P is encrypted as Pe mod n. Because the exponentiation is performed mod n, factoring Pe to uncover the encrypted plaintext is difficult. However, the decrypting key d is carefully chosen so that (Pe)d mod n = P. Thus, the legitimate receiver who knows d simply computes (Pe)d mod n = P and recovers P without having to factor Pe.
The encryption algorithm is based on the underlying problem of factoring large numbers. The factorization problem is not known or even believed to be NP-complete; the fastest known algorithm is exponential in time.
Detailed Description of the Encryption Algorithm
The RSA algorithm uses two keys, d and e, which work in pairs, for decryption and encryption, respectively. A plaintext message P is encrypted to ciphertext C by
C = Pe mod n
The plaintext is recovered by
P = Cd mod n
Because of symmetry in modular arithmetic, encryption and decryption are mutual inverses and commutative. Therefore,
P = Cd mod n = (Pe)d mod n = (Pd)e mod n
This relationship means that one can apply the encrypting transformation and then the decrypting one, or the decrypting one followed by the encrypting one.
The encryption key consists of the pair of integers (e, n), and the decryption key is (d, n). The starting point in finding keys for this algorithm is selection of a value for n. The value of n should be quite large, a product of two primes p and q. Both p and q should be large themselves. Typically, p and q are nearly 100 digits each, so n is approximately 200 decimal digits (about 512 bits) long; depending on the application, 768, 1024, or more bits may be more appropriate. A large value of n effectively inhibits factoring n to infer p and q.
Next, a relatively large integer e is chosen so that e is relatively prime to (p - 1) * (q - 1). (Recall that "relatively prime" means that e has no factors in common with (p - 1) * (q - 1).) An easy way to guarantee that e is relatively prime to (p - 1) * (q - 1) is to choose e as a prime that is larger than both (p - 1) and (q - 1).
Finally, select d such that
e * d = 1 mod (P - 1) * (q - 1)
Mathematical Foundations of the RSA Algorithm
The Euler totient function φ(n) is the number of positive integers less than n that are relatively prime to n. If p is prime, then
φ(p) = p - 1
Furthermore, if n = p * q, where p and q are both prime, then
φ(n) = φ(p) * φ(q) = (p - 1) * (q - 1)
Euler and Fermat proved that
Let p = 11 and q = 13, so that n = p * q = 143 and φ(n) = (p - 1) * (q - 1) = 10 * 12 = 120. Next, an integer e is needed, and e must be relatively prime to (p - 1) * (q - 1). Choose e = 11.
The inverse of 11 mod 120 is also 11, since 11 * 11 = 121 = 1 mod 120. Thus, both encryption and decryption keys are the same: e = d = 11. (For the example, e = d is not a problem, but in a real application you would want to choose values where e is not equal to d.)
Let P be a "message" to be encrypted. For this example we use P = 7. The message is encrypted as follows: 711 mod 143 = 106, so that E
(7) = 106. (Note: This result can be computed fairly easily with the use of a common pocket calculator. 711 = 79 * 72. Then 7 9 = 40 353 607, but we do not have to work with figures that large. Because of the reducibility rule, a * b mod n = (a mod n) * (b mod n) mod n. Since
we will reduce our final result mod 143, we can reduce any term, such as 79, which is 8 mod 143. Then, 8 * 72 mod 143 = 392 mod 143 = 106.)
This answer is correct since D(106) = 10611 mod 143 = 7.
Use of the Algorithm
The user of the RSA algorithm chooses primes p and q, from which the value n = p * q is obtained. Next e is chosen to be relatively prime to (p - 1) * (q - 1); e is usually a prime larger than (p - 1) or (q - 1). Finally, d is computed as the inverse of e mod (φ(n)).
The user distributes e and n and keeps d secret; p, q, and φ(n) may be discarded (but not revealed) at this point. Notice that even though n is known to be the product of two primes, if they are relatively large (such as 100 digits long), it will not be feasible to determine the primes p and q or the private key d from e. Therefore, this scheme provides adequate security for d.
It is not even practical to verify that p and q themselves are primes, since that would require considering on the order of 1050 possible factors. A heuristic algorithm from Solovay and Strassen [SOL77] can determine the probability of primality to any desired degree of confidence.
Every prime number passes two tests. If p is prime and r is any number less than p, then
gcd(P,r) = 1
(where gcd is the greatest common divisor function) and
If a number is suspected to be prime but fails either of these tests, it is definitely not a prime. If a number is suspected to be a prime and passes both of these tests, the likelihood that it is prime is at least 1/2.
The problem relative to the RSA algorithm is to find two large primes p and q. With the Solovay and Strassen approach, you first guess a large candidate prime p. You then generate a random number r and compute gcd(p,r) and J(r,p). If either of these tests fails, p was not a prime, and you stop the procedure. If both pass, the likelihood that p was not prime is at most 1/2. The process repeats with a new value for r chosen at random. If this second r passes, the likelihood that a nonprime p could pass both tests is at most 1/4. In general, after the process
is repeated k times without either test failing, the likelihood that p is not a prime is at most 1/2k.
Zimmerman [ZIM86] gives a method for computing RSA encryptions efficiently.
Cryptanalysis of the RSA Method
Like the MerkleHellman knapsack algorithm, the RSA method has been scrutinized intensely by professionals in computer security and cryptanalysis. Several minor problems have been identified with it, but there have been no flaws as serious as those for the MerkleHellman method. Boneh [BON99] catalogs known attacks on RSA. He notes no successful attacks on RSA itself, but several serious but improbable attacks on implementation and use of RSA.
As it has done for symmetric encryption, RSA has set challenges for RSA encryption. (See http://www.rsasecurity.com/rsalabs/node.asp? id=2093 for more details.) These challenges involve finding the two prime factors of a composite number of a particular size. These numbers are identified by their size in decimal digits or bits. (For very rough approximations, three bits are close to one decimal digit.) The first challenges (using decimal digit lengths) were RSA-140, -155, -160, and -200. Then using bit lengths, the challenges were RSA-576, - 640, -704, -768, and so on. RSA-200, the last number in the decimal series, is 663 bits long, so its difficulty fits between RSA-640 and RSA-704.
Numbers through RSA-200 have been factored. The first, RSA-140, was factored in 1999 in 1 month using approximately 200 machines. RSA-155, a 512-bit number, was factored in 1999 in approximately 3.7 months using 300 machines, but RSA-160 was factored in 2003 in only 20 days. (It is believed that improvements in hardware were significant in that time reduction.) The most recent, RSA-200, was factored in 2005 [RSA05] after about 18 calendar months of elapsed time, using an unspecified number of machines.
As with the symmetric key challenges, these are not just academic exercises. They give a good indication of the state of the art in factoring and hence the sizes of numbers known to be factorable (and hence weak choices for key lengths). For most encryption an attacker would not be able to covertly assemble the number of high performance machines needed to factor a number quickly. Thus, 512 bits is probably too small for all but the least important uses, but slightly above that, 600 to 700 bits requires months for a dedicated network to crack. A key of about 768 bits is probably satisfactory for routine uses now, and a 1024-bit key will likely withstand very diligent attacks for some time.