YALE UNIVERSITY
DEPARTMENT OF COMPUTER SCIENCE

	CPSC 467a: Cryptography and Computer Security	Notes 11 (rev. 1)
Professor M. J. Fischer		October 13, 2008

Lecture Notes 11

45 Generating RSA Encryption and Decryption Exponents

We showed in Section 44 (lecture notes 10) that RSA decryption works for m ∈ Z*_n if e and d are chosen so that

ed ≡ 1 (mod ϕ(n)),

(1)

that is, d is e^-1 (the inverse of e) in Z*_ϕ(n).

We now turn to the question of how Alice chooses e and d to satisfy (1). One way she can do this is to choose a random integer e ∈ Z*_ϕ(n) and then solve (1) for d. We will show how to solve for d in Sections 46 and 47 below.

However, there is another issue, namely, how does Alice find a random e ∈ Z*_ϕ(n)? If Z*_ϕ(n) is large enough, then she can just choose random elements from Z_ϕ(n) until she encounters one that also lies in Z*_ϕ(n). A candidate element e lies in Z*_ϕ(n) iff gcd(e,ϕ(n)) = 1, which can be tested efficiently using Algorithm 42.2 (Euclidean algorithm).¹
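The trial-and-error search just described can be sketched in a few lines of Python. This is a minimal illustration, not a production key generator: the function name random_unit and the toy value ϕ(n) = 120 (from the hypothetical primes p = 11, q = 13) are our own assumptions, and Python's built-in math.gcd stands in for Algorithm 42.2.

```python
import math
import secrets

def random_unit(modulus: int) -> int:
    """Return a uniformly random element of Z*_modulus by rejection sampling."""
    if modulus < 2:
        raise ValueError("modulus must be at least 2")
    while True:
        # Draw a candidate uniformly from Z_modulus = {0, ..., modulus - 1}.
        candidate = secrets.randbelow(modulus)
        # Keep it only if it is relatively prime to the modulus.
        if math.gcd(candidate, modulus) == 1:
            return candidate

# Toy parameters (assumed for illustration): p = 11, q = 13, so phi(n) = 10 * 12.
phi_n = 120
e = random_unit(phi_n)
print(e, math.gcd(e, phi_n))  # the second value printed is always 1
```

Note that this sampler may return e = 1, which does lie in Z*_ϕ(n) but leaves messages unchanged under encryption; a practical implementation would presumably reject it.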

But how large is large enough? If ϕ(ϕ(n)) (the size of Z*_ϕ(n)) is much smaller than ϕ(n) (the size of Z_ϕ(n)), Alice might have to search for a long time before finding a suitable candidate for e.

In general, |Z*_m| can be considerably smaller than m. For example, if m = |Z_m| = 210 = 2·3·5·7, then |Z*_m| = ϕ(210) = 1·2·4·6 = 48. In this case, the probability that a randomly chosen element of Z_m falls in Z*_m is only 48∕210 = 8∕35 = 0.228… .
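The count for m = 210 can be confirmed by brute force. The helper count_units below is a hypothetical name of ours; it simply counts the elements of Z_m that are relatively prime to m, which is feasible only for small m.

```python
import math

def count_units(m: int) -> int:
    """Return |Z*_m| by counting the k in Z_m with gcd(k, m) = 1."""
    return sum(1 for k in range(m) if math.gcd(k, m) == 1)

m = 210
units = count_units(m)
print(units)      # 48
print(units / m)  # roughly 0.2286, i.e. 8/35
```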

The following theorem provides a crude lower bound on how small Z*_m can be relative to the size of Z_m that is nevertheless sufficient for our purposes.

Theorem 1 For all m ≥ 2,

|Z*_m| ∕ |Z_m| ≥ 1 ∕ (1 + ⌊log₂ m⌋).

Proof: Write m in factored form as m = ∏_{i=1}^t p_i^{e_i}, where p_i is the i-th smallest prime that divides m and e_i ≥ 1. Then ϕ(m) = ∏_{i=1}^t (p_i - 1) p_i^{e_i - 1}, so

|Z*_m| ∕ |Z_m| = ϕ(m) ∕ m = ( ∏_{i=1}^t (p_i - 1) p_i^{e_i - 1} ) ∕ ( ∏_{i=1}^t p_i^{e_i} ) = ∏_{i=1}^t (p_i - 1)∕p_i.

(2)

To estimate the size of ∏_{i=1}^t (p_i - 1)∕p_i, note that (p_i - 1)∕p_i ≥ i∕(i + 1). This follows since (x - 1)∕x is monotonically increasing in x for x > 0, and p_i ≥ i + 1 (because p_1 ≥ 2 and the p_i are distinct integers in increasing order). Then

∏_{i=1}^t (p_i - 1)∕p_i ≥ ∏_{i=1}^t i∕(i + 1) = (1∕2)·(2∕3)·(3∕4) ⋯ (t∕(t + 1)) = 1∕(t + 1).

(3)

Clearly t ≤ ⌊log₂ m⌋, since 2^t ≤ ∏_{i=1}^t p_i ≤ m and t is an integer. Combining this fact with equations (2) and (3) gives the desired result. ∎

For n a 1024-bit integer, ϕ(n) < n < 2¹⁰²⁴. Hence, log₂(ϕ(n)) < 1024, so ⌊log₂(ϕ(n))⌋ ≤ 1023. By Theorem 1, the fraction of elements in Z_ϕ(n) that also lie in Z*_ϕ(n) is at least 1∕1024. Therefore, the expected number of random trials before Alice finds a number in Z*_ϕ(n) is provably at most 1024 and is most likely much smaller.
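As a sanity check, Theorem 1 can be tested exhaustively for small m, and the 1024-bit bound can be evaluated exactly. The sketch below is ours: the names unit_fraction and theorem1_bound and the range limit 2000 are arbitrary choices for illustration. It relies on the fact that, for a positive Python integer m, m.bit_length() equals 1 + ⌊log₂ m⌋, and it uses exact fractions to avoid rounding.

```python
import math
from fractions import Fraction

def unit_fraction(m: int) -> Fraction:
    """Return |Z*_m| / |Z_m| exactly, by brute-force counting."""
    units = sum(1 for k in range(m) if math.gcd(k, m) == 1)
    return Fraction(units, m)

def theorem1_bound(m: int) -> Fraction:
    """Return 1 / (1 + floor(log2(m))), the lower bound of Theorem 1."""
    # For m >= 1, m.bit_length() == 1 + floor(log2(m)).
    return Fraction(1, m.bit_length())

# Every m in this range should satisfy the bound, so the list stays empty.
violations = [m for m in range(2, 2001) if unit_fraction(m) < theorem1_bound(m)]
print(violations)  # []

# The weakest bound for a value below 2**1024, as in the text.
print(theorem1_bound(2**1024 - 1))  # 1/1024
```

The bound is met with equality for m = 2 and m = 6, and is far from tight for most m, which is consistent with the remark that the expected number of trials is most likely much smaller than 1024.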

46 Diophantine equations and modular inverses

Now that Alice knows how to choose e ∈ Z*_ϕ(n), how does she find d? That is, how does she solve (1)? Note that d, if it exists, is a multiplicative inverse of e (mod ϕ(n)), that is, a number that, when multiplied by e, gives 1 (mod ϕ(n)).

Equation (1) is an instance of the general Diophantine equation

ax + by = c

(4)

Here, a,b,c are given integers. A solution consists of integer values for the unknowns x and y. To put (1) into this form, we note that ed ≡ 1 (mod ϕ(n)) iff ed + uϕ(n) = 1 for some integer u. This is seen to be an equation in the form of (4) where the unknowns x and y are d and u, respectively, and the coefficients a,b,c are e,ϕ(n),1, respectively.
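A small numeric instance may make the correspondence concrete. The values below are assumed toy parameters (ϕ(n) = 120, e = 7), and witness_u is a hypothetical helper of ours that recovers the integer u from a given d whenever the congruence holds.

```python
def witness_u(e: int, d: int, phi: int):
    """Return the integer u with e*d + u*phi == 1, or None if e*d is not 1 (mod phi)."""
    remainder = 1 - e * d
    if remainder % phi != 0:
        return None          # e*d is not congruent to 1 modulo phi
    return remainder // phi  # exact division, so e*d + u*phi == 1

phi_n, e = 120, 7
print(witness_u(e, 103, phi_n))  # -6, since 7*103 + (-6)*120 = 721 - 720 = 1
print(witness_u(e, 5, phi_n))    # None, since 7*5 = 35 is not 1 (mod 120)
```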

47 Extended Euclidean algorithm

It turns out that (4) is closely related to the greatest common divisor, for it has a solution iff gcd(a,b)∣c. It can be solved by a process akin to the Euclidean algorithm, which we call the Extended Euclidean algorithm. Here’s how it works.

The algorithm generates a sequence of triples of numbers T_i = (r_i,u_i,v_i), each satisfying the invariant

r_i = au_i + bv_i ≥ 0.

(5)

The first triple T₁ is (a,1,0) if a ≥ 0 and (-a,-1,0) if a < 0. The second triple T₂ is (b,0,1) if b ≥ 0 and (-b,0,-1) if b < 0.

The algorithm generates T_{i+2} from T_i and T_{i+1} in much the same way as the Euclidean algorithm generates (a mod b) from a and b. More precisely, let q_{i+1} = ⌊r_i∕r_{i+1}⌋. Then T_{i+2} = T_i - q_{i+1}T_{i+1}, that is,

r_{i+2} = r_i - q_{i+1} r_{i+1}
u_{i+2} = u_i - q_{i+1} u_{i+1}
v_{i+2} = v_i - q_{i+1} v_{i+1}

Note that r_{i+2} = (r_i mod r_{i+1}),² so one sees that the sequence of generated pairs (r₁,r₂), (r₂,r₃), (r₃,r₄), …, is exactly the same as the sequence of pairs generated by the Euclidean algorithm. As in the Euclidean algorithm, we stop when r_t = 0. Then r_{t-1} = gcd(a,b), and from (5) it follows that

gcd(a,b) = au_{t-1} + bv_{t-1}.

(6)
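The triple-based procedure translates almost directly into code. The following Python sketch is one possible rendering, with the function name extended_euclid chosen by us; it keeps only the two most recent triples and returns T_{t-1} = (gcd(a,b), u_{t-1}, v_{t-1}).

```python
def extended_euclid(a: int, b: int):
    """Return (g, u, v) with g = gcd(a, b) = a*u + b*v and g >= 0."""
    # T_1 and T_2, with signs chosen so that r_1 >= 0 and r_2 >= 0.
    r0, u0, v0 = (a, 1, 0) if a >= 0 else (-a, -1, 0)
    r1, u1, v1 = (b, 0, 1) if b >= 0 else (-b, 0, -1)
    while r1 != 0:
        q = r0 // r1  # q_{i+1} = floor(r_i / r_{i+1})
        # T_{i+2} = T_i - q_{i+1} * T_{i+1}; then shift the window by one.
        r0, u0, v0, r1, u1, v1 = (
            r1, u1, v1,
            r0 - q * r1, u0 - q * u1, v0 - q * v1,
        )
        # Invariant (5) holds for the newest triple.
        assert r1 == a * u1 + b * v1 and r1 >= 0
    return r0, u0, v0  # the last triple with nonzero r (or T_1 if b = 0)

print(extended_euclid(240, 46))  # (2, -9, 47): 240*(-9) + 46*47 = 2
print(extended_euclid(7, 120))   # (1, -17, 1): 7*(-17) + 120*1 = 1
```

For a = 240 and b = 46, the generated remainders are 240, 46, 10, 6, 4, 2, 0, which is the same sequence the ordinary Euclidean algorithm produces.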

Returning to equation (4), if c = gcd(a,b), then x = u_{t-1} and y = v_{t-1} is a solution. If c is a multiple of gcd(a,b), then c = k gcd(a,b) for some integer k, and x = ku_{t-1} and y = kv_{t-1} is a solution. Otherwise, gcd(a,b) does not divide c, and one can show that (4) has no solution. See Handout 6 for further details, as well as for a discussion of how many solutions (4) has and how to find all solutions.
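Putting the pieces together, the case analysis above gives a solver for (4), and applying it with (a,b,c) = (e,ϕ(n),1) yields Alice's decryption exponent. The sketch below is ours, not part of the handout: the function names are hypothetical, the toy values e = 7 and ϕ(n) = 120 are assumed, and the final reduction of d modulo ϕ(n) is an added convenience (valid because adding any multiple of ϕ(n) to d preserves (1)).

```python
def extended_euclid(a: int, b: int):
    """Return (g, u, v) with g = gcd(a, b) = a*u + b*v and g >= 0."""
    r0, u0, v0 = (a, 1, 0) if a >= 0 else (-a, -1, 0)
    r1, u1, v1 = (b, 0, 1) if b >= 0 else (-b, 0, -1)
    while r1 != 0:
        q = r0 // r1
        r0, u0, v0, r1, u1, v1 = (
            r1, u1, v1,
            r0 - q * r1, u0 - q * u1, v0 - q * v1,
        )
    return r0, u0, v0

def solve_diophantine(a: int, b: int, c: int):
    """Return one integer solution (x, y) of a*x + b*y = c, or None if there is none."""
    g, u, v = extended_euclid(a, b)
    if g == 0:
        # Degenerate case a = b = 0: solvable only when c = 0.
        return (0, 0) if c == 0 else None
    if c % g != 0:
        return None           # gcd(a, b) does not divide c
    k = c // g
    return k * u, k * v       # scale the solution of a*u + b*v = g by k

def rsa_decryption_exponent(e: int, phi_n: int) -> int:
    """Return d in the range 0..phi_n-1 with e*d = 1 (mod phi_n)."""
    solution = solve_diophantine(e, phi_n, 1)
    if solution is None:
        raise ValueError("e is not invertible modulo phi(n)")
    d, _u = solution
    return d % phi_n          # pick the representative of d in Z_phi(n)

print(solve_diophantine(240, 46, 10))   # (-45, 235)
print(solve_diophantine(240, 46, 3))    # None, since gcd(240, 46) = 2 does not divide 3
print(rsa_decryption_exponent(7, 120))  # 103, since 7 * 103 = 721 = 6 * 120 + 1
```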