YALE UNIVERSITY
DEPARTMENT OF COMPUTER SCIENCE

	CPSC 461b: Foundations of Cryptography	Notes 12 (rev. 1)
Professor M. J. Fischer		February 19, 2009

Lecture Notes 12

29 Pseudorandom Generators

Definition: An ensemble X = {X_n}_nℕ is pseudorandom if X, U are indistinguishable in polynomial time, where U = {U_n}_nℕ is the uniform ensemble.

Thus, X is pseudorandom if it “looks” the same to all probabilistic polynomial time algorithms.

Definition: A pseudorandom generator is a deterministic polynomial time function G that satisfies two properties:

We remark that if G is a pseudorandom generator, then G(U_n) is not statistically close to U_ℓ(n). To see this, let R_G = {G(x)∣x

{0,1}ⁿ} be the range of G. Clearly, |R_G|≤ 2ⁿ, and for all y ⁄

R_G, Pr[G(U_n) = y] = 0. On the other hand, for the uniform ensemble, Pr[U_ℓ(n) = y] = -1ℓ
2

. Hence, the statistical difference

∑ Δ(ℓ(n)) = 1- |P r[G(Un ) = α] - Pr[Uℓ(n) = α]| 2 α∈{0,1}ℓ(n) ∑ ≥ 1- |P r[G(Un ) = α ]- Pr[Uℓ(n) = α]| 2 α∈RG 1 ∑ 1 = -- |0 - -ℓ| 2 α∈RG 2 ℓ n = 1-⋅ 2--ℓ2- ≥ 1, 2 2 4

We now describe how to build a pseudorandom number generator G with polynomial expansion factor starting from a generator G₁ with expansion factor ℓ(n) = n + 1.

Fix a polynomial p(n). For s

{0,1}ⁿ, write the length-(n + 1) string G₁(s) as σs′, where |σ| = 1 and |s′| = n. On input s, iteratively define the sequences s₀,s₁,s₂,…,s_p(n) and σ₁,σ₂,…,σ_p(n) as follows:

The output of G(s) is the sequence σ₁σ₂…σ_p(n). G(s) is easily computed in polynomial time by a simple iterative program that calls G₁ a total of p(n) times.

Theorem 1 If G₁ is pseudorandom, then so is G.

Proof is by a hybrid argument. We let hybrid H_n^k consist of k uniform random bits followed by the first p(n) - k bits of G(s₀), which we write as G(s_o):[1,p(n) - k]. In symbols,

Suppose D distinguishes G(U_n) from U_p(n) with absolute probability difference δ(n). Then for some k, D distinguishes H_n^k from H_n^k+1 with absolute probability difference ≥ δ(n)∕p(n).

We now describe an algorithm D′ that attempts to distinguish G₁(U_n) from U_n+1. On length-(n + 1) input α, D′ does the following:

If α is uniformly distributed, then τ and α′ are both uniformly distributed, so y = H_n^k+1. On the other hand, if α = G₁(s₀), where s₀ is uniformly distributed, then τ = σ₁ and α′ = s₁, so y = H_n^k. This is because

Hence, D′ distinguishes G₁(U_n) from U_n+1 with absolute probability difference ≥ δ(n)∕p(n).

We omit the remaining details of showing how this leads to a contradiction of the assumption that G is not pseudorandom.

30 Unpredictability

Our formal definition of pseudorandomness is based on the indistinguishability of an entire polynomial-length generated sequence from a uniformly distributed random sequence. However, the traditional notion of a pseudorandom generator is based on repeated experiments. The output bits x₁,x₂,… are assumed to be generated one at a time. The generator is called pseudorandom if each x_i “appears” to result from an independent and uniformly distributed random event such as the flip of a fair coin.

The notion of “appears” is can be captured in terms of unpredictability. We say that x_i+1 is unpredictable if no polynomial time algorithm that attempts to guess it is correct with more than a tiny advantage over chance, even given all of the prior bits x₁,…,x_i.

More formally, a predictor is a p.p.t. algorithm A that is allowed to read the input sequence x a bit at a time in order. After reading bit i, the algorithm can choose to output a guess b and halt, or it can continue. In any case, it must halt and emit a guess after reading the next-to-last bit of x. Let k be the last bit read by A. Then A is correct if k < |x| and b = x_k+1. In addition to the input x, which A is allowed to read only a bit at a time, A is also given an input 1ⁿ, where n = |x|. This way, A can determine the length of x without having to read it all.

Notation: The textbook uses the notation next_A(x) to denote the next bit of x following the last bit that A read. The intent is that the event [A(1^|X_n|,X_n) = next_A(X_n)] should mean that a string x is chosen according to the distribution X_n, A is run on inputs 1ⁿ and x, A reads the first k bits of x for some k and outputs b, and b = x_k+1, the “next” bit of x. That is, the event is that A correctly predicts some bit of a randomly chosen x from distribution X_n.

A better notation would make k explicit. For example, we could pretend that A outputs a pair (k,b) with the meaning that k is the index of the last bit of x that A read, and b is A’s prediction for x_k+1. We could then define next_A(x) = {(k,x_k+1)∣k

[0,n - 1]}. Now, A correctly predicts the next bit if A(1^|x|,x) = (k,b) and (k,b)

next_A(x).

Definition: An ensemble {X_n}_nℕ is called unpredictable in polynomial time if for every p.p.t. A, every positive polynomial p(⋅), and all sufficiently large n,

Theorem 2 An ensemble X is pseudorandom if and only if it is unpredictable in polynomial time.

Proof:
(⇒) The theorem in the forward direction is straightforward. We sketch the general ideas and leave the details to the reader.

If there were a predictor A for X, then a distinguisher D is easily built. Namely, D(x) outputs 1 iff A(1^|x|,x) correctly predicts the next bit. If x comes from X, D(x) will output 1 with probability at least 1
2

, but if x comes from U, then clearly D(x) will output 1 with probability exactly 1
2

. Hence, D successfully distinguishes X from U.
(⇐) The theorem in the reverse direction is proved by another hybrid argument. We sketch a few of the main ideas. Assume X is both unpredictable but not pseudorandom. Then there is a distinguisher D such that

for infinitely many n. We may without loss of generality drop the absolute value brackets and assume that

for infinitely many n. The reasoning is that either Pr[D(X_n) = 1] ≥ Pr[D(U_n) = 1] for infinitely many n, or Pr[D(X_n) = 1] ≤ Pr[D(U_n) = 1] for infinitely many n. If the latter, then Pr[

(X_n) = 1] ≥ Pr[

(U_n) = 1] for the algorithm

that is identical to D except that it complements the output.

We build a next-bit predictor A. Let hybrid H_n^k consist of the first k bits from X_n followed by the last n - k bits from U_n. Then H_nⁿ = X_n and H_n⁰ = U_n. The predictor A(1^|x|,x) guesses a number k

[0,|x|- 1], reads only the first k bits of x, and constructs the string y = x₁,…,x_k,u_k+1,…,u_n, where the u_j’s are uniformly distributed random bits. It then runs D(y). If D(y) = 1, then A predicts bit k + 1 to be u_k+1. Otherwise, A predicts bit k + 1 to be ¬u_k+1 (the complement of u_k+1).

We omit the non-trivial analysis needed to show that algorithm A has a sufficient advantage as a next-bit predictor to contradict the assumption that X is unpredictable. __

31 Pseudorandom Generators and One-Way Functions

We now show that the existence of pseudorandom generators implies the existence of one-way functions.

Theorem 3 Let G be a pseudorandom generator with expansion factor ℓ(n) = 2n. Define the function f(x,y) = G(x) for all |x| = |y|. Then f is a strongly one-way function.

Proof: Suppose f is not strongly one-way. Let A be an inverter for f(U_2n) with success probability at least -1--
p(n)

for infinitely many n. We construct a distinguisher D that distinguishes G(U_n) from U_2n on those same n.

D(α) uses A to attempt to find β such that f(β) = α. If A succeeds, then D outputs 1; otherwise D outputs 0. Since f(U_2n) = G(U_n), then

This is because f(x,y) depends only on x, so the range of f on pairs of length-n inputs has size ≤ 2ⁿ. Since f(A(U_2n)) is in the range of f, the probability that U_2n is in the range, much less actually equal to f(A(U_2n)), is at most 2^-n. Subtracting 2 from 1 gives

Thus, D distinguishes G(U_n) from U_2n for infinitely many n, contradicting the assumption that G is a pseudorandom generator. __