YALE UNIVERSITY

DEPARTMENT OF COMPUTER SCIENCE

CPSC 461b: Foundations of Cryptography | Notes 12 (rev. 1) | |

Professor M. J. Fischer | February 19, 2009 | |

Lecture Notes 12

Definition: An ensemble X = {X_{n}}_{nℕ} is pseudorandom if X, U are indistinguishable in
polynomial time, where U = {U_{n}}_{nℕ} is the uniform ensemble.

Thus, X is pseudorandom if it “looks” the same to all probabilistic polynomial time algorithms.

Definition: A pseudorandom generator is a deterministic polynomial time function G that satisfies two properties:

- G maps strings of length n to strings of length ℓ(n) > n. ℓ(n) is called the expansion factor.
- {G(U
_{n})}_{nℕ}is pseudorandom.

We remark that if G is a pseudorandom generator, then G(U_{n}) is not statistically close to U_{ℓ(n)}. To see
this, let R_{G} = {G(x)∣x {0,1}^{n}} be the range of G. Clearly, |R_{G}|≤ 2^{n}, and for all y ⁄ R_{G},
Pr[G(U_{n}) = y] = 0. On the other hand, for the uniform ensemble, Pr[U_{ℓ(n)} = y] = . Hence, the
statistical difference

We now describe how to build a pseudorandom number generator G with polynomial expansion factor
starting from a generator G_{1} with expansion factor ℓ(n) = n + 1.

Fix a polynomial p(n). For s {0,1}^{n}, write the length-(n + 1) string G_{1}(s) as σs′, where |σ| = 1
and |s′| = n. On input s, iteratively define the sequences s_{0},s_{1},s_{2},…,s_{p(n)} and σ_{1},σ_{2},…,σ_{p(n)} as
follows:

The output of G(s) is the sequence σ_{1}σ_{2}…σ_{p(n)}. G(s) is easily computed in polynomial time by a simple
iterative program that calls G_{1} a total of p(n) times.

Theorem 1 If G_{1} is pseudorandom, then so is G.

Proof is by a hybrid argument. We let hybrid H_{n}^{k} consist of k uniform random bits followed by the
first p(n) - k bits of G(s_{0}), which we write as G(s_{o}):[1,p(n) - k]. In symbols,

Clearly, H_{n}^{0} = G(U_{n}) and H_{n}^{p(n)} = U_{p(n)}.

Suppose D distinguishes G(U_{n}) from U_{p(n)} with absolute probability difference δ(n). Then for some
k, D distinguishes H_{n}^{k} from H_{n}^{k+1} with absolute probability difference ≥ δ(n)∕p(n).

We now describe an algorithm D′ that attempts to distinguish G_{1}(U_{n}) from U_{n+1}. On length-(n + 1)
input α, D′ does the following:

1. | Write α = τ ⋅ α′, where |τ| = 1 and |α′| = n. |

2. | Choose index k uniformly from {0,1,…,p(n) - 1}. |

3. | Choose a uniformly distributed string β of length k. |

4. | Construct y = β ⋅ τ ⋅ G(α′):[1,p(n) - k - 1]. |

5, | Compute and output D(y). |

If α is uniformly distributed, then τ and α′ are both uniformly distributed, so y = H_{n}^{k+1}. On the other
hand, if α = G_{1}(s_{0}), where s_{0} is uniformly distributed, then τ = σ_{1} and α′ = s_{1}, so y = H_{n}^{k}. This is
because

Hence, D′ distinguishes G_{1}(U_{n}) from U_{n+1} with absolute probability difference ≥ δ(n)∕p(n).

We omit the remaining details of showing how this leads to a contradiction of the assumption that G is not pseudorandom.

Our formal definition of pseudorandomness is based on the indistinguishability of an entire
polynomial-length generated sequence from a uniformly distributed random sequence. However, the
traditional notion of a pseudorandom generator is based on repeated experiments. The output bits x_{1},x_{2},…
are assumed to be generated one at a time. The generator is called pseudorandom if each x_{i} “appears” to
result from an independent and uniformly distributed random event such as the flip of a fair
coin.

The notion of “appears” is can be captured in terms of unpredictability. We say that x_{i+1} is
unpredictable if no polynomial time algorithm that attempts to guess it is correct with more than a tiny
advantage over chance, even given all of the prior bits x_{1},…,x_{i}.

More formally, a predictor is a p.p.t. algorithm A that is allowed to read the input sequence x a bit at a
time in order. After reading bit i, the algorithm can choose to output a guess b and halt, or it can continue.
In any case, it must halt and emit a guess after reading the next-to-last bit of x. Let k be the last bit read by
A. Then A is correct if k < |x| and b = x_{k+1}. In addition to the input x, which A is allowed to read only a
bit at a time, A is also given an input 1^{n}, where n = |x|. This way, A can determine the length of x without
having to read it all.

Notation: The textbook uses the notation next_{A}(x) to denote the next bit of x following the last bit that
A read. The intent is that the event [A(1^{|Xn|},X_{n}) = next_{A}(X_{n})] should mean that a string x is chosen
according to the distribution X_{n}, A is run on inputs 1^{n} and x, A reads the first k bits of x for some k and
outputs b, and b = x_{k+1}, the “next” bit of x. That is, the event is that A correctly predicts some bit of a
randomly chosen x from distribution X_{n}.

A better notation would make k explicit. For example, we could pretend that A outputs a pair (k,b)
with the meaning that k is the index of the last bit of x that A read, and b is A’s prediction for x_{k+1}. We
could then define next_{A}(x) = {(k,x_{k+1})∣k [0,n - 1]}. Now, A correctly predicts the next bit if
A(1^{|x|},x) = (k,b) and (k,b) next_{A}(x).

Definition: An ensemble {X_{n}}_{nℕ} is called unpredictable in polynomial time if for every p.p.t. A,
every positive polynomial p(⋅), and all sufficiently large n,

Theorem 2 An ensemble X is pseudorandom if and only if it is unpredictable in polynomial time.

Proof:

(⇒) The theorem in the forward direction is straightforward. We sketch the general ideas and leave
the details to the reader.

If there were a predictor A for X, then a distinguisher D is easily built. Namely, D(x) outputs 1
iff A(1^{|x|},x) correctly predicts the next bit. If x comes from X, D(x) will output 1 with probability
at least + , but if x comes from U, then clearly D(x) will output 1 with probability exactly .
Hence, D successfully distinguishes X from U.

(⇐) The theorem in the reverse direction is proved by another hybrid argument. We sketch a few of
the main ideas. Assume X is both unpredictable but not pseudorandom. Then there is a distinguisher
D such that

for infinitely many n. We may without loss of generality drop the absolute value brackets and assume that

for infinitely many n. The reasoning is that either Pr[D(X_{n}) = 1] ≥ Pr[D(U_{n}) = 1] for
infinitely many n, or Pr[D(X_{n}) = 1] ≤ Pr[D(U_{n}) = 1] for infinitely many n. If the latter,
then Pr[(X_{n}) = 1] ≥ Pr[(U_{n}) = 1] for the algorithm that is identical to D except that it
complements the output.

We build a next-bit predictor A. Let hybrid H_{n}^{k} consist of the first k bits from X_{n} followed
by the last n - k bits from U_{n}. Then H_{n}^{n} = X_{n} and H_{n}^{0} = U_{n}. The predictor A(1^{|x|},x)
guesses a number k [0,|x|- 1], reads only the first k bits of x, and constructs the string
y = x_{1},…,x_{k},u_{k+1},…,u_{n}, where the u_{j}’s are uniformly distributed random bits. It then runs D(y).
If D(y) = 1, then A predicts bit k + 1 to be u_{k+1}. Otherwise, A predicts bit k + 1 to be ¬u_{k+1} (the
complement of u_{k+1}).

We omit the non-trivial analysis needed to show that algorithm A has a sufficient advantage as a next-bit predictor to contradict the assumption that X is unpredictable. __

We now show that the existence of pseudorandom generators implies the existence of one-way functions.

Theorem 3 Let G be a pseudorandom generator with expansion factor ℓ(n) = 2n. Define the function f(x,y) = G(x) for all |x| = |y|. Then f is a strongly one-way function.

Proof: Suppose f is not strongly one-way. Let A be an inverter for f(U_{2n}) with success probability
at least for infinitely many n. We construct a distinguisher D that distinguishes G(U_{n}) from
U_{2n} on those same n.

D(α) uses A to attempt to find β such that f(β) = α. If A succeeds, then D outputs 1; otherwise D
outputs 0. Since f(U_{2n}) = G(U_{n}), then

| (1) |

On the other hand,

| (2) |

This is because f(x,y) depends only on x, so the range of f on pairs of length-n inputs has size ≤ 2^{n}.
Since f(A(U_{2n})) is in the range of f, the probability that U_{2n} is in the range, much less actually equal to
f(A(U_{2n})), is at most 2^{-n}. Subtracting 2 from 1 gives

| (3) |

Thus, D distinguishes G(U_{n}) from U_{2n} for infinitely many n, contradicting the assumption that G is a
pseudorandom generator. __