YALE UNIVERSITY
DEPARTMENT OF COMPUTER SCIENCE

CPSC 461b: Foundations of Cryptography Notes 10 (rev. 1)
Professor M. J. Fischer February 12, 2009



Lecture Notes 10

23 Analyzing the Success Probability

We now complete the proof of Lemma 3 of section 21. Recall again that f is a strongly one-way and length preserving function and that

        df
g(x,r)  =  (f (x ),r)
        df
b(x,r)  =  x ⋅r mod 2.

Assuming that b is not a hard core for g, there is a p.p.t. algorithm G and a polynomial p(n) such that G predicts b with advantage

ε(n ) d=fε (n ) ≥--1-
       G      p(n )
(1)

for all n in an infinite set N. In section 22.2 of lecture 9, we constructed an algorithm A for inverting f. We now show that A has success probability at least -1--
p(n) at inverting f on length-n inputs, for all n ∈ N. This contradicts the assumption that f is strongly one-way and completes the proof of Lemma 3.

Assume for the rest of this discussion that n ∈ N. Let

s(x) = Pr[G (f (x ),Rn ) = b(x,Rn )].

Here Rn is a uniformly distributed random variable over length-n strings, distinct from the identically distributed random variables Un and Xn, which we also mention from time to time. Thus, s(x) is the fine-grained success probability of G for each particular length-n string x. We know that the average of s(x) taken over all length-n strings x is the overall success probability of G, so

∑
--xs(x)-= 1-+ ε(n).
  2n      2
(2)

Define

     {          n        1   ε(n)}
Sn =   x ∈ {0,1} | s(x ) ≥ 2-+--2-  .
(3)

Claim 1 |Sn|≥ ε(n) 2n.

Three different proofs of this claim are given in handout 3: One is algebraic, one is geometric, and one is based on Markov’s inequality. We do not repeat them here but refer the reader to the handout. We only mention that all three are based on the idea that in order for the average value of s(x) to exceed 1
2 + ε(n), there must be a certain number of x for which s(x) 1
2 + ε(n)
-2-. That number turns out to be ε(n) 2n.

Claim 2 x ∈ Sn,i ∈{1,,n},

             J            J    i         1  ℓ            1
P r[|{J | b(x,r )⊕ G(f(x),r  ⊕ e) = xi}| > 2-(2 - 1)] > 1- 2n.

Proof: Let x ∈ Sn and i ∈{1,,n}. Let ζJ be a random variable ranging over {0,1} such that

ζJ = 1   iff   b(x,rJ)⊕ G (f(x),rJ ⊕ ei) = xi
         iff   G (f (x),rJ ⊕ ei) = b(x,rJ ⊕ ei).
Thus, ζJ = 1 whenever G succeeds at computing xi. Note that rJ and rJ ei are uniformly distributed over {0,1}n. Hence, by 1 and 3,
    J              1-  ε(n)   1-  --1--
P r[ζ  = 1] = s(x) ≥ 2 +  2  ≥ 2 + 2p (n ).
(4)

A key observation is that the ζJ’s are pairwise independent. This follows from the fact that the rJ’s are pairwise independent. To see this, let JK. Without loss of generality, we can choose k ∈ K - J. By definition, rK and rJ are both sums of subsets of the independent random variables {s1,,s}. Since k ∈ K -J, the term sk appears in the sum for rK but not for rJ. Therefore, sk is independent of rJ, which implies that rK is independent of rJ.

Let m = 2 - 1. Since = log 2(2np(n)2 + 1), we have that 2 2log 2(2np(n)2+1) = 2np(n)2 + 1, so m 2n p(n)2. We also have

2ℓ ≤ 2log2(2n⋅p(n)2+1)+1 = 2 ⋅(2n⋅p (n )2 + 1)
(5)

We use Chebyshev’s inequality to bound Pr[ JζJ 1
2 m]. This is an upper bound on the probability that the majority value zi that algorithm A computes for xi is wrong. Recall Chebyshev’s inequality

                      Var(X )
P r[|X -  E(X )| ≥ δ] ≤---2---.
                        δ
(6)

Let X = JζJ and δ = 2mp(n). All of the ζJ are identically distributed, so we drop the superscript in the following.

                   1-  --1--
E (ζ) = P r[ζ = 1] ≥ 2 + 2p(n)

by equation 4. Thus,

                  (          )
                    1-  --1--
E(X ) = m ⋅E (ζ) ≥   2 + 2p(n)  ⋅m.
(7)

We also have

V ar(ζ) = E (ζ2)- E (ζ )2

Since ζ is 0-1 valued, it follows that ζ2 = ζ. Hence,

                                         1
V ar(ζ ) = E (ζ)- E (ζ)2 = E (ζ)(1 - E(ζ)) ≤--
                                         4

The bound of 14 simply reflects the fact that the maximum value of the function x(1 - x), which is reached for x = 12. Since the variables ζJ are pairwise independent, they are also uncorrelated, so

               (∑      )   ∑
V ar(X ) = Var      ζJ)  =    V ar(ζJ) ≤ m-.
                 J          J            4
(8)

Plugging the expression for δ into inequality 6 and doing some calculations using inequalities 7 and 8, we get

                        |                    |
   [     1   ]         [|    ( 1     1  )    |    1      ]
Pr  X ≤  2 ⋅m   ≤   Pr  ||X -   2 + 2p(n)  ⋅m || ≥ 2p(n-) ⋅m

                    V-ar(X)-
                ≤   (-m--)2
                     2p(n)
                    m-  (2p(n-))2-  p-(n-)2
                ≤   4 ⋅   m2    =   m
                      p(n)2     1
                ≤   --------2 = --.
                    2n ⋅p(n)    2n
This completes the proof of the claim. __

To finish the proof of the lemma, we observe that A successfully inverts f(x) if all of the following are true:

  1. x ∈ Sn.
  2. The guesses for the σi are all correct, that is, σi = b(x,sI) for all i.
  3. Each zi that A produces is correct.

The first event is true with probability at least ε(n) --1-
p(n) by Claim 1. The second event is true with probability

-1 ≥ ------1------
2ℓ   4n ⋅p(n)2 + 2

by inequality 5. The third event is true with probability at least (1 - 1
2n)n > 1
2 for all n 2. Multiplying these together gives us a lower bound on the success probability of f, namely,

-1--⋅------1------⋅ 1=  -------1--------.
p(n ) 4n ⋅p(n)2 + 2  2   8n ⋅p(n)3 + 4p(n)

Thus, taking q(n) = 8np(n)3 + 4p(n), A has a success probability greater than q1(n) for all sufficiently large n ∈ N, contradicting the assumption that f is strongly one-way. Thus, the assumption that b is not hard-core for f and that G exists must be false. This completes the proof of Lemma 3 of section 21.

24 Hard-Core Functions

We extend the notion of a hard-core predicate of a function to a hard-core function.

Definition: Let h : {0,1}* → {0,1}* be a polynomial-time computable length-regular function.1 Let = |h(1n)|. h is a hard core of f if for all p.p.t. algorithms D, all positive polynomials p(), and all sufficiently large n,

                                                      1
|P r[D ′(f (Xn ),h(Xn )) = 1]- P r[D ′(f(Xn),R ℓ(n)) = 1] <---,
                                                     p(n)

where Xn and R(n) are independent random variables, uniformly distributed over {0,1}n and {0,1}(n), respectively.

Intuitively, h is a hard core for f if the value of h(x) is indistinguishable from a random string, even knowing the value of f(x). On the surface, this looks to be an even stronger condition than unpredictability. Obviously, if one could predict h(x) from f(x), then one could distinguish h(x) from random. Namely, if the given string equals the prediction, output 1, otherwise output 0. On the other hand, it’s not a priori obvious how being able to distinguish h(x) from random would be useful at prediction.

Theorem 1 Let f be strongly one-way. Let c > 0 and let (n) = min{n,clog 2n⌉}. Let x be a string of length n and s a string of length 2n. Define

g2(x,s)  =   (f(x),s),
b (x,s)  =   x⋅(s   ,...,s  ), for i = 1,...,ℓ(n ),
 i              i+1      i+n
h (x,s)  =   b1(x,s) ...bℓ(|x|)(x,s).
Then h is a hard core of g2.

We omit the non-trivial proof of this theorem and remark only that hard core functions with logarithmic lengths are known for RSA and other cryptographic collections, assuming the corresponding collections are one-way. Details are in the textbook.

25 Probability Ensembles

To begin our formal development of pseudorandom sequence generation, we define a probability ensemble, analogous to the previous definition of a collection of one-way functions.

Let I be a countable set. An ensemble indexed by I is a sequence of random variables X = {Xi}i∈I indexed by I.

Typical index sets are the natural numbers or binary strings {0,1}*. Typically, X = {Xn}n∈ has Xn ranging over strings of length poly(n), and X = {Xw}w∈{0,1}* has Xw ranging over strings of length poly(|w|).

26 Polynomial Time Indistinguishability

We give two definitions of polynomial time indistinguishable ensembles, depending on the index set.

Variant 1: Ensembles X = {Xn}n∈ and Y = {Y n}n∈ are indistinguishable in polynomial time if for all probabilistic polynomial time algorithms D, all positive polynomials p(), and all sufficiently large n

           n                  n         -1--
|P r[D(Xn, 1 ) = 1]- Pr[D (Yn,1 ) = 1 ]| < p(n).

Variant 2: Ensembles X = {Xw}w∈{0,1}* and Y = {Y w}w∈{0,1}* are indistinguishable in polynomial time if for all probabilistic polynomial time algorithms D, all positive polynomials p(), and all sufficiently large n

                                          1
|P r[D (Xw, w) = 1]- P r[D (Yw,w ) = 1]| <-----.
                                        p(|w |)

Easy consequences

Let D be a p.p.t. algorithm. Let d(α) be the probability that D(α) = 1. Let dX(n) = E[d(Xn)], dY (n) = E[d(Y n)], be the expected value of D’s output when given a string from X or from Y , respectively. Let

δ(n) = |dX (n)- dY (n)|.

Then X and Y are indistinguishable by D iff δ is negligible in n.

Let wt(α) = # 1’s in α. If X, Y are polynomial-time indistinguishable, then

||   [          n]      [         n]||
|P r wt(Xn ) < -- - Pr  wt(Yn) < --|
               2                 2

regarded as a function of n is negligible. (Why?)

Note: The same applies to any polynomial-time computable string statistic.