YALE UNIVERSITY
DEPARTMENT OF COMPUTER SCIENCE

	CPSC 467a: Cryptography and Computer Security	Notes 23 (rev. 1)
Professor M. J. Fischer		December 3, 2008

Lecture Notes 23

107 Bit-prediction

One important property of the uniform distribution U on bit-strings b₁,…,b_ℓ is that the individual bits are statistically independent from each other. This means that the probability that a particular bit b_i = 1 is unaffected by the values of the other bits in the sequence. Thus, any algorithm that attempts to predict b_i, even knowing other bits of the sequence, will be correct only 1/2 of the time. We now translate this property of unpredictability to pseudorandom sequences.

107.1 Next-bit prediction

One property we would like a pseudorandom sequence to have is that it be difficult to predict the next bit given the bits that came before.

where (b₁,…,b_i) = G_i(S). To explain this notation, S is a uniformly distributed random variable ranging over the possible seeds for G. G(S) is a random variable (i.e., probability distribution) over the output strings of G, and G_i(S) is the corresponding probability distribution on the length-i prefixes of G(S).

Next-bit prediction is closely related to indistinguishability, introduced in section 104. We will show later that if if G(S) has a next-bit predictor for some bit i, then G(S) is distinguishable from the uniform distribution U on the same length strings, and conversely, if G(S) is distinguishable from U, then there is a next-bit predictor for some bit i of G(S). The precise definitions under which this theorem is true are subtle, for one must quantify both the amount of time the judge and next-bit predictor algorithms are permitted to run as well as how much better than chance the judgments or predictions must be in order to be considered a successful judge or next-bit predictor. We defer the mathematics for now and focus instead on the intuitive concepts that underlie this theorem.

107.2 Building a judge from a next-bit predictor

Suppose a PRSG G has an ϵ-next-bit predictor A for some bit i. Here’s how to build a judge J that distinguishes G(S) from U. The judge J, given a sample string drawn from either G(S) or from U, runs algorithm A to guess bit b_i from bits b₁,…,b_i-1. If the guess agrees with the real b_i, then J outputs 1 (meaning that he guesses the sequence came from G(S)). Otherwise, J outputs 0. For sequences drawn from G(S), J will output 1 with the same probability that A successfully predicts bit b_i, which is at least 1∕2 + ϵ. For sequences drawn from U, the judge will output 1 with probability exactly 1/2. Hence, the judge distinguishes G(S) from U with advantage ϵ.

It follows that no cryptographically strong PRSG can have an ϵ-next-bit predictor. In other words, no algorithm that attempts to predict the next bit can have more than a “small” advantage ϵ over chance.

107.3 Previous-bit prediction

Previous-bit prediction, while perhaps less natural, is analogous to next-bit prediction. An ϵ-previous-bit predictor for bit i is an algorithm that, given bits b_i+1,…,b_ℓ, correctly predicts b_i with probability at least 1∕2 + ϵ.

As with next-bit predictors, it is the case that if G(S) has a previous-bit predictor for some bit b_j, then some judge distinguishes G(S) from U. Again, I am being vague with the exact conditions under which this is true. The somewhat surprising fact follows that G(S) has an ϵ-next-bit predictor for some bit i if and only if it has an ϵ′-previous-bit predictor for some bit j (where ϵ and ϵ′ are related but not necessarily equal).

To give some intuition into why such a fact might be true, we look at the special case of ℓ = 2, that is, of 2-bit sequences. The probability distribution G(S) can be described by four probabilities

We describe an algorithm A(v) for predicting b₁ given b₂ = v. A(v) predicts b₁ = 0 if p_0,v > p_1,v, and it predicts b₁ = 1 if p_0,v ≤ p_1,v. In other words, the algorithm chooses the value for b₁ that is most likely given that b₂ = v. Let a(v) be the value predicted by A(v).

Theorem 1 If A is an ϵ-previous-bit predictor for b₁, then A is an ϵ-next-bit predictor for either b₁ or b₂.

Proof: Assume A is an ϵ-previous-bit predictor for b₁. Then A correctly predicts b₁ given b₂ with probability at least 1∕2 + ϵ. We show that A is an ϵ-next-bit predictor for either b₁ or b₂.

Case 1: a(0) = a(1). Then algorithm A does not depend on v, so A itself is also an ϵ-next-bit predictor for b₁.

Case 2: a(0)≠a(1). The probability that A(v) correctly predicts b₁ when b₂ = v is given by the conditional probability

The overall probability that A(b₂) is correct for b₁ is the weighted average of the conditional probabilities for v = 0 and v = 1, weighted by the probability that b₂ = v. Thus,

prob[A(b2) is correct for b1] = ∑ prob[b1 = a(v) | b2 = v]⋅prob[b2 = v ] v∈{0,1} ∑ = pa(v),v v∈{0,1} = pa(0),0 + pa(1),1

Now, since a(0)≠a(1), the function a is one-to-one and onto. A simple case analysis shows that either a(v) = v for v

{0,1}, or a(v) = ¬v for v

{0,1}. That is, a is either the identity or the complement function. In either case, a is its own inverse. Hence, we may also use algorithm A(u) as a predictor for b₂ given b₁ = u. By a similar analysis to that used above, we get

∑ prob[A (b1) is correct for b2] = prob[b2 = a(u) | b1 = u]⋅prob[b1 = u] v∈∑{0,1} = pu,a(u) v∈{0,1} = p + p 0,a(0) 1,a(1)

since either a is the identity function or the complement function. Hence, A(b₁) is correct for b₂ with the same probability that A(b₂) is correct for b₁. Therefore, A is an ϵ-next-bit predictor for b₂.

In both cases, we conclude that A is an ϵ-next-bit predictor for either b₁ or b₂. __

108 Two-Part Secret Splitting

There are many situations in which one wants to grant access to a resource only if a sufficiently large group of agents cooperate. For example, a store safe might require both the manager’s key and the armored car driver’s key in order to be opened. This protects the store against a dishonest manager or armored car driver, and it also prevents an armed robber from coercing the manager into opening the safe. A similar 2-key system is used for safe deposit boxes in banks.

We might like to achieve the same properties for cryptographic keys or other secrets. For example, if k is the secret decryption key for a cryptosystem, one might wish to split k into two shares k₁ and k₂. By themselves, neither k₁ nor k₂ reveals any information about k, but when suitably combined, k can be recovered. A simple way to do this is to choose k₁ uniformly at random and then let k₂ = k ⊕ k₁. Both k₁ and k₂ are uniformly distributed over the key space and hence give no information about k. However, combined with XOR, they reveal k, since k = k₁ ⊕ k₂.

Indeed, the one-time pad cryptosystem of lecture notes 3, section 14, can be viewed as an instance of secret splitting. Here, Alice’s secret is her message m. The two shares are the ciphertext c and the key k. Neither by themselves gives any information about m, but together they reveal m = k ⊕ c.

109 Multi-Part Secret Splitting

Secret splitting generalizes to more than two shares. Imagine a large company that restricts access to important company secrets to only its five top executives, say the president, vice-president, treasurer, CEO, and CIO. They don’t want any executive to be able to access the data alone since they are concerned that an executive might be blackmailed into giving confidential data to a competitor. On the other hand, they also don’t want to require that all five executives get together to access their data, both because this would be cumbersome and also because they worry about the death or incapacitation of any single individual. They decide as a compromise that any three of them should be able to access the secret data, but not one or two of them operating alone.

A (τ,k) threshold secret splitting scheme splits a secret s into shares s₁,…,s_k. Any subset of τ or more shares allows s to be recovered, but no subset of shares of size less than τ gives any information about s.

110 Shamir’s Secret Splitting Scheme

Shamir proposed a threshold scheme based on polynomials. A polynomial of degree d is an expression

where a_d≠0. The numbers a₀,…,a_d are called the coefficients of f. A polynomial can be simultaneously regarded as a function and as an object determined by its vector of coefficients.

Interpolation is the process of finding a polynomial that goes through a given set of points. Fact Let (x₁,y₁),…,(x_k,y_k) be points, where all of the x_i’s are distinct. There is a unique polynomial f(x) of degree at most k - 1 that passes through all k points, that is, for which f(x_i) = y_i (1 ≤ 1 ≤ k).

f can be found using Lagrangian interpolation. This statement generalizes the familiar statement from high school geometry that two points determine a line.

Although this looks at first like a rational function, it is actually just a polynomial in x since the denominator contains only the x-values of the given points and not the variable x. δ_i(x) has the easily-checked property that δ_i(x_i) = 1, and δ_i(x_j) = 0 for j≠i. Then the polynomial

is the desired interpolating polynomial, since p(x_i) = y_i for i = 1,…,k. Note that to actually find the coefficients of p(x) when written in the canonical form of equation 1, it is necessary to expand p(x) by multiplying out the factors and collecting like terms.

Interpolation also works over finite fields, for example, Z_p for prime p. That is, any k points with distinct x coordinates determine a unique polynomial of degree at most k - 1 over Z_p. Of course, we must have k ≤ p since Z_p has only p distinct coordinate values in all.

Here’s how Shamir’s (τ,k) secret splitting scheme works. Let Alice (also called the dealer) have secret s. She constructs a polynomial of degree at most τ - 1 as follows: She sets a₀ = s, and she chooses a₁,…,a_τ-1

Z_p at random. Share s_i is the point (x_i,y_i), where x_i = i and y_i = f(i)(1 ≤ i ≤ k)¹ .

Theorem 2 s can be reconstructed from any set T of τ or more shares.

Theorem 3 Any set T′ of fewer than τ shares gives no information about s.

111 Secret Splitting with Dishonest Parties

Several variations on secret sharing have been studied. I mention two briefly but do not go into details.

111.1 Verifiable secret sharing

A dealer has a secret s which she wishes to share with a number of players. The dealer can of course always lie about the true value of her secret, but, as with bit commitment, the players want assurance that their shares do in fact code a unique secret. That is, whenever sufficiently many shares are assembled to reconstruct the secret, the same secret s is recovered, no matter which shares are used. In Shamir’s (τ,k) threshold scheme, this will be true only if all of the shares lie on a single polynomial of degree at most k - 1. However, if the dealer is dishonest and gives bad shares to some of the players, the resulting shares might not lie on any polynomial of degree k - 1 or smaller. The players have no way to discover this until later when they try to reconstruct s.

In verifiable secret sharing, the sharing phase is an active protocol involving the dealer and all of the players. At the end of this phase, either the dealer is exposed as being dishonest, or all of the players end up with shares that are consistent with a single secret. Needless to say, protocols for verifiable secret sharing are quite complicated.

111.2 Fault tolerance

Even if the dealer is assumed to be honest, there is still the problem of actively dishonest players. With Shamir’s scheme, a share that just disappears does not prevent the secret from being reconstructed, as long as enough valid shares remain. But if a player lies about his share and presents a corrupted share, then that share might be used by the other players in reconstructing an incorrect value for the secret. A fault-tolerant secret sharing scheme should allow the secret to be correctly reconstructed, even in the face of a certain number of corrupted shares.

Of course, it may be desirable to have schemes that can tolerate dishonesty in both dealer and a certain number of players. The interested reader is encouraged to explore the extensive literature on this subject.

112 Bit Commitment Problem

Alice and Bob want to play a game over the internet. Alice says, “I’m thinking of a bit. If you guess my bit correctly, I’ll give you $10. If you guess wrong, you give me $10.” Bob says, “Ok, I guess zero.” Alice replies, “Sorry, you lose. I was thinking of one.”

While this game may seem fair on the surface, there is nothing to prevent Alice from changing her mind after Bob makes his guess. Even if Alice and Bob play the game face to face, they still must do something to commit Alice to her bit before Bob makes his guess. For example, Alice might be required to write her bit down on a piece of paper and seal it in an envelope. After Bob makes his guess, he opens the envelope and knows whether he has won or lost. The act of writing down the bit commits Alice to that bit, even though Bob doesn’t learn its value until later.

The bit commitment problem is to implement an electronic form of sealed envelope called a commitment or blob or cryptographic envelope. Intuitively, a blob has two properties: (1) It is not possible to see the bit inside the blob without opening it. (2) It is not possible to change the bit inside the blob, that is, the blob cannot be opened in two different ways to reveal two different bits.

A blob is produced by a protocol commit(b) between Alice and Bob. We assume that b is initially private to Alice. At the end of the commit protocol, Bob has a blob c containing Alice’s bit b, but he should have no information about b’s value. Later, Alice and Bob can run a protocol open(c) to reveal the bit contained in c.

Alice and Bob do not trust each other, so each wants protection from cheating by the other. Alice wants to be sure that Bob cannot learn b after running commit(b), even if he misbehaves during the protocol. Bob wants to be sure that any successful run of open(c) reveals the same bit b′, so no matter what Alice does. Note that we do not require that Alice tell the truth about her private bit b. A dishonest Alice can always pretend her bit was b′≠b when producing c. But if she does, c can only be opened to b′, not to b.

113 Bit Commitment Using Symmetric Cryptography

A naïve way to use a symmetric cryptosystem for bit commitment is for Alice to commit b by encrypting it with a private key k to get a blob c = E_k(b). She later opens it using the decryption function D_k(c). Unfortunately, Alice can easily cheat if she can find a “colliding triple” (c,k₀,k₁) with the properties that D_k₀(c) = 0 and D_k₀(c) = 1. She just “commits” by sending c to Bob. Later, she can choose whether to open it to 0 or to 1 by sending Bob k₀ or k₁. This isn’t just a hypothetical problem. Suppose Alice uses the most secure cryptosystem of all, a one-time pad (lecture notes 3, section 14), so D_k(c) = c ⊕ k. Then she can easily find a colliding triple by choosing k₀ = c and k₁ = c ⊕ 1.

The protocol of Figure 113.1 tries to make it harder for Alice to cheat by making it possible for Bob to detect most bad keys.

	Alice		Bob


	To commit(b):
1.		.	Choose random string r.
2.	Choose random key k.
	Compute c = E_k(r ⋅ b).		c is commitment.



	To open(c):
3.	Send k.		Let r′⋅ b′ = D_k(c).
			Check r′ = r.
			b′ is revealed bit.

Figure 113.1:

Bit commitment using cryptosystem.

For many cryptosystems (e.g., DES), this protocol does indeed prevent Alice from cheating, for she will have difficulty finding any two keys k₀ and k₁ such that E_k₀(r ⋅ 0) = E_k₁(r ⋅ 1). However, for the one-time pad cryptosystem, she can cheat as before: She just takes c to be random and lets k₀ = c ⊕ (r ⋅ 0) and k₁ = c ⊕ (r ⋅ 1). Then D_{k_b}(c) = r ⋅ b for b

{0,1}, so the revealed bit is 0 or 1 depending on whether Alice sends k₀ or k₁ in step 3.

We see that not all secure cryptosystems have the properties we need in order to make the protocol of Figure 113.1 secure. We need a property analogous to the strong collision-free property for hash functions (lecture notes 14, section 78).