Problem : Compute the position values for 0 through 16 for 1-row misere Nim where the legal moves are to take 1, 4, or 5 stones.
n | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
outcome class | N | P | N | P | N | N | N | N | N | P | N | P | N | N | N | N | N |
Problem : Consider a normal game played with $n$ rows of stones. On each turn, the current player can take any number of stones from any row, or may move any number of stones to the row directly above it (including to a row that has been emptied, not skipping over any row that has not been emptied, and not to a row that never existed).
- Explain why the state with 3 rows of one stone each cannot be
viewed simply as the sum of three subgames.
The rows are not independent games since stones can move between rows.
- Compute the Grundy numbers for all positions in this
game reachable from an initial position with 3 rows of 1 stone each.
Position
(top/middle/bottom)Moves Grundy Numbers of Moves Grundy Number 1/1/1 {2/0/1, 1/2/0, 0/1/1, 1/0/1, 1/1/0} {*0, *2, *0, *2, *0} *1 2/0/1 {2/1/0, 1/0/1, 0/0/1, 2/0/0} {*1, *2, *2, *2} *0 1/2/0 {3/0/0, 2/1/0, 0/2/0, 1/1/0, 1/0/0} {*3, *1, *1, *0, *1} *2 2/1/0 {3/0/0, 1/1/0, 0/1/0, 2/0/0} {*3, *0, *2, *2} *1 0/1/1 {1/0/1, 0/2/0, 0/0/1, 0/1/0} {*2, *1, *1, *2} *0 1/0/1 {1/1/0, 0/0/1, 1/0/0} {*0, *1, *1} *2 1/1/0 {2/0/0, 0/1/0, 1/0/0} {*2, *2, *1} *0 3/0/0 {2/0/0, 1/0/0, 0/0/0} {*2, *1, *0} *3 0/2/0 {2/0/0, 1/1/0, 0/1/0, 0/0/0} {*2, *0, *2, *0} *1 2/0/0 {1/0/0, 0/0/0} {*1, *0} *2 0/0/1 {0/1/0, 0/0/0} {*2, *0} *1 0/1/0 {1/0/0, 0/0/0} {*1, *0} *2 1/0/0 {0/0/0} {*0} *1 0/0/0 {} {} *0 - Compute a winning move in the summed game consisting of
two independent copies of the original game (stones cannot
move between copies), for the position with the 1st game in the inital
position of one with 3 rows with one stone each, and the other game with
just one stone left in the middle position (1/1/1 + 0/1/0), or determine
that no winning move exists. Repeat for (1/1/1 + 0/2/0).
1/1/1 + 0/1/0 is equivalent to *1 + *2. The Nim-sum (exclusive-or) of 1 and 2 is 3. We then need to move from *2 to *1 to make the new Nim-sum *1 + *1. One possible move is in the second game from 0/1/0 to 1/0/0.
1/1/1 + 0/2/0 is equivalent to *1 + *1. The Nim-sum is 0. So there is no winning move.
- How many positions with 5 total stones are reachable from 0/0/6?
The 5 stones can be in any of the three rows, so we can view the positions as multisets where each element in the multiset indicates the row a particular stone is in; for example 3/1/1 would be {1,1,1,2,3}. The number of size-5 multisets of {1,2,3} is ${7 \choose 5} = 21$.
Problem : Prove or disprove: for any position $G \approx *n$ and any $m \lt n$, there is an option $G'$ of $G$ such that $G + G' \approx *m$.
Problem :
- Coino is a game played with $n$ coins. On each turn,
the player chooses how many of the remaining coins to flip.
The player earns one point for each head.
If the flip results in at least as many tails as heads then
loses the flipped coins for the rest of the game.
The game is over when the player is out of coins or reaches $k$
points. The player wins after reaching $k$ points and loses
after running out of coins with $\lt k$ points. Write a program
that computes the strategy for Coino that maximizes the probability
of winning and determines that maximum probability.
def prob(heads, total): return len(list(itertools.combinations([i for i in range(total)], heads))) / 2 ** total def solve_coino(s, g): # e[c][t] will be the probability of winning given that we # have c coins left and a total of t e = [[0.0] * (g + 1) for c in range(s + 1)] opt = [[0] * (g + 1) for c in range(s + 1)] # have already won when t >= g (we clip the columns at g) for c in range(s + 1): e[c][g] = 1.0 # on every turn the number of coins goes down or the total goes up; # compute the rest of e in the opposite of that order # (increasing number of coins, decreasing total) for c in range(1, s + 1): for t in range(g - 1, -1, -1): # compute the value of each choice (the number to flip) values = [0.0] for f in range(1, c + 1): # prob is a function that gives the probability of # flipping exactly h heads out of f flipped coins value = sum(prob(h, f) * e[c - (f if h <= (f - h) else 0)][min(t + h, g)] for h in range(0, f + 1)) values.append(value) # determine which choice has the max value e[c][t] = max(values) opt[c][t] = max(enumerate(values), key=lambda p: p[1])[0] return e, opt
- Suppose we modify Coino
so that you get one do-over: at most once
during the game you may discard the result of a flip and reflip any number of coins
if the initial result was not to your liking.
How would you modify the algorithm to
find the optimal strategy and the expected winning
percentage in this case?
Let the goal be $g$ and the inital number of coins be $s$. Denote by $E(c,t,r)$ the expected winning percentage given that the game has reached the position with $c$ coins left, $t$ total heads, and $r$ reflips available. Then $E(c,t,r) = 1.0$ when $t >= g$, and $E(c,t,r) = 0.0$ when $c = 0$ and $t \lt g$. For $0 \lt c \lt s$ and $0 \le t \lt g$ we have $$E(c,t,0) = \max_{1 \le f \le c} \left(\sum_{0 \le h \le {f \over 2}}^f P(h {\rm \ heads\ out\ of\ } f)*E(c-f,t+h,0) + \sum_{{f \over 2} \lt h \le f} P(h {\rm \ heads\ out\ of\ } f)*E(c,t+h,0) \right)$$ as before, and $$E(c,t,1) = \max_{1 \le f \le c} \left(\sum_{0 \le h \le {f \over 2}}^f P(h {\rm \ heads\ out\ of\ } f)*\max(E(c,t,0), E(c-f,t+h,1)) + \sum_{{f \over 2} \lt h \le f} P(h {\rm \ heads\ out\ of\ } f)*\max(E(c,t,0),E(c,t+h,1)) \right)$$ where the inner maxes represent the option of taking the result of a flip or using the reflip.
$OPT(c,t,r)$ would record the argmax of the outer max for each entry to record the optimal choices of how many coins to flip; $OPT(c,t,r,f,h)$ would store the argmax of the inner max to record, for each position $(c,t,r)$, choice ($f$), and outcome of that choice ($h$), whether the optimal choice was to reflip or not.
Problem : Compute the minimax value of each node in the tree below. Squares represent max nodes and circles represent min nodes.

Problem : Show the operation of alpha-beta pruning on the tree shown below. Show the (alpha, beta) windows passed to each node as the tree is traversed, the values returned from each node, and which branches are pruned.

Node | $(\alpha, \beta)$ | Value |
---|---|---|
A | $(-\infty, \infty)$ | 7 |
B | $(-\infty, \infty)$ | 4 |
C | $(4, \infty)$ | 7 |
D | $(7, \infty)$ | 6 |
E | $(-\infty, \infty)$ | 4 |
F | $(-\infty, 4)$ [prunes 2nd child] | 6 |
G | $(-\infty, 4)$ [prunes 2nd child] | 7 |
H | $(4, \infty)$ | 7 |
I | $(4, 7)$ [prunes 2nd, 3rd children] | 9 |
J | $(7, \infty)$ | 9 |
K | $(7, 9)$ | 6 |
Problem : Consider a constant-sum game with payoff matrix $$\left( \begin{array}{ccc} 0 & 2 & -1 \\ 1 & -2 & 0 \\ -2 & 0 & 2 \end{array} \right)$$.
- Find $v^-$ and $v^+$ for this game.
$v^- = -1$, $v^+ = 1$
- Does this game have an equilibrium in pure strategies?
No, because $v^- \ne v^+$.
- Compute the value of $E(X,Y)$ for $X=(0 {1 \over 2} {1 \over 2})$ and $Y=({2 \over 3} 0 {1 \over 3})$.
$E(X, Y) = \frac{1}{2} \cdot \frac{2}{3} \cdot 1 + \frac{1}{2} \cdot \frac{1}{3} \cdot 0 + \frac{1}{2} \cdot \frac{2}{3} \cdot -2 + \frac{1}{2} \cdot \frac{1}{3} \cdot 2 = 0$
- Using Theorem 1.3.8c, verify that $X^*=({2 \over 5} {2 \over 5} {1 \over 5})$,$Y^*=({2 \over 5} {1 \over 5} {2 \over 5})$
is a saddle point for this game.
$E(X^*, Y^*) = 0$. \begin{array}{rcccccl} E(X^*, 1) & = & \frac{2}{5} \cdot 0 + \frac{2}{5} \cdot 1 + \frac{1}{5} \cdot -2 & = & 0 & \ge & 0 \\ E(X^*, 2) & = & \frac{2}{5} \cdot 2 + \frac{2}{5} \cdot -2 + \frac{1}{5} \cdot 0 & = & 0 & \ge & 0 \\ E(X^*, 3) & = & \frac{2}{5} \cdot -1 + \frac{2}{5} \cdot 0 + \frac{1}{5} \cdot 2 & = & 0 & \ge & 0 \\ E(1, Y^*) & = & \frac{2}{5} \cdot 0 + \frac{1}{5} \cdot 2 + \frac{2}{5} \cdot -1 & = & 0 & \le & 0 \\ E(2, Y^*) & = & \frac{2}{5} \cdot 1 + \frac{1}{5} \cdot -2 + \frac{2}{5} \cdot 0 & = & 0 & \le & 0 \\ E(3, Y^*) & = & \frac{2}{5} \cdot -2 + \frac{1}{5} \cdot 0 + \frac{2}{5} \cdot 2 & = & 0 & \le & 0 \\ \end{array} All inequalities are true, so $X^*, Y^*$ is indeed a saddle point.
- Find player I's best response to $Y=({1 \over 3} {1 \over 3} {1 \over 3})$.
\begin{array}{rcccl} E(1, Y) & = & \frac{1}{3} \cdot 0 + \frac{1}{3} \cdot 2 + \frac{1}{3} \cdot -1 & = & \frac{1}{3} \\ E(2, Y) & = & \frac{1}{3} \cdot 1 + \frac{1}{3} \cdot -2 + \frac{1}{3} \cdot 0 & = & -\frac{1}{3} \\ E(3, Y) & = & \frac{1}{3} \cdot -2 + \frac{1}{3} \cdot 0 + \frac{1}{3} \cdot 2 & = & 0 \\ \end{array} The maximum value is $E(1, Y)$, so the best response is $(1\ 0\ 0)$.
Problem : Find a saddle point and its value for the constant-sum game with payoff matrix $$ \left( \begin{array}{cc} 3 & 2 \\ 0 & 4 \\ \end{array} \right) $$
For $Y$ the inequalities are \begin{array}{rcccl} E(1, Y) &=& y + 2 &\le& v \\ E(2, Y) &=& 4 - 4y &\le& v \\ \end{array} The minimum value of $v$ above both lines is at their intersection, which is where $y + 2 = 4 - 4y$, or $y = \frac{2}{5}$, so $Y^* = (\frac{2}{5} \frac{3}{5})$.
$E(X^*, Y^*) = \frac{12}{5}$.
Problem : Set up the linear program to find a saddle point for the constant-sum game with payoff matrix $$ \left( \begin{array}{ccc} 4 & 7 & 3 \\ 0 & 5 & 4 \\ 10 & 6 & 2 \\ \end{array} \right) $$
Minimize $p_1 + p_2 + p_3$ subject to \begin{array}{rcl} -5p_1 - p_2 - 11p_3 & \le & -1 \\ -8p_1 - 6p_2 - 7p_3 & \le & -1 \\ -4p_1 - 5p_2 - 3p_3 & \le & -1 \\ p_1, p_2, p_3 & \ge & 0 \\ p_1, p_2, p_3 & \le & 1 \\ \end{array}
Minimize $-q_1 - q_2 - q_3$ subject to \begin{array}{rcl} 5q_1 + 8q_2 + 4q_3 & \le & 1 \\ q_1 + 6q_2 + 5q_3 & \le & 1 \\ 11q_1 + 7q_2 + 3q_3 & \le & 1 \\ q_1, q_2, q_3 & \ge & 0 \\ q_1, q_2, q_3 & \le & 1 \\ \end{array}