Assignment 2 - NFL Strategy Two-Minute Drill
Objectives
- to apply backwards indiction
- to use linear programming to find an equilibrium for a two-player simultaneous game
Introduction
NFL Strategy is a football (North American gridiron football) simulation game from the 1970s. The player on offense chooses one of 34 plays; the player on defense simultaneously chooses one of 12 defensive plays. The outcome of the play is determined by the combination of plays selected and an element of chance.
We seek strategies for a simulation of a situation in which the team currently on offense (the Patriots) must score a touchdown to win the game against the team currently on defense (the Eagles): there are 2 minutes (120 seconds) remaining in the game and the Patriots must gain 80 yards in order to score a touchdown and win the game. We make the simplifying assumptions that neither teams has any time-outs left, that any turnover, safety, or field goal leads to a win for the Eagles (we assume the Patriots will not get the ball back after such situations), that a touchdown wins the game for the Patriots no matter how much time is remaining (the Eagles will not be able to regain the lead against the Patriots defense), and that there will be no penalties. (An offense in such a situation is said to be executing a "two-minute drill".)
To simulate the quick decision making that happens during such a hurry-up situation, we limit the offense and defense to three plays each: a run, short pass, and long pass for the offense, and a run, balanced, and long pass defense for the defense.
Assignment
Write a program called "TwoMinute" that determines strategies for any position in the two-minute drill. The position will be given as four integer command-line arguments in the following order after the first argument, which specifies the mode:- 1, 2, 3, or 4 for the current down,
- a positive integer for the yards required for a first down,
- an integer between 1 and 99 for the yards required to score a touchdown, and
- a multiple of 5 between 5 and 120 for the time remaining in the game, in seconds.
Your program can be run in three modes determined by the first
command-line argument. The first command-line argument will be either
-random_defense
, -matrix
, or -equilibrium
.
- In
-random_defense
mode, your program computes the optimal strategy for the given position for an offense playing against a defense that randomly and uniformly selects one of the defensive plays. The output should be two lines: the first containing the index of the optimal play for the offense to call (0, 1, or 2), and the second containing the corresponding expected winning percentage for the offense to six decimal places. - In
-matrix
mode, your program computes the payoff matrix for a particular position. The output should be the payoff matrix for the position on the command-line, with each entry given to six decimal places. Each row should be output as a comma-separated list of values enclosed in square brackets, with each row on a separate line and output in order. - In
-equilibrium
mode, your program computes the equilibrium strategy. The output should be three lines: the equilibrium mixed strategy for the offensive player, given as a comma-separated list of probabilities given to six decimal places and enclosed in square brackets, the equilibrium mixed strategy for the defense in the same format as for the offense, and the value (probability of the offense winning) of the input position to six decimal places.
The outcomes of all combinations of the three plays are given in the module
plays.py
. That module defines a 2-D array
(list of lists) called plays
where plays[i][j]
gives the
possible outcomes for offensive play i
when run against defensive play
j
. The possible outcomes are given as a list of five triples,
where each triple gives the number of yards gained (which may be negative),
the elapsed time, given in five-second intervals (so 2 for 10 seconds), and
whether the play resulted in a turnover (thus winning the game for the defense).
The probability of each of the five possible outcomes is given in the
list prob
, where prob[k]
gives the probability of outcome
plays[i][j][k]
occurring when play i
is run against
play j
.
We view each play as a simultaneous constant-sum game in which the offensive player seeks to maximize their chance of winning and the defensive player wants to minimize that chance. Each entry in the payoff matrix is constructed by taking the average, weighted by probability, of the values (chance of the offense winning) of the positions that can result from the corresponding combination of plays. Since the payoff matrix for any position depends on the payoff matrix for positions closer to the end of the game, we can solve this game with backward induction (dynamic programming).
To help reduce the computations required, we provide a file containing the values
achieved by the equilibrium strategies
for each position in the two-minute drill in the CSV file
two_minute.csv
in /c/cs474/hw3
on the Zoo.
Each line in the CSV file
gives a position as the number of yards needed to score, the downs left (counting
down, with 4 being the start of a set of downs; this is not the same as the input
argument, which gives the down using the traditional football terminology,
where 1st down is the start of a set of downs), the yards needed for a new set
of downs, the time remaining in 5-second increments (so 24 for two minutes), and
the probability that the player on offense wins from that position if both players
follow equilibrium strategies. This data is also given as a pickled Python 3
dictionary in two_minute.pickle
; the keys
in that dictionary are tuples made from the first four values in the CSV
file and the value is the value in the CSV file. These files are large, so you
should create symbolic links to them. You may assume that those links have
been created in the directory from which your submissions will be run.
Use the following rules for special cases of play outcomes:
- the offense scores a touchdown by reducing the number of yards needed to score to zero or less;
- if the offense needs 100 or more yards to score as the result of any play, then the defense wins (in this situation, the Eagles earn two points for a safety and get the ball back); and
- the outcome of any play that starts with positive time left stands (the game is not over until the conclusion of the play during which the time went to zero).
The input position may not be reachable from the starting position described above. For example, with the given plays and resulting outcomes, it is impossible to gain 79 yards in five seconds to end up with 1st-and-goal from the 1 yard line with 1:55 remaining in the game. Treat such positions like any reachable position.
Examples
[jrg94@hippo TwoMinute]$ ./TwoMinute -random_defense 3 1 71 90 2 0.550023 [jrg94@hippo TwoMinute]$ ./TwoMinute -random_defense 4 1 71 85 0 0.504128 [jrg94@hippo TwoMinute]$ ./TwoMinute -random_defense 1 10 20 30 2 0.693844 [jrg94@hippo TwoMinute]$ ./TwoMinute -random_defense 3 1 1 10 0 0.933333 [jrg94@hippo TwoMinute]$ ./TwoMinute -random_defense 4 1 1 5 0 0.933333 [jrg94@hippo TwoMinute]$ ./TwoMinute -matrix 1 10 80 120 [0.348756, 0.327767, 0.361143] [0.283918, 0.329217, 0.395910] [0.430980, 0.356945, 0.340160] [jrg94@hippo TwoMinute]$ ./TwoMinute -equilibrium 1 10 80 120 [0.000000, 0.201071, 0.798929] [0.000000, 0.667845, 0.332155] 0.351369 [jrg94@hippo TwoMinute]$ ./TwoMinute -equilibrium 3 1 71 90 [0.276881, 0.000000, 0.723119] [0.000000, 0.729935, 0.270065] 0.322379 [jrg94@hippo TwoMinute]$ ./TwoMinute -equilibrium 4 1 71 85 [0.810775, 0.000000, 0.189225] [0.940228, 0.059772, 0.000000] 0.270639 [jrg94@hippo TwoMinute]$ ./TwoMinute -equilibrium 1 10 20 30 [0.000000, 0.158501, 0.841499] [0.000000, 0.703189, 0.296811] 0.584721 [jrg94@hippo TwoMinute]$ ./TwoMinute -equilibrium 3 1 1 10 [0.393939, 0.000000, 0.606061] [0.848485, 0.000000, 0.151515] 0.830303 [jrg94@hippo TwoMinute]$ ./TwoMinute -equilibrium 4 1 1 5 [1.000000, 0.000000, 0.000000] [1.000000, 0.000000, 0.000000] 0.800000