

Exam #2 Practice

Problem 0: Review old practice problems, assignments (especially MCTS), and exams. For reference, the averages on the problems from the first exam were:

Problem:  1    2    3    4    5    6    7    8    9    10
Average:  9.8  9.0  8.9  7.4  8.4  9.9  8.8  6.1  9.4  8.3

Problem 1: Consider the multi-armed bandit problem with 3 arms with payoffs determined by the outcomes of rolling a fair 6-, 8-, or 10-sided die. Show that the greedy strategy does not achieve zero average regret for this setup.
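
To build intuition, here is a minimal simulation sketch (the function name, horizon, and constants are invented for illustration): greedy pulls each arm once and then always pulls the arm with the highest empirical mean. With positive probability the initial rolls mislead it into committing to a suboptimal arm forever, so its average regret stays bounded away from zero.

    import random

    def greedy_bandit(horizon=10000, seed=None):
        """Pure greedy on 3 arms paying off as fair 6-, 8-, and
        10-sided dice (true means 3.5, 4.5, 5.5)."""
        rng = random.Random(seed)
        sides = [6, 8, 10]
        totals = [0.0] * 3
        counts = [0] * 3
        # Pull each arm once to initialize the empirical means.
        for arm in range(3):
            totals[arm] += rng.randint(1, sides[arm])
            counts[arm] += 1
        reward = sum(totals)
        for _ in range(horizon - 3):
            # Always exploit the best-looking arm; never explore.
            arm = max(range(3), key=lambda a: totals[a] / counts[a])
            payoff = rng.randint(1, sides[arm])
            totals[arm] += payoff
            counts[arm] += 1
            reward += payoff
        # Average regret relative to always pulling the 10-sided die.
        return 5.5 - reward / horizon

    # Averaged over many runs this stays well above zero: if the first
    # roll of a worse die beats the first roll of the 10-sided die,
    # greedy may never try the 10-sided die again.
    print(sum(greedy_bandit(seed=s) for s in range(1000)) / 1000)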

Problem 2: Describe modifications that would be necessary to standard MCTS for 2-player Yahtzee.
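
Two of the needed changes show up in the selection step. A sketch, using a hypothetical node class rather than any course interface: dice rolls become chance nodes whose outcomes are sampled rather than chosen by UCB, and at decision nodes each child is valued from the perspective of the player about to move.

    import math
    import random
    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class Node:
        """Minimal tree node (hypothetical, for illustration only)."""
        player: int = 0              # 0 or 1; whose decision this is
        is_chance: bool = False      # True for dice-roll nodes
        children: List["Node"] = field(default_factory=list)
        outcome_probs: List[float] = field(default_factory=list)
        visits: int = 1
        value: float = 0.0           # total reward, player 0's view

    def select_child(node):
        # At chance nodes neither player chooses: sample a child
        # according to the probabilities of the die outcomes.
        if node.is_chance:
            return random.choices(node.children, weights=node.outcome_probs)[0]
        # At decision nodes apply UCB from the perspective of the
        # player to move: player 1 maximizes the negated value.
        log_n = math.log(node.visits)
        sign = 1 if node.player == 0 else -1
        def ucb(child):
            return (sign * child.value / child.visits
                    + 2 * math.sqrt(log_n / child.visits))
        return max(node.children, key=ucb)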

Problem 3: Pick two enhancements to standard MCTS and describe how they might be applied to NFL Strategy, Kalah, or Yahtzee.
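
One commonly discussed enhancement is replacing uniformly random playouts with heuristic-guided (epsilon-greedy) playouts. A sketch, assuming a duck-typed state interface (is_terminal, legal_moves, successor, payoff) and a caller-supplied heuristic; none of these names come from a specific course API.

    import random

    def heuristic_playout(state, heuristic, epsilon=0.25):
        """Epsilon-greedy playout: with probability 1 - epsilon play
        the move the heuristic ranks best, otherwise a uniformly
        random move. For NFL Strategy the heuristic might prefer
        plays with the best expected yardage; for Kalah, moves that
        capture or earn an extra turn; for Yahtzee, rerolls that
        keep the most promising dice."""
        while not state.is_terminal():
            moves = state.legal_moves()
            if random.random() < epsilon:
                move = random.choice(moves)
            else:
                move = max(moves, key=lambda m: heuristic(state, m))
            state = state.successor(move)
        return state.payoff()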

Problem 4: Consider using coevolution to optimize a set of numeric parameters in a heuristic for chess. Describe some ways to get useful information from playing two members of the population against each other multiple times, given that the resulting players are ordinarily deterministic, so each game would otherwise follow the same sequence of moves.
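
One standard trick is to make move selection stochastic without changing the parameters being evolved, for example by sampling moves from a softmax over the heuristic's scores (other options include randomized opening positions, small parameter noise, or a forced random move early in the game). A self-contained sketch; the function name and example moves are illustrative.

    import math
    import random

    def softmax_choice(scored_moves, temperature=0.5, rng=random):
        """Turn a deterministic evaluator into a stochastic player.

        scored_moves: list of (move, heuristic_score) pairs. Sampling
        in proportion to exp(score / temperature) makes repeated games
        between the same two parameter vectors follow different lines,
        so multiple games yield a meaningful win rate instead of the
        same game replayed."""
        scores = [s for _, s in scored_moves]
        m = max(scores)  # subtract the max for numerical stability
        weights = [math.exp((s - m) / temperature) for s in scores]
        moves = [mv for mv, _ in scored_moves]
        return rng.choices(moves, weights=weights)[0]

    # The same scored move list can yield different choices across
    # games, unlike pure argmax selection.
    print(softmax_choice([("e4", 0.6), ("d4", 0.55), ("Nf3", 0.5)]))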

Problem 5: Describe how you would set up the inputs to a neural network that chooses plays for NFL Strategy. Describe how you would use the outputs to choose a play.
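
A sketch of one plausible encoding, assuming state variables like field position, down, distance, time, and score margin (the exact variables and scaling constants depend on the game model, and are assumptions here): scale numeric inputs to comparable ranges and one-hot encode the categorical ones; on the output side, use one unit per play and take the argmax, or sample from a softmax if some exploration is wanted.

    def encode_state(field_pos, down, yards_to_first, time_left, score_diff):
        """Encode a position as a feature vector. Numeric inputs are
        scaled to roughly [0, 1] so no single input dominates early
        training; the down is one-hot encoded because it is
        categorical rather than smoothly ordinal."""
        features = [
            field_pos / 100.0,         # yards from own goal line
            yards_to_first / 10.0,     # yards needed for a first down
            time_left / 1800.0,        # time remaining, scaled
            (score_diff + 30) / 60.0,  # score margin, shifted/scaled
        ]
        features += [1.0 if down == d else 0.0 for d in (1, 2, 3, 4)]
        return features

    def choose_play(outputs):
        """With one output unit per play, choose the play whose unit
        is largest; sampling from a softmax over the outputs would
        instead trade off some exploration."""
        return max(range(len(outputs)), key=lambda i: outputs[i])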

Problem 6: How did DeepMind address the issue of overfitting when training the value network used in AlphaGo?

