PS7 GRADING RUBRIC

Question 1:
- 1 point for having the general notion of experience replay and mentioning how DQN uses it. Only 1/2 point if they don't mention how it applies to DQN.
- 1/2 point for an advantage of experience replay that makes sense.
- 1/2 point for a disadvantage of experience replay that makes sense.

Question 2:
- 1 point for saying it's a t-SNE plot used to visualize high-dimensional states.
- 1 point for mentioning its significance in some way or trying to explain a certain detail in the plot.

Question 3:
- 1/2 point for correctly stating the motivation of epsilon-greedy (exploring the action space).
- 1/2 point for saying why epsilon decreases over time (in late training stages, the agent has explored the action space enough to know it).
- 1 point for an alternative scheme that makes sense and/or sounds plausible. The scheme has to be algorithmically different from random epsilon-greedy. If they, for instance, just describe another way to implement random epsilon-greedy, that is wrong.

Question 4:
Because there were many options, the papers were fairly difficult, and there are many things to focus on, this question is graded on "effort." Most people should get 5/5 here if they made the effort. Here we only subtract points:
- Subtract 2 points if they included citations inside the response and their "real" word count is below the required count.
- Subtract 1 point if they have excessively repetitive statements that look like they just repeated a single concept to pad their response to fit the word limit.
- Subtract 5 points if the response is completely incoherent or makes no sense.
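
For graders' reference, the concepts named in Questions 1 and 3 can be sketched roughly as below. This is only an illustrative sketch, not the problem set's solution or any particular DQN implementation: the uniform-sampling buffer, the linear epsilon schedule, the constants, and every name in the code are assumptions. Boltzmann (softmax) exploration is shown as one example of a scheme that is algorithmically different from random epsilon-greedy; other answers may be equally plausible.

```python
import math
import random
from collections import deque


class ReplayBuffer:
    """Hypothetical fixed-size store of transitions, sampled uniformly."""

    def __init__(self, capacity):
        # A bounded deque evicts the oldest transition once capacity is reached.
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling mixes old and new transitions, which reduces
        # the correlation between consecutive training samples.
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def linear_epsilon(step, eps_start=1.0, eps_end=0.1, decay_steps=1000):
    """Assumed schedule: anneal epsilon linearly, then hold it at eps_end."""
    frac = min(step / decay_steps, 1.0)
    return eps_start + frac * (eps_end - eps_start)


def epsilon_greedy(q_values, epsilon):
    """With probability epsilon pick a uniformly random action, else the greedy one."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def boltzmann_probs(q_values, temperature=1.0):
    """Alternative scheme: action probabilities proportional to exp(Q / temperature)."""
    m = max(q_values)  # subtract the max for numerical stability
    exps = [math.exp((q - m) / temperature) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]
```

The difference a grader would look for: epsilon-greedy explores uniformly regardless of the Q-values, whereas the Boltzmann scheme weights exploration by how promising each action currently looks.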