Homework 7: Tournament

Due date: November 12, 11:59 p.m.

Questions: Post to Piazza.

Format: Jupyter notebook / written report.

In this homework, you will train an agent to compete in a volleyball tournament. This is the second step of this assignment.

In step one (assignment 6), you trained a neural network to beat the random agent.
The ultimate goal of this exercise is for you to compete against other students in the class with your best agent. The agent you get from step one is only known to be stronger than the random agent, so in step two you will use self-play to strengthen it. The detailed instructions are as follows:
1. Initialize a neuroevolution algorithm, e.g., CMA-ES, with the parameters from the trained agent.
2. During the evolution process, instead of having the entire population play against the random agent as in step one, split the population randomly into N / 2 pairs, where N is the size of the population, and have the two individuals in each pair play against each other. Repeat this random pairing M times so that each individual accumulates M scores, and use the average of those scores as that individual's fitness for evolution.
3. The last code cell in the notebook is the starter code for the two steps above.
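The pairing and averaging in step 2 can be sketched as follows. This is only an illustration and not the notebook's starter code: `self_play_fitness`, `play_match`, and `toy_match` are hypothetical names, and `toy_match` is a stand-in for a real game between two agents. The sketch assumes N is even and that a match returns one score per player.

```python
import random


def self_play_fitness(population, play_match, num_rounds, rng):
    """Average score of each individual over `num_rounds` rounds of random pairing.

    `play_match(a, b)` is a hypothetical callable that plays one game between
    two parameter vectors and returns (score_a, score_b).
    """
    n = len(population)
    if n % 2 != 0:
        raise ValueError("population size must be even to form N / 2 pairs")
    if num_rounds < 1:
        raise ValueError("num_rounds must be at least 1")

    totals = [0.0] * n
    for _ in range(num_rounds):
        # Shuffle the indices, then pair neighbours: (0, 1), (2, 3), ...
        order = list(range(n))
        rng.shuffle(order)
        for k in range(0, n, 2):
            i, j = order[k], order[k + 1]
            score_i, score_j = play_match(population[i], population[j])
            totals[i] += score_i
            totals[j] += score_j

    # Each individual played exactly once per round, so it has num_rounds scores.
    return [total / num_rounds for total in totals]


def toy_match(a, b):
    """Stand-in for a real game: the larger parameter sum wins (+1 / -1)."""
    sum_a, sum_b = sum(a), sum(b)
    if sum_a == sum_b:
        return 0.0, 0.0
    return (1.0, -1.0) if sum_a > sum_b else (-1.0, 1.0)


# Example: N = 8 individuals with 4 parameters each, M = 5 rounds.
rng = random.Random(0)
population = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(8)]
fitness = self_play_fitness(population, toy_match, num_rounds=5, rng=rng)
print(fitness)
```

In the actual assignment, `play_match` would run a volleyball game between the two agents, and the returned averages would be passed to the evolution algorithm as fitness values. If your CMA-ES implementation minimizes its objective, negate the averages before passing them in.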
Once you have finished self-play training, you can play against the pre-trained agent in cell 5 to get a feel for how strong your agent is. You can use the same code for the in-class tournaments.

You can download the notebook here: Notebook ipynb file