Part 1: Behavioral Strategy

Format: Jupyter notebook / written report.

In this homework, you will train an agent to participate in a volleyball tournament. The assignment will walk you through each step. Here is the general structure of this homework:

Each agent is evolved against a single opponent.
You will play against a random opponent. From the past homeworks, you should have learned (1) how to build an MLP in NumPy or JAX and (2) how to evolve the MLP with evolutionary algorithms such as CMA-ES. Can you train a simple MLP to beat a randomly controlled agent?
AlphaGo learned to master Go through self-play. The same is possible in this volleyball game. Can you let two agents play against each other? Stay tuned...
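
As a starting point for the random-opponent step, here is a minimal NumPy sketch of the two ingredients it relies on: an MLP policy stored as one flat parameter vector, and a loop that evolves that vector. Several things here are assumptions and not part of the assignment: the layer sizes (12 inputs, 8 hidden units, 3 outputs) are placeholders, `toy_fitness` is a hypothetical stand-in for "play a match against the random opponent and return the score", and the search loop is a simple elite-averaging evolution strategy used in place of CMA-ES to keep the example dependency-free.

```python
import numpy as np


def layer_shapes(n_in, n_hidden, n_out):
    """Shapes of the weights and biases of a one-hidden-layer MLP."""
    return [(n_in, n_hidden), (n_hidden,), (n_hidden, n_out), (n_out,)]


def num_params(shapes):
    """Total number of scalars needed to fill all weight and bias arrays."""
    return sum(int(np.prod(s)) for s in shapes)


def mlp_forward(flat_params, obs, shapes):
    """Unpack a flat parameter vector and map observations to actions in [-1, 1]."""
    arrays, start = [], 0
    for shape in shapes:
        size = int(np.prod(shape))
        arrays.append(flat_params[start:start + size].reshape(shape))
        start += size
    w1, b1, w2, b2 = arrays
    hidden = np.tanh(obs @ w1 + b1)
    return np.tanh(hidden @ w2 + b2)


# ASSUMPTION: fixed fake observations and targets, used only so that the
# sketch runs without the volleyball environment.
_OBS = np.random.default_rng(0).standard_normal((64, 12))
_TARGET = np.sign(_OBS[:, :3])


def toy_fitness(flat_params, shapes):
    """Hypothetical fitness (higher is better). Replace with the match score."""
    out = mlp_forward(flat_params, _OBS, shapes)
    return -float(np.mean((out - _TARGET) ** 2))


def evolve(fitness_fn, n_params, generations=50, pop_size=32,
           n_elite=8, sigma=0.1, seed=0):
    """Simple evolution strategy (not CMA-ES): sample around a mean,
    then move the mean to the average of the best candidates."""
    rng = np.random.default_rng(seed)
    mean = np.zeros(n_params)
    best_params, best_fit = mean.copy(), -np.inf
    for _ in range(generations):
        population = mean + sigma * rng.standard_normal((pop_size, n_params))
        fits = np.array([fitness_fn(p) for p in population])
        elite = population[np.argsort(fits)[-n_elite:]]  # highest fitness last
        mean = elite.mean(axis=0)
        if fits.max() > best_fit:
            best_fit = float(fits.max())
            best_params = population[fits.argmax()].copy()
    return best_params, best_fit


shapes = layer_shapes(12, 8, 3)
best_params, best_fit = evolve(lambda p: toy_fitness(p, shapes),
                               num_params(shapes))
print("best toy fitness:", best_fit)
```

To use this for the homework, the fitness function would instead run one or more games with `mlp_forward` choosing the agent's actions and return the resulting score. Keeping the parameters in a single flat vector is a convenient choice because most evolutionary optimizers, CMA-ES included, operate on plain vectors.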