Ad Hoc Teamwork: Pursuit
The pursuit domain involves a number of predators attempting to capture the prey. The version depicted here uses 4 predators and 1 prey moving around a grid world. The grid world is a torus, so moving off one edge brings the agent back on the other side. The green prey moves randomly, and the red predators attempt to surround the prey on all sides. The ad hoc agent playing as a predator is denoted with a yellow star. We compare the performance of PLASTIC-Model to the baseline of matching the teammates' behaviors.
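The toroidal movement and the capture condition described above can be sketched in a few lines of Python. This is only an illustration: the grid size, the action set (including a "stay" action), and all function names are assumptions, not details taken from the original experiments.

```python
# Illustrative sketch only: grid size, action set, and names are assumptions.
GRID_SIZE = 20  # assumed width and height of the square torus

# Assumed action set: stay in place or move one cell in a cardinal direction.
ACTIONS = {
    "stay": (0, 0),
    "up": (0, -1),
    "down": (0, 1),
    "left": (-1, 0),
    "right": (1, 0),
}


def move(pos, action, size=GRID_SIZE):
    """Apply an action on the torus: leaving one edge re-enters on the opposite edge."""
    dx, dy = ACTIONS[action]
    return ((pos[0] + dx) % size, (pos[1] + dy) % size)


def neighbors(pos, size=GRID_SIZE):
    """Return the four cells adjacent to pos, with wrap-around."""
    return [move(pos, a, size) for a in ("up", "down", "left", "right")]


def is_captured(prey, predators, size=GRID_SIZE):
    """The prey is captured when every neighboring cell holds a predator."""
    return set(neighbors(prey, size)) <= set(predators)
```

For example, on a 5x5 torus a prey at (0, 0) is surrounded by predators at (4, 0), (1, 0), (0, 4), and (0, 1), because two of its neighbors lie across the wrapped edges.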
Hand-Coded (HC) Teammates
This setting considers a set of 4 hand-coded teammate types that represent a variety of possible behaviors. GR (greedy) agents move deterministically towards the nearest open cell neighboring the prey. TA (teammate-aware) agents give precedence to agents farther from the prey, assigning the farthest predator to the cell neighboring the prey that is closest to it. GP (greedy probabilistic) agents move towards the nearest open cell neighboring the prey, but randomly select the path to that cell, preferring shorter paths. PD (probabilistic destinations) agents randomly select a distance to keep from the prey and then randomly select a cell at that distance. Over time, the distance decreases and the PD agents surround the prey. We compare the performance of matching the teammates' behavior (Match) with planning using PLASTIC-Model given the 4 hand-coded models as expert knowledge.
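As a rough illustration of the GR behavior, the sketch below picks the nearest open cell next to the prey and steps towards it. The distance metric (Manhattan distance on the torus), the tie-breaking order, the rule of staying put when blocked, and the name `greedy_move` are all assumptions made for this example; the original hand-coded agents may differ in these details.

```python
# Hypothetical sketch of a greedy (GR-style) predator; details are assumptions.
SIZE = 20  # assumed grid width and height

# Stay, up, down, left, right (fixed order makes tie-breaking deterministic).
MOVES = [(0, 0), (0, -1), (0, 1), (-1, 0), (1, 0)]


def torus_dist(a, b, size=SIZE):
    """Manhattan distance between two cells, allowing wrap-around."""
    dx = abs(a[0] - b[0])
    dy = abs(a[1] - b[1])
    return min(dx, size - dx) + min(dy, size - dy)


def step(pos, delta, size=SIZE):
    """Move pos by delta on the torus."""
    return ((pos[0] + delta[0]) % size, (pos[1] + delta[1]) % size)


def greedy_move(me, prey, others, size=SIZE):
    """Return the next cell for a predator at `me`.

    `others` holds the positions of the other predators.
    """
    blocked = set(others)
    # Open cells are prey neighbors not occupied by another predator.
    open_cells = [step(prey, d, size) for d in MOVES[1:]]
    open_cells = [c for c in open_cells if c not in blocked]
    if not open_cells:
        return me  # nothing to aim for, so stay put (assumed behavior)
    target = min(open_cells, key=lambda c: torus_dist(me, c, size))
    # Consider staying or moving to any cell not held by the prey or a teammate.
    occupied = blocked | {prey}
    candidates = [step(me, d, size) for d in MOVES]
    candidates = [c for c in candidates if c == me or c not in occupied]
    return min(candidates, key=lambda c: torus_dist(c, target, size))
```

With a 10x10 grid, `greedy_move((9, 0), (1, 0), [], size=10)` steps across the wrapped edge to (0, 0), which is the open cell next to the prey that is nearest to the predator.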
Externally-created Teammates (Student-Broad)
These results use externally-created teammates. We did not write these teammates; instead, students created them for a class project without considering ad hoc teamwork. The Student-Broad set includes 29 different teammate behaviors. Cooperating with agents of this type is a complex ad hoc teamwork scenario that evaluates whether an ad hoc agent can cooperate with teammates that may not adapt to it. In this setting, we show the performance of PLASTIC-Model given expert knowledge of the 4 hand-coded (HC) behaviors, as well as its performance when it learns models of its previous teammates and selects from these models on the fly (SetIncluding).
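Selecting among candidate teammate models on the fly can be pictured as keeping a probability for each model and shifting weight towards the models that best predict the teammates' observed actions. The sketch below uses a bounded-loss update in the style of the polynomial weights algorithm. The text above does not specify the update rule, so the rule, the value of `eta`, and the function and model names are assumptions for illustration.

```python
# Hypothetical sketch of belief tracking over teammate models; not the
# authors' code. The update rule and eta value are assumptions.


def update_beliefs(beliefs, action_probs, eta=0.2):
    """Reweight models after observing one teammate action.

    beliefs:      dict mapping model name -> current probability
    action_probs: dict mapping model name -> probability that model
                  assigned to the action the teammate actually took
    eta:          step size in (0, 1); smaller values change beliefs slowly
    """
    updated = {}
    for name, prior in beliefs.items():
        loss = 1.0 - action_probs[name]  # 0 when the model predicted perfectly
        updated[name] = prior * (1.0 - eta * loss)
    total = sum(updated.values())  # positive as long as eta < 1
    return {name: weight / total for name, weight in updated.items()}


def most_likely_model(beliefs):
    """Return the name of the model with the highest probability."""
    return max(beliefs, key=beliefs.get)
```

Starting from equal beliefs over two models, an observation that one model predicted with probability 1 and the other with probability 0 moves the beliefs only part of the way towards the first model. This gradual shift keeps a single surprising action from ruling a model out entirely.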