Half Field Offense: An Environment for Multiagent Learning and Ad Hoc Teamwork
Half Field Offense (HFO) is a subtask in RoboCup
simulated soccer, modeling a situation in which the offense team
attempts to score goals against the defense. HFO
features cooperative multiagent reinforcement learning, automated
AI teammates and opponents, continuous state spaces, discrete,
continuous, and parameterized action spaces, a choice between partially
observed and fully observed world state, the ability to play offense or
defense, and active communication. Half Field Offense extends
a simpler task that has been studied in the past.
Task Description
In Half Field Offense, an offense team has to outsmart a defense team, including a goalie, to score a goal. The task is played over one half of the soccer field and begins near the half field line, with the ball close to one of the offense players. The offense team tries to maintain possession, move up the field, and score. The defense team tries to take the ball away from the offense team.
The task is episodic, and an episode ends when one of four events occurs.
Each player acts autonomously, independently perceiving the world, selecting actions, and receiving rewards. However, agents can verbally communicate. It is up to you to design (or learn) a communication protocol.
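As an illustration of what a hand-designed communication protocol might look like, here is a minimal sketch that packs a teammate's intent and a target position into a short text message and parses it back. The `Message` fields, the `|` separator, and the `encode`/`decode` helpers are all hypothetical choices made for this example; they are not part of HFO, and a learned protocol could look entirely different.

```python
from typing import NamedTuple


class Message(NamedTuple):
    """Hypothetical message layout: who is speaking, what they intend, and where."""
    sender: int   # uniform number of the speaking player (assumed convention)
    intent: str   # e.g. "PASS" or "OPEN" (labels invented for this sketch)
    x: float      # target x-coordinate
    y: float      # target y-coordinate


def encode(msg: Message) -> str:
    """Serialize a message into a compact string, rounding coordinates to 2 decimals."""
    return f"{msg.sender}|{msg.intent}|{msg.x:.2f}|{msg.y:.2f}"


def decode(text: str) -> Message:
    """Parse a string produced by encode() back into a Message."""
    sender, intent, x, y = text.split("|")
    return Message(int(sender), intent, float(x), float(y))


if __name__ == "__main__":
    wire = encode(Message(sender=7, intent="PASS", x=0.5, y=-0.25))
    print(wire)
    print(decode(wire))
```

Keeping messages short and fixed in layout makes them easy to parse, at the cost of expressiveness; whether that trade-off suits a given team of agents is a design decision left open here.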
Automated AI teammates are provided courtesy of the Helios RoboCup team, winner of the 2010 and 2012 RoboCup-2d championships.
Continuous state spaces feature angles and distances to objects of interest in the environment, such as the ball, goals, and players. The low-level state space contains 58 continuous raw features, and the high-level state space has 10 highly informative features. The size of both state spaces increases as a function of the number of players in the game.
There are three choices of action space in HFO. The low-level parameterized action space features action primitives such as Dash(power,direction), Kick(power,direction), and Turn(direction), which require the continuous parameters power and direction. The mid-level action space presents more sophisticated parameterized actions: Kick_To(targetx,targety), Move_To(targetx,targety), and Dribble_To(targetx,targety). Finally, the high-level discrete action space contains the discrete actions Move, Dribble, Shoot, and Pass, which operate according to a preset strategy.
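To make the idea of a parameterized action concrete, the following is a minimal sketch that represents an action as a discrete type plus a tuple of continuous parameters. The class and helper names are invented for this example, and the parameter bounds (power in [0, 100], direction in [-180, 180] degrees) are assumptions chosen for illustration rather than values taken from HFO.

```python
from dataclasses import dataclass
from typing import Tuple

# Assumed parameter bounds, for illustration only (not taken from HFO).
POWER_RANGE = (0.0, 100.0)
DIRECTION_RANGE = (-180.0, 180.0)

# High-level discrete actions named in the text; they take no parameters.
HIGH_LEVEL_ACTIONS = ("Move", "Dribble", "Shoot", "Pass")


def clamp(value: float, bounds: Tuple[float, float]) -> float:
    """Clip a continuous parameter into its allowed interval."""
    low, high = bounds
    return max(low, min(high, value))


@dataclass(frozen=True)
class ParameterizedAction:
    """A discrete action type paired with its continuous parameters."""
    name: str
    params: Tuple[float, ...]


def dash(power: float, direction: float) -> ParameterizedAction:
    return ParameterizedAction(
        "Dash", (clamp(power, POWER_RANGE), clamp(direction, DIRECTION_RANGE))
    )


def kick(power: float, direction: float) -> ParameterizedAction:
    return ParameterizedAction(
        "Kick", (clamp(power, POWER_RANGE), clamp(direction, DIRECTION_RANGE))
    )


def turn(direction: float) -> ParameterizedAction:
    return ParameterizedAction("Turn", (clamp(direction, DIRECTION_RANGE),))


if __name__ == "__main__":
    print(dash(150.0, 90.0))   # power is clipped to the assumed upper bound
    print(turn(-200.0))        # direction is clipped to the assumed lower bound
```

A learning agent acting in such a space has to choose both which action type to take and which parameter values to supply, whereas an agent using the high-level discrete actions only chooses among a fixed set of names.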
The state of the world is observed through the agent's view cone, a wedge emanating from the player. Thus, by default, the world is partially observed. HFO also features a fullstate mode that provides a complete, noise-free, fully observed state.
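The effect of a view cone on observability can be sketched with a simple geometric test: an object is visible only if its bearing from the agent lies within half the cone width of the agent's heading. This is a simplified model under stated assumptions (2D coordinates, headings in degrees, a hypothetical `cone_width_deg` parameter, no noise or distance limits); it is not the soccer server's actual perception model.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def in_view_cone(agent_xy: Point, heading_deg: float,
                 target_xy: Point, cone_width_deg: float) -> bool:
    """Return True if target_xy falls inside the agent's view cone.

    heading_deg is the direction the agent faces, measured counterclockwise
    from the positive x-axis. cone_width_deg is the full angular width of
    the cone (an assumed parameter for this sketch).
    """
    dx = target_xy[0] - agent_xy[0]
    dy = target_xy[1] - agent_xy[1]
    if dx == 0 and dy == 0:
        return True  # the target coincides with the agent
    bearing = math.degrees(math.atan2(dy, dx))
    # Smallest signed angular difference, wrapped into [-180, 180).
    offset = (bearing - heading_deg + 180.0) % 360.0 - 180.0
    return abs(offset) <= cone_width_deg / 2.0


if __name__ == "__main__":
    agent = (0.0, 0.0)
    print(in_view_cone(agent, 0.0, (10.0, 0.0), 90.0))   # straight ahead
    print(in_view_cone(agent, 0.0, (0.0, 10.0), 90.0))   # 90 degrees to the side
```

Under this model, objects outside the cone would simply be missing from the agent's observation, which is what makes the default setting partially observed.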
Learning agents can play offense, defense, or both. They can be integrated with AI teammates and opponents in any combination. This flexibility presents opportunities for single-agent learning, multiagent learning, ad hoc teamwork, self-play, and coevolution.
Code
The HFO architecture consists of a soccer server, to which all learning agents and AI-controlled players (NPCs) connect. Additionally, a trainer keeps track of the HFO episodes and is responsible for starting and stopping games. Finally, a visualizer may be used to view the game as it progresses. The HFO release includes example random, hand-coded, and Sarsa agents.
Benchmark Results
Benchmark results for a variety of HFO tasks are provided in the publication Half Field Offense: An Environment for Multiagent Learning and Ad Hoc Teamwork below.
Random Agent (High-Level State Action Space): 1v0 1v1 2v2 3v3
Project Participants
This page is maintained by Matthew Hausknecht (email@example.com). Other project participants, past and present, include Peter Stone, Shivaram Kalyanakrishnan, Prannoy Mupparaju, Sandeep Subramanian, Sanmit Narvekar, Siddharth Aravindan, and Samuel Barrett.
Publications
If HFO is useful in your research, consider citing:
Half Field Offense: An Environment for Multiagent Learning and Ad Hoc Teamwork
Matthew Hausknecht, Prannoy Mupparaju, Sandeep Subramanian, Shivaram Kalyanakrishnan, and Peter Stone; Adaptive Learning Agents 2016