Half Field Offense: An Environment for Multiagent Learning and Ad Hoc Teamwork


Half Field Offense is a subtask in RoboCup simulated soccer, modeling a situation in which the offense team attempts to score goals against the defense. HFO features cooperative multiagent reinforcement learning, automated AI teammates and opponents, continuous state spaces, discrete, continuous, and parameterized action spaces, choice between partially observed and fully observed world state, ability to play offense or defense, and active communication. Half Field Offense is an extension of Keepaway, a simpler task that has been studied in the past.

Task Description

In Half Field Offense, an offense team has to outsmart the players of a defense team, including a goalie, to score a goal. The task is played over one half of the soccer field and begins near the half field line, with the ball close to one of the offense players. The offense team tries to maintain possession, move up the field, and score. The defense team tries to take the ball away from the offense team.

The task is episodic, and an episode ends when one of four events occurs:

  • A goal is scored,
  • The ball is out of bounds,
  • A defender gets possession of the ball (including the goalie catching the ball),
  • A max time limit for the episode is reached.

    Each player acts autonomously, independently perceiving the world, selecting actions, and receiving rewards. However, agents can verbally communicate. It is up to you to design (or learn) a communication protocol.
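The four episode-ending events listed above can be sketched as a small status check. This is a minimal illustration, not HFO's own code: the `EpisodeStatus` names, the `episode_status` helper, its arguments, and the order in which the events are tested are all assumptions made for the example.

```python
from enum import Enum, auto


class EpisodeStatus(Enum):
    """Hypothetical status labels; illustrative names, not HFO's own."""
    IN_PROGRESS = auto()
    GOAL = auto()
    OUT_OF_BOUNDS = auto()
    CAPTURED_BY_DEFENSE = auto()
    OUT_OF_TIME = auto()


def episode_status(goal_scored, ball_in_bounds, defender_has_ball, step, max_steps):
    """Return which terminating event (if any) applies at the current step.

    The priority order of the checks is an assumption of this sketch.
    """
    if goal_scored:
        return EpisodeStatus.GOAL
    if not ball_in_bounds:
        return EpisodeStatus.OUT_OF_BOUNDS
    if defender_has_ball:  # includes the goalie catching the ball
        return EpisodeStatus.CAPTURED_BY_DEFENSE
    if step >= max_steps:
        return EpisodeStatus.OUT_OF_TIME
    return EpisodeStatus.IN_PROGRESS


def is_terminal(status):
    """An episode ends as soon as any of the four events has occurred."""
    return status is not EpisodeStatus.IN_PROGRESS
```

An agent loop would typically call such a check once per step and reset its per-episode bookkeeping whenever `is_terminal` returns true.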

    Automated AI teammates are provided courtesy of the Helios RoboCup team, winner of the 2010 and 2012 RoboCup 2D championships.

    Continuous state spaces feature angles and distances to objects of interest in the environment, such as the ball, goals, and players. The low-level state space contains 58 continuous raw features, and the high-level state space has 10 highly informative features. The size of both state spaces increases with the number of players in the game.
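To make the idea of angle and distance features concrete, the sketch below computes both for a single object relative to an agent. It illustrates the geometry only and is not HFO's actual feature encoding: the function name `relative_features`, the coordinate convention, and the use of radians are assumptions.

```python
import math


def relative_features(agent_xy, agent_heading, object_xy):
    """Return (distance, angle) of an object relative to the agent.

    The angle is measured from the agent's heading, in radians, and is
    wrapped to the interval [-pi, pi).
    """
    dx = object_xy[0] - agent_xy[0]
    dy = object_xy[1] - agent_xy[1]
    distance = math.hypot(dx, dy)
    bearing = math.atan2(dy, dx) - agent_heading
    angle = (bearing + math.pi) % (2 * math.pi) - math.pi
    return distance, angle
```

A state vector in this style would concatenate such pairs for the ball, the goal, and each player, which is one way to see why the state size grows with the number of players.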

    There are three choices of action space in HFO. The low-level parameterized action space features action primitives such as Dash(power, direction), Kick(power, direction), and Turn(direction), which require the continuous parameters power and direction. The mid-level action space presents more sophisticated parameterized actions: Kick_To(targetx, targety), Move_To(targetx, targety), and Dribble_To(targetx, targety). Finally, the high-level discrete action space contains the discrete actions Move, Dribble, Shoot, and Pass, which operate according to a preset strategy.
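One way to picture the three action spaces is as tables mapping each action name to the parameters it expects. The sketch below uses the action and parameter names from the description above; the `Action` class, the `make_action` helper, and the dictionary layout are hypothetical and do not reflect the HFO interface itself.

```python
from dataclasses import dataclass

# Action names and parameter names come from the description above.
# Grouping them into dictionaries is an illustrative choice.
LOW_LEVEL = {
    "Dash": ("power", "direction"),
    "Kick": ("power", "direction"),
    "Turn": ("direction",),
}
MID_LEVEL = {
    "Kick_To": ("targetx", "targety"),
    "Move_To": ("targetx", "targety"),
    "Dribble_To": ("targetx", "targety"),
}
HIGH_LEVEL = {"Move": (), "Dribble": (), "Shoot": (), "Pass": ()}


@dataclass(frozen=True)
class Action:
    """Hypothetical container: an action name plus its continuous parameters."""
    name: str
    params: tuple = ()


def make_action(action_space, name, *params):
    """Build an Action, checking the name and parameter count against a space."""
    if name not in action_space:
        raise ValueError(f"unknown action: {name}")
    expected = action_space[name]
    if len(params) != len(expected):
        raise ValueError(f"{name} expects parameters {expected}, got {len(params)}")
    return Action(name, tuple(float(p) for p in params))
```

For example, `make_action(LOW_LEVEL, "Dash", 50, 0)` yields a parameterized action, while `make_action(HIGH_LEVEL, "Shoot")` yields a purely discrete one.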

    The state of the world is observed through the agent's view cone (the transparent wedge emanating from the player). Thus, by default, the world is partially observed. HFO also features a fullstate mode that provides a complete, noise-free, fully observed state.
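The difference between the default partially observed mode and fullstate mode can be sketched with a simple visibility test. This is a geometric illustration under stated assumptions: the `cone_width` parameter, the `in_view_cone` and `observe` helpers, and the `full_state` flag are hypothetical names, and sensor noise is not modeled.

```python
import math


def in_view_cone(agent_xy, agent_heading, object_xy, cone_width):
    """Return True if the object lies within the agent's view cone.

    `cone_width` is the full angular width of the cone in radians.
    """
    dx = object_xy[0] - agent_xy[0]
    dy = object_xy[1] - agent_xy[1]
    if dx == 0 and dy == 0:
        return True  # an object at the agent's own position is treated as visible
    bearing = math.atan2(dy, dx) - agent_heading
    offset = (bearing + math.pi) % (2 * math.pi) - math.pi
    return abs(offset) <= cone_width / 2


def observe(agent_xy, agent_heading, objects, cone_width, full_state=False):
    """Return the named objects the agent can see; all of them in full-state mode."""
    return {
        name: xy
        for name, xy in objects.items()
        if full_state or in_view_cone(agent_xy, agent_heading, xy, cone_width)
    }
```

With a narrow cone, objects behind the agent drop out of the observation, which is the sense in which the default mode is partially observed.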

    Learning agents can play offense, defense, or both, and can be integrated with AI teammates and opponents in any combination. This flexibility presents opportunities for single-agent learning, multiagent learning, ad hoc teamwork, self-play, and coevolution.

    Code

    The HFO architecture consists of a soccer server to which all learning agents and AI-controlled players (NPCs) connect. Additionally, a trainer keeps track of the HFO episodes and is responsible for starting and stopping games. Finally, a visualizer may be used to view the game as it progresses. The HFO release includes example random, hand-coded, and Sarsa agents.
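The shape of a random agent's episode loop can be sketched without a running server. The example below is not the agent shipped with HFO and does not use the HFO interface: `StubEnvironment` is a hypothetical stand-in for the server connection that simply ends each episode after a fixed number of steps, and `run_random_agent` is an illustrative helper. Only the four high-level action names come from the description above.

```python
import random

HIGH_LEVEL_ACTIONS = ("Move", "Dribble", "Shoot", "Pass")


class StubEnvironment:
    """Hypothetical stand-in for a connection to the soccer server.

    Every episode ends after `episode_length` steps, so the loop below
    can run on its own.
    """

    def __init__(self, episode_length=5):
        self.episode_length = episode_length
        self.steps = 0

    def reset(self):
        self.steps = 0

    def step(self, action):
        """Apply an action; return True while the episode is still running."""
        self.steps += 1
        return self.steps < self.episode_length


def run_random_agent(env, episodes, rng):
    """Play `episodes` episodes, choosing uniformly among the discrete actions."""
    history = []
    for _ in range(episodes):
        env.reset()
        chosen = []
        in_game = True
        while in_game:
            action = rng.choice(HIGH_LEVEL_ACTIONS)
            in_game = env.step(action)
            chosen.append(action)
        history.append(chosen)
    return history
```

A learning agent would follow the same loop but replace the random choice with a policy and update it from the observed rewards.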

    Benchmark Results

    Benchmark results for a variety of HFO tasks are provided in the publication "Half Field Offense: An Environment for Multiagent Learning and Ad Hoc Teamwork", listed under Publications below.

    Random Agent (High-Level State Action Space): 1v0 1v1 2v2 3v3

    Project Participants

    This page is maintained by Matthew Hausknecht (mhauskn@cs.utexas.edu). Other project participants, past and present, include Peter Stone, Shivaram Kalyanakrishnan, Prannoy Mupparaju, Sandeep Subramanian, Sanmit Narvekar, Siddharth Aravindan, and Samuel Barrett.

    Publications

    If HFO is useful in your research, consider citing:

    Half Field Offense: An Environment for Multiagent Learning and Ad Hoc Teamwork;
    Matthew Hausknecht, Prannoy Mupparaju, Sandeep Subramanian, Shivaram Kalyanakrishnan, and Peter Stone; Adaptive Learning Agents 2016
    (pdf|bibtex)

    Half Field Offense in RoboCup Soccer: A Multiagent Reinforcement Learning Case Study;
    Shivaram Kalyanakrishnan, Yaxin Liu, and Peter Stone; RoboCup International Symposium 2006
    (ps|pdf|bibtex)