This web page provides supplementary material to a paper published in the Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence (AAAI 2012).

This page provides details on the optimization of walk parameters for an omnidirectional walk engine, which was the key component in UT Austin Villa's victory at the 2011 RoboCup 3D simulation competition. Results from the competition, including videos of game action, are linked from the UT Austin Villa homepage. The remainder of this page focuses only on the learned walk.

For the 2011 RoboCup 3D simulation competition, UT Austin Villa implemented an omnidirectional walk engine and learned optimized parameters for it. This was a vast improvement over the 2010 team's fixed, skills-based walk.

To optimize the walk engine parameters, the agent first learned parameters by measuring how far it could dribble a ball during a driveBallToGoal optimization task. Unfortunately, dribbling a ball to the goal was not fully representative of the many situations encountered in an actual game, so the walk learned from this task was not as stable as desired.

To better represent the many situations encountered in a game, a new set of parameters was learned by having the agent move through an obstacle course consisting of a set series of target positions during a goToTarget optimization task. To increase the agent's speed, this same goToTarget optimization task was also used to learn a separate sprint parameter set, used when the agent is walking nearly straight toward its target.
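The parameter learning in these tasks is a black-box policy search: candidate parameter vectors are evaluated in simulation and the best performers guide the next round of candidates (the team used the CMA-ES algorithm). As a rough illustration only, here is a minimal sketch of a simple evolution strategy in that style; it is not CMA-ES itself and not the team's code, and the toy quadratic fitness merely stands in for a full simulated optimization run:

```python
import random

random.seed(0)  # deterministic for illustration

def optimize(fitness, dim, generations=60, pop_size=20, elite=5, sigma=0.5):
    """Toy (mu, lambda)-style evolution strategy.

    Each generation samples candidate parameter vectors around the
    current mean, scores them with the fitness function (in the real
    system: a full goToTarget or driveBallToGoal trial in the
    simulator), and recenters the mean on the top performers.
    """
    mean = [0.0] * dim
    for _ in range(generations):
        pop = [[m + random.gauss(0, sigma) for m in mean]
               for _ in range(pop_size)]
        pop.sort(key=fitness, reverse=True)   # best candidates first
        best = pop[:elite]
        mean = [sum(p[i] for p in best) / elite for i in range(dim)]
        sigma *= 0.97                         # slowly anneal step size
    return mean

# Toy stand-in fitness with its optimum at (1, 2, 3); a real evaluation
# would run the agent in the simulator and measure task performance.
target = [1.0, 2.0, 3.0]
fit = lambda p: -sum((a - b) ** 2 for a, b in zip(p, target))
result = optimize(fit, dim=3)
```

After 60 generations the returned mean should sit close to the toy optimum; in the real system each fitness evaluation is an expensive simulator run, which is why the trials are run in accelerated simulation time.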

Sample videos of the walk during different stages of learning can be found below.

Omnidirectional walk using initial (unoptimized) parameters that were tuned to work with the physical Nao robot. This agent lost to a team of agents with the 2010 fixed, skills-based walk by an average goal difference of -1.65 with a standard error of .11 across 100 games.

Download video: mp4

Omnidirectional walk using partial (11 out of 14) parameters that were learned through the driveBallToGoal optimization. This agent beat a team of agents with the initial walk parameters by an average goal difference of 1.72 with a standard error of .11 across 100 games. However, it lost to a team of agents with the 2010 fixed, skills-based walk by an average goal difference of -.28 with a standard error of .07 across 100 games.

Download video: mp4

The driveBallToGoal agent dribbling the ball toward the goal while executing the driveBallToGoal optimization task. The agent's fitness is measured by how far it can dribble the ball toward the goal in 30 seconds. This agent beat a team of agents with the initial walk parameters by an average goal difference of 5.54 with a standard error of .14 across 100 games. It also beat a team of agents with the 2010 fixed, skills-based walk by an average goal difference of 2.99 with a standard error of .12 across 100 games.
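The driveBallToGoal fitness described above reduces to a displacement measure: how much closer the ball ends up to the goal after the 30-second trial. A hedged sketch follows; the goal coordinates and helper names are assumptions for illustration, not the team's actual code or the simulator's exact field geometry:

```python
import math

GOAL = (15.0, 0.0)  # assumed center of the goal mouth, in field coordinates

def dist_to_goal(ball_xy):
    """Euclidean distance from the ball to the goal center."""
    return math.hypot(ball_xy[0] - GOAL[0], ball_xy[1] - GOAL[1])

def drive_ball_to_goal_fitness(ball_start, ball_end):
    """Fitness for one 30 s trial: how much closer the ball got to the goal.

    Positive values mean progress toward the goal; dribbling away from
    the goal yields a negative score.
    """
    return dist_to_goal(ball_start) - dist_to_goal(ball_end)
```

For example, moving the ball from midfield at (0, 0) straight to (10, 0) scores 10 meters of progress.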

Download video: mp4

Final agent navigating an obstacle course of targets it is told to move toward while executing the goToTarget optimization task. The agent's fitness is measured by how far/fast it can move toward each target (shown as a magenta dot on the field). It is penalized for any movement when told to stop and is also penalized if it falls over. This optimization was used to learn both the goToTarget and sprint parameter sets.
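The goToTarget fitness just described combines a reward for progress toward each target with penalties for moving during "stop" targets and for falling. A minimal sketch of that scoring, with the penalty weights chosen as illustrative assumptions rather than the constants actually used:

```python
FALL_PENALTY = 5.0         # assumed per-fall penalty weight
STOP_PENALTY_SCALE = 1.0   # assumed weight on movement while told to stop

def go_to_target_fitness(progress_per_target, movement_while_stopped, falls):
    """Score one obstacle-course run of the goToTarget task.

    progress_per_target: distance moved toward each target (meters).
    movement_while_stopped: total movement during 'stop' targets (meters).
    falls: number of times the agent fell over.
    """
    return (sum(progress_per_target)
            - STOP_PENALTY_SCALE * movement_while_stopped
            - FALL_PENALTY * falls)
```

So a run that makes 2.0, 3.5, and 1.0 meters of progress toward three targets, drifts 0.4 meters during a stop target, and falls once scores 6.5 - 0.4 - 5.0 = 1.1 under these assumed weights.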

Download video: mp4

Final agent dribbling the ball toward the goal from multiple starting points while executing the driveBallToGoal2 optimization task. The agent's fitness is measured by how far it can dribble each ball toward the goal in 15 seconds, and it is penalized if it dribbles the ball backwards. At the end of every 15 seconds the agent performs a set series of movements to check its stability and is penalized if it falls over. The optimization is run in simulation time, which is much faster than real time. This optimization was used to learn the positioning parameter set.
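The driveBallToGoal2 fitness aggregates many 15-second trials: each trial contributes the ball's progress toward the goal (which is negative when the ball is dribbled backwards), and a fixed penalty is subtracted whenever the agent falls during the post-trial stability check. A sketch under those assumptions, with the penalty weight as an illustrative placeholder:

```python
def drive_ball_to_goal2_fitness(trials, fall_penalty=5.0):
    """Aggregate fitness over multiple 15 s dribbling trials.

    trials: list of (forward_progress, fell_during_stability_check)
    pairs, one per starting point. forward_progress is the distance the
    ball moved toward the goal in meters (negative if dribbled
    backwards). fall_penalty is an assumed weight, not the actual
    constant used by the team.
    """
    total = 0.0
    for progress, fell in trials:
        total += progress
        if fell:
            total -= fall_penalty
    return total
```

For example, two trials with 4.0 m of progress (stable) and -1.5 m of progress followed by a fall score 4.0 - 1.5 - 5.0 = -2.5.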

Download video: mp4

Omnidirectional walk using the final optimized parameter sets. The walk parameter set the agent is currently using is displayed above the agent, e.g. T (red) = goToTarget parameter set.

Download video: mp4

For any questions, please contact Patrick MacAlpine.