Using Dynamic Rewards to Learn a Fully Holonomic Bipedal Walk

Patrick MacAlpine and Peter Stone

In the AAMAS Adaptive Learning Agents (ALA) Workshop in Valencia, Spain, June 2012.

The full paper can be found here

This page provides details on optimizing the walk parameters for a fully holonomic walk, that is, an omnidirectional walk that reaches the same maximum velocity in all directions. The walk was learned using the goToTarget optimization and the CMA-ES policy search algorithm. During learning, the rewards for walking speed in the three cardinal directions (forward, backward, and sideways) were dynamically reweighted from one generation to the next to encourage equal walking speeds in all directions.
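To make the idea of dynamic reweighting concrete, here is a minimal Python sketch. It is not the scheme from the paper: the inverse-speed weighting rule, the function names `reweight` and `weighted_reward`, and the example speeds are all assumptions chosen for illustration. The sketch only shows how weights computed from one generation's measured speeds could shift reward toward the slowest direction in the next generation.

```python
# Illustrative sketch only: the weighting rule below (inverse speed,
# normalized to sum to 1) is an assumption, not the paper's formula.

def reweight(speeds, eps=1e-6):
    """Return per-direction reward weights that favor slower directions.

    speeds: dict mapping a direction name to the speed measured for it
            in the previous generation (hypothetical values, in m/s).
    """
    # Slower directions get larger raw weights; eps avoids division by zero.
    inverse = {d: 1.0 / max(s, eps) for d, s in speeds.items()}
    total = sum(inverse.values())
    # Normalize so the weights sum to 1.
    return {d: v / total for d, v in inverse.items()}


def weighted_reward(speeds, weights):
    """Combine per-direction speeds into one scalar reward."""
    return sum(weights[d] * speeds[d] for d in speeds)


# Hypothetical speeds from the previous generation.
prev_gen = {"forward": 0.8, "backward": 0.6, "sideways": 0.4}
weights = reweight(prev_gen)
reward = weighted_reward(prev_gen, weights)
```

With these example speeds, the sideways direction receives the largest weight and forward the smallest, so a policy search such as CMA-ES would gain more reward from improving the sideways walk than from further improving the forward walk.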

Holonomic Walk

A fully holonomic walk, able to move in all directions with equal maximum velocity.
Download video: ogv, mp4

For any questions, please contact Patrick MacAlpine.