Grounded Action Transformations

Robot learning in simulation is a promising alternative to the prohibitive sample cost of learning in the physical world. Unfortunately, policies learned in simulation often perform worse than hand-coded policies when applied on the physical robot. Grounded simulation learning (GSL) promises to address this issue by altering the simulator to better match the real world. This paper proposes a new algorithm for GSL -- Grounded Action Transformation -- and applies it to learning humanoid bipedal locomotion. Our approach results in a 43.27% improvement in forward walk velocity compared to a state-of-the-art hand-coded walk. We further evaluate our methodology in controlled experiments using a second, higher-fidelity simulator in place of the real world. Our results contribute to a deeper understanding of grounded simulation learning and demonstrate its effectiveness for learning robot control policies.
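The core idea of a grounded action transformation, as described above, can be sketched as follows. This is a hypothetical illustration, not the authors' implementation: the wrapper intercepts each action before it reaches the simulator and replaces it with the action the simulator needs in order to reproduce the real robot's response. All function names (`make_grounded_step`, `forward_model`, `sim_inverse_model`) are placeholders invented for this sketch.

```python
# Hypothetical sketch of a grounded action transformation on 1-D dynamics.
# The simulator is "grounded" by transforming each action so that the
# simulated next state matches what the real robot would have done.

def make_grounded_step(sim_step, forward_model, sim_inverse_model):
    """Wrap a simulator step so its dynamics imitate the real robot.

    sim_step(x, a)           -> next state in the (imperfect) simulator
    forward_model(x, a)      -> predicted next state on the real robot
    sim_inverse_model(x, xn) -> action that moves the simulator from x to xn
    """
    def grounded_step(x, a):
        x_real = forward_model(x, a)          # what the real robot would do
        a_sim = sim_inverse_model(x, x_real)  # action the simulator needs
        return sim_step(x, a_sim)             # simulate the grounded action
    return grounded_step

# Toy example: the simulator under-responds to actions (gain 0.5) while the
# real robot responds with gain 0.8; the grounded simulator matches reality.
sim_step = lambda x, a: x + 0.5 * a           # imperfect simulator
real_model = lambda x, a: x + 0.8 * a         # learned real-robot model
sim_inverse = lambda x, xn: (xn - x) / 0.5    # inverse simulator model
grounded = make_grounded_step(sim_step, real_model, sim_inverse)
```

With this wrapping, `grounded(0.0, 1.0)` returns 0.8 (the real robot's response) even though the raw simulator would have returned 0.5, so a policy optimized against the grounded simulator experiences dynamics closer to reality.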

Video of learned walks:

Full details of our approach are available in the following paper:

Grounded Action Transformation for Robot Learning in Simulation.
Josiah Hanna and Peter Stone.
In Proceedings of the 31st AAAI Conference on Artificial Intelligence (AAAI-17), February 2017.
BibTeX | Download: [pdf] (714.7kB)

Humanoid Robots Learning to Walk Faster: From the Real World to Simulation and Back

Simulation is often used in research and industry as a low-cost, high-efficiency alternative to real-world testing. Simulation has also been used to develop and test powerful learning algorithms. However, parameters learned in simulation often do not translate directly to the application, especially because heavy optimization in simulation has been observed to exploit the inevitable simulator simplifications, thus creating a gap between simulation and application that reduces the utility of learning in simulation. This paper introduces Grounded Simulation Learning (GSL), an iterative optimization framework for speeding up robot learning using an imperfect simulator. In GSL, a behavior is developed on a robot and then repeatedly: 1) the behavior is optimized in simulation; 2) the resulting behavior is tested on the real robot and compared to the expected results from simulation; and 3) the simulator is modified, using a machine learning approach, to come closer in line with reality. This approach is fully implemented and validated on the task of learning to walk using an Aldebaran Nao humanoid robot. Starting from a set of stable, hand-coded walk parameters, four iterations of this three-step optimization loop led to more than a 25% increase in the robot's walking speed.
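The three-step GSL loop described above can be illustrated with a toy, self-contained sketch. This is not the authors' implementation: the "robot" here is a hidden 1-D dynamics function, the "simulator" is a biased copy of it, and the "policy" is a single gain parameter tuned by random search. All names (`optimize_in_sim`, `ground`, `gsl`) and the grounding rule are assumptions made for this illustration.

```python
import random

# Toy instantiation of the GSL loop: optimize in simulation, test on the
# "real robot", then re-ground the simulator's action-scale bias.

def real_step(x, a):
    # Hidden real-world dynamics (unknown to the learner).
    return x + 0.8 * a

def make_sim_step(bias):
    # Simulator whose action scale (bias) may disagree with reality.
    return lambda x, a: x + bias * a

def rollout(step, gain, horizon=10):
    # Roll the policy a = gain * (target - x) toward target = 1.0.
    x = 0.0
    for _ in range(horizon):
        x = step(x, gain * (1.0 - x))
    return x  # final state; ideally close to 1.0

def optimize_in_sim(step, gain, trials=200):
    # Step 1: optimize the behavior in simulation (random search).
    rng = random.Random(0)
    best_gain, best_err = gain, abs(1.0 - rollout(step, gain))
    for _ in range(trials):
        g = rng.uniform(0.1, 2.0)
        err = abs(1.0 - rollout(step, g))
        if err < best_err:
            best_gain, best_err = g, err
    return best_gain

def ground(gain):
    # Step 3: modify the simulator to come in line with reality, here by
    # recovering the true action scale from one real-world transition.
    x, a = 0.5, gain * 0.5
    return (real_step(x, a) - x) / a

def gsl(iterations=3):
    bias, gain = 0.3, 0.5            # imperfect simulator, initial policy
    for _ in range(iterations):
        gain = optimize_in_sim(make_sim_step(bias), gain)  # step 1
        real_final = rollout(real_step, gain)              # step 2: test
        bias = ground(gain)                                # step 3: ground
    return gain, abs(1.0 - real_final)
```

Running `gsl()` shows the pattern the paper describes: the first iteration overfits to the biased simulator, but once the simulator is grounded, the policy optimized in simulation also performs well on the real dynamics.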

Original Walk:

Optimized Walk:

