Deep Imitation Learning for Parameterized Action Spaces (2016)
Matthew Hausknecht, Yilun Chen, and Peter Stone
Recent results have demonstrated the ability of deep neural networks to serve as effective controllers (or function approximators of the value function) for complex sequential decision-making tasks, including those with raw visual inputs. However, to the best of our knowledge, such demonstrations have been limited to tasks with either fully discrete or fully continuous actions. This paper introduces an imitation learning method to train a deep neural network to mimic a stochastic policy in a parameterized action space. The network uses a novel dual classification/regression loss mechanism to decide which discrete action to select as well as the continuous parameters to accompany that action. This method is fully implemented and tested in a subtask of simulated RoboCup soccer. To the best of our knowledge, the resulting networks represent the first demonstration of successful imitation learning in a task with parameterized continuous actions.
In AAMAS Adaptive Learning Agents (ALA) Workshop, Singapore, May 2016.
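The abstract describes a dual classification/regression loss without giving its form. As a rough illustration only, the sketch below combines a cross-entropy term on the discrete action with a masked squared-error term on the continuous parameters. The function name dual_loss, the param_mask layout, the weight factor, and the three-action example are all assumptions made for this sketch and are not taken from the paper.

```python
import numpy as np


def dual_loss(action_logits, param_preds, target_action, target_params,
              param_mask, weight=1.0):
    """Hypothetical combined loss for one demonstration step.

    action_logits: (n_actions,) unnormalized scores for each discrete action
    param_preds:   (n_params,) predicted continuous parameters
    target_action: index of the action the teacher policy chose
    target_params: (n_params,) parameters the teacher policy used
    param_mask:    (n_actions, n_params) 1 where a parameter belongs to an action
    weight:        assumed scalar trading off regression against classification
    """
    # Numerically stable log-softmax over the discrete action logits.
    shifted = action_logits - np.max(action_logits)
    log_probs = shifted - np.log(np.sum(np.exp(shifted)))
    classification = -log_probs[target_action]

    # Only parameters of the demonstrated action contribute to the regression term.
    mask = param_mask[target_action]
    squared_error = (param_preds - target_params) ** 2
    regression = np.sum(mask * squared_error) / max(np.sum(mask), 1.0)

    return classification + weight * regression


# Assumed layout: action 0 uses parameters 0 and 1, action 1 uses parameter 2,
# action 2 takes no parameters.
example_mask = np.array([[1.0, 1.0, 0.0],
                         [0.0, 0.0, 1.0],
                         [0.0, 0.0, 0.0]])

example_loss = dual_loss(
    action_logits=np.array([2.0, 0.5, -1.0]),
    param_preds=np.array([0.8, -0.1, 0.3]),
    target_action=0,
    target_params=np.array([1.0, 0.0, 0.0]),
    param_mask=example_mask,
)
```

Masking keeps the regression term from penalizing parameters that the demonstrated action does not use, which is one plausible way to handle a parameterized action space; the paper itself should be consulted for the loss it actually uses.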

Peter Stone Faculty pstone [at] cs utexas edu