Self-Organizing Perceptual and Temporal Abstraction for Robot Reinforcement Learning (2004)
Jefferson Provost, Benjamin J. Kuipers and Risto Miikkulainen
A major current challenge in reinforcement learning research is to extend methods that work well on discrete, short-range, low-dimensional problems to continuous, high-diameter, high-dimensional problems, such as robot navigation using high-resolution sensors. We present a method whereby a robot in a continuous world can, with little prior knowledge of its sensorimotor system, environment, and task, improve task learning by first using a self-organizing feature map to develop a set of higher-level perceptual features while exploring using primitive, local actions. Then, using those features, the agent can build a set of high-level actions that carry it between perceptually distinctive states in the environment. This method combines a perceptual abstraction of the agent's sensory input into useful perceptual features, and a temporal abstraction of the agent's motor output into extended, high-level actions, thus reducing both the dimensionality and the diameter of the task. An experiment on a simulated robot navigation task shows that the agent using this method can learn to perform a task requiring 300 small-scale, local actions using as few as 7 temporally-extended, abstract actions, significantly improving learning time.
In AAAI-04 Workshop on Learning and Planning in Markov Processes, 2004.
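
The two abstractions described in the abstract can be illustrated with a minimal sketch. It is a toy under stated assumptions and not the authors' implementation: the one-dimensional map topology, the learning-rate schedule, the one-dimensional world whose only sensor reading is the position, and the rule "repeat a primitive action until the winning unit changes" are all chosen here for illustration, as are the names winner, train_som and abstract_action.

```python
import random

def winner(units, x):
    """Index of the unit closest to sensor vector x (the discrete perceptual feature)."""
    return min(range(len(units)),
               key=lambda i: sum((u - v) ** 2 for u, v in zip(units[i], x)))

def train_som(samples, n_units, epochs=20, lr=0.5, seed=0):
    """Train a small 1-D self-organizing map (assumed topology and schedule)."""
    rng = random.Random(seed)
    # Initialise each unit as a copy of a randomly chosen sample.
    units = [list(rng.choice(samples)) for _ in range(n_units)]
    for epoch in range(epochs):
        rate = lr * (1.0 - epoch / epochs)  # linearly decaying learning rate
        for x in samples:
            w = winner(units, x)
            for i, unit in enumerate(units):
                dist = abs(i - w)
                # Winner gets the full update, its immediate neighbours half.
                strength = 1.0 if dist == 0 else 0.5 if dist == 1 else 0.0
                for k in range(len(unit)):
                    unit[k] += rate * strength * (x[k] - unit[k])
    return units

def abstract_action(pos, step, units, max_steps=1000):
    """Hypothetical high-level action: repeat one primitive move until the
    winning perceptual feature changes (or max_steps is reached)."""
    start = winner(units, (pos,))
    steps = 0
    while steps < max_steps:
        pos += step  # primitive, local action
        steps += 1
        if winner(units, (pos,)) != start:
            break
    return pos, steps
```

Calling train_som on sampled sensor readings and then abstract_action from a start position returns the new position together with the number of primitive steps folded into one abstract action, which is the sense in which the task diameter shrinks.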

Risto Miikkulainen Faculty risto [at] cs utexas edu
Jefferson Provost Ph.D. Alumni jefferson provost [at] gmail com