Jefferson Provost, Benjamin J. Kuipers and Risto Miikkulainen.
Developing navigation behavior through self-organizing
distinctive state abstraction.
Connection Science 18(2), 2006.
A major challenge in reinforcement learning research is to extend
methods that have worked well on discrete, short-range,
low-dimensional problems to continuous, high-diameter,
high-dimensional problems, such as robot navigation using
high-resolution sensors. Self-Organizing Distinctive-state
Abstraction (SODA) is a new, generic method by which a robot in a
continuous world can better learn to navigate by learning a set of
high-level features and building temporally-extended actions to carry
it between distinctive states based on those features. A SODA agent
first uses a self-organizing feature map to develop a set of
high-level perceptual features while exploring the environment with
primitive, local actions. The agent then builds a set of high-level
actions composed of generic trajectory-following and hill-climbing
control laws that carry it between the states at local maxima of
feature activations. In an experiment on a simulated robot navigation
task, the SODA agent learns to perform a task requiring 300
small-scale, local actions using as few as 9 new, temporally-extended
actions, significantly improving learning time over navigating with
the local actions.
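
The two stages described above can be illustrated with a minimal sketch. It is not the authors' implementation: the one-dimensional corridor world, the 1-D map lattice, the Gaussian activation function, and every name in the code (train_som, activation, hill_climb, sense, ACTIONS) are assumptions made for illustration. The trajectory-following control law is omitted. The sketch trains a small self-organizing map on sensor samples, then applies primitive actions greedily until the activation of a chosen feature stops rising, which is taken here as a distinctive state.

```python
import math
import random


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def winner(units, x):
    """Index of the map unit whose weight vector is closest to input x."""
    return min(range(len(units)), key=lambda i: distance(units[i], x))


def train_som(samples, n_units, epochs=20, rate=0.3, seed=0):
    """Train a simplified 1-D self-organizing map on sensor samples."""
    rng = random.Random(seed)
    units = [list(rng.choice(samples)) for _ in range(n_units)]
    for epoch in range(epochs):
        # Assumed schedule: neighbors are updated only in the first half.
        radius = 1 if epoch < epochs // 2 else 0
        for x in samples:
            w = winner(units, x)
            lo, hi = max(0, w - radius), min(n_units, w + radius + 1)
            for i in range(lo, hi):
                units[i] = [u + rate * (xi - u) for u, xi in zip(units[i], x)]
    return units


def activation(unit, x):
    """Assumed feature activation: larger when the input is nearer the unit."""
    return math.exp(-distance(unit, x) ** 2)


def hill_climb(pos, unit, sense, actions, max_steps=1000):
    """Greedily apply primitive actions until none raises the activation."""
    for _ in range(max_steps):
        current = activation(unit, sense(pos))
        best = max(actions, key=lambda a: activation(unit, sense(a(pos))))
        if activation(unit, sense(best(pos))) <= current:
            break  # local maximum of the feature activation
        pos = best(pos)
    return pos


# Hypothetical toy world: 21 cells in a corridor, one scalar sensor reading.
N_CELLS = 21


def sense(pos):
    return (pos / (N_CELLS - 1),)


# Primitive local actions: one cell left or right, clipped at the walls.
ACTIONS = [
    lambda p: max(0, p - 1),
    lambda p: min(N_CELLS - 1, p + 1),
]

samples = [sense(p) for p in range(N_CELLS)]
units = train_som(samples, n_units=3)
distinctive = [hill_climb(10, u, sense, ACTIONS) for u in units]
```

In this toy setting each learned feature yields one position at which hill-climbing stops, so a high-level action amounts to "travel until feature i peaks" rather than a long sequence of single-cell moves.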