Continual Learning in Reinforcement Environments
MARK BISHOP RING, A.B., M.S.C.S.
Presented to the Faculty of the Graduate School of
The University of Texas at Austin
in Partial Fulfillment
of the Requirements
for the Degree of
Doctor of Philosophy
THE UNIVERSITY OF TEXAS AT AUSTIN
Continual learning is the constant development of complex behaviors with no final end in mind. It is the process of learning ever more complicated skills by building on those skills already developed. In order for learning at one stage of development to serve as the foundation for later learning, a continual-learning agent should learn hierarchically. CHILD, an agent capable of Continual, Hierarchical, Incremental Learning and Development, is proposed, described, tested, and evaluated in this dissertation. CHILD accumulates useful behaviors in reinforcement environments by using the Temporal Transition Hierarchies learning algorithm, also derived in the dissertation. This constructive algorithm generates a hierarchical, higher-order neural network that can be used for predicting context-dependent temporal sequences and can learn sequential-task benchmarks more than two orders of magnitude faster than competing neural-network systems. Consequently, CHILD can quickly solve complicated non-Markovian reinforcement-learning tasks and can then transfer its skills to similar but even more complicated tasks, learning these faster still. This continual-learning approach is made possible by the unique properties of Temporal Transition Hierarchies, which allow skills to be amended and augmented in precisely the same way that they were constructed in the first place.
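The higher-order mechanism summarized above can be illustrated in miniature. The sketch below is an assumption-laden toy, not the algorithm derived in the dissertation: it uses a single hierarchy level, and the class name TinyTTH, the learning rate, and the running-average statistics and thresholds used to decide when a weight's updates are cancelling are all invented for illustration. The idea it demonstrates is the one the abstract names: when a transition weight is pulled in opposite directions because the correct prediction depends on earlier context, a higher-order unit is attached to that weight and learns the context.

```python
import numpy as np

class TinyTTH:
    """Toy, single-level sketch of a higher-order transition network."""

    def __init__(self, n, lr=0.4):
        self.n, self.lr = n, lr
        self.w = np.zeros((n, n))    # base weights: prediction[i] = sum_j w[i,j]*cur[j]
        self.hw = {}                 # (i, j) -> higher-order unit's weights over the previous input
        self.avg = np.zeros((n, n))  # running mean of each weight's updates
        self.mag = np.zeros((n, n))  # running mean of each update's magnitude

    def _effective(self, prev):
        W = self.w.copy()
        for (i, j), v in self.hw.items():
            W[i, j] += v @ prev      # higher-order unit modulates its weight from context
        return W

    def predict(self, prev, cur):
        return self._effective(prev) @ cur

    def train_step(self, prev, cur, nxt):
        err = nxt - self.predict(prev, cur)
        for i in range(self.n):
            for j in np.flatnonzero(cur):
                d = self.lr * err[i] * cur[j]
                if (i, j) in self.hw:
                    # route the correction to the higher-order unit instead
                    self.hw[(i, j)] += d * prev
                else:
                    self.w[i, j] += d
                    self.avg[i, j] = 0.9 * self.avg[i, j] + 0.1 * d
                    self.mag[i, j] = 0.9 * self.mag[i, j] + 0.1 * abs(d)
                    # updates that keep cancelling => context-dependent transition,
                    # so grow a higher-order unit for this weight (thresholds assumed):
                    if self.mag[i, j] > 0.05 and abs(self.avg[i, j]) < 0.3 * self.mag[i, j]:
                        self.hw[(i, j)] = np.zeros(self.n)

def onehot(k, n=5):
    v = np.zeros(n)
    v[k] = 1.0
    return v

# Repeating stream 0 1 2 3 1 4: the successor of symbol 1 depends on what preceded it.
seq = [0, 1, 2, 3, 1, 4]
net = TinyTTH(5)
for _ in range(300):
    for t in range(len(seq)):
        net.train_step(onehot(seq[t - 1]), onehot(seq[t]),
                       onehot(seq[(t + 1) % len(seq)]))
```

A first-order predictor cannot learn this stream: the weight from symbol 1 to each of its two possible successors receives alternately positive and negative updates. The sketch detects the cancellation and attaches higher-order units that read the previous input, after which the net predicts 2 following the pair (0, 1) and 4 following (3, 1). This mirrors, in spirit only, how new units are added where transitions are context-dependent.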
Available from Oldenbourg Verlag (Publishers): ISBN 3-486-23603-2.
The following are all compressed PostScript files.
The dissertation is also available as a single PDF (138 pages, 624 kbytes).
pages (pp. iv - xiv)
(pp. 1 - 7)
and Learning Tasks (pp. 8 - 16).
Learning (pp. 17 - 24).
Problems with Neural Networks (pp. 25 - 33).
Learning (pp. 34 - 44).
Construction of Sensorimotor Hierarchies (pp. 45 - 71).
6.1 Behavior Hierarchies (pp. 45 - 52).
6.2 Temporal Transition Hierarchies (pp. 52 - 69).
6.3 Conclusions (pp. 70 - 71).
(pp. 72 - 95).
7.1 Description of Simulation System (pp. 72 - 73).
7.2 Supervised-Learning Tasks (pp. 73 - 82).
7.3 Continual Learning Results (pp. 82 - 95).
and Conclusions (pp. 96 - 107).
(pp. 108 - 117).
(pp. 118 - 127).
Back to Mark Ring's home page.