Peter Stone's Selected Publications



Improving Action Selection in MDP's via Knowledge Transfer

Improving Action Selection in MDP's via Knowledge Transfer.
Alexander A. Sherstov and Peter Stone.
In Proceedings of the Twentieth National Conference on Artificial Intelligence, July 2005.
AAAI 2005

Download

[PDF] 205.5kB   [postscript] 774.2kB

Abstract

Temporal-difference reinforcement learning (RL) has been successfully applied in several domains with large state sets. Large action sets, however, have received considerably less attention. This paper demonstrates the use of knowledge transfer between related tasks to accelerate learning with large action sets. We introduce action transfer, a technique that extracts the actions from the (near-)optimal solution to the first task and uses them in place of the full action set when learning any subsequent tasks. When optimal actions make up a small fraction of the domain's action set, action transfer can substantially reduce the number of actions and thus the complexity of the problem. However, action transfer between dissimilar tasks can be detrimental. To address this difficulty, we contribute randomized task perturbation (RTP), an enhancement to action transfer that makes it robust to unrepresentative source tasks. We motivate RTP action transfer with a detailed theoretical analysis featuring a formalism of related tasks and a bound on the suboptimality of action transfer. The empirical results in this paper show the potential of RTP action transfer to substantially expand the applicability of RL to problems with large action sets.
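
The core mechanism described above, extracting the greedy actions from a learned source-task solution and reusing only that reduced action set in later tasks, can be sketched in a few lines. The Python below is an illustrative sketch, not the authors' implementation: the tabular Q-learner, the toy chain MDP, and the names q_learning and transferred_actions are assumptions made here for exposition. RTP would additionally repeat the extraction on randomly perturbed copies of the source task and pool the resulting action sets.

    # Sketch of action transfer (hypothetical example, not the paper's code).
    import random
    from collections import defaultdict

    def q_learning(states, actions, step, reward, episodes=2000,
                   alpha=0.1, gamma=0.95, epsilon=0.1):
        """Plain tabular Q-learning on the source task; returns the Q-table."""
        Q = defaultdict(float)
        for _ in range(episodes):
            s = random.choice(states)
            for _ in range(50):                      # bounded episode length
                if random.random() < epsilon:
                    a = random.choice(actions)
                else:
                    a = max(actions, key=lambda b: Q[(s, b)])
                s2 = step(s, a)
                r = reward(s, a, s2)
                best_next = max(Q[(s2, b)] for b in actions)
                Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
                s = s2
        return Q

    def transferred_actions(Q, states, actions):
        """Action transfer: keep only the actions that are greedy in some
        state of the (near-)optimal source-task solution."""
        kept = {max(actions, key=lambda b: Q[(s, b)]) for s in states}
        return sorted(kept)

    if __name__ == "__main__":
        # Toy 1-D chain: states 0..9, actions are step sizes; only a few
        # of the available actions are ever greedy, so the transferred
        # set is much smaller than the full action set.
        states = list(range(10))
        actions = [-3, -1, 1, 2, 5]
        step = lambda s, a: min(max(s + a, 0), 9)
        reward = lambda s, a, s2: 1.0 if s2 == 9 else -0.1
        Q = q_learning(states, actions, step, reward)
        print("Actions kept for transfer:", transferred_actions(Q, states, actions))

Subsequent tasks would then be learned with the reduced set returned by transferred_actions in place of the full action set.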

BibTeX Entry

@InProceedings(AAAI05-actions,
        author="Alexander A.\ Sherstov and Peter Stone",
        title="Improving Action Selection in {MDP}'s via Knowledge Transfer",
        booktitle="Proceedings of the Twentieth National Conference on Artificial Intelligence",
        month="July",year="2005",
        abstract={
                  Temporal-difference reinforcement learning (RL) has
                  been successfully applied in several domains with
                  large \emph{state} sets. Large \emph{action} sets,
                  however, have received considerably less attention.
                  This paper demonstrates the use of knowledge
                  transfer between related tasks to accelerate
                  learning with large action sets.  We introduce
                  \emph{action transfer}, a technique that extracts
                  the actions from the \mbox{(near-)optimal} solution
                  to the first task and uses them in place of the full
                  action set when learning any subsequent tasks.  When
                  optimal actions make up a small fraction of the
                  domain's action set, action transfer can
                  substantially reduce the number of actions and thus
                  the complexity of the problem. However, action
                  transfer between \emph{dissimilar} tasks can be
                  detrimental. To address this difficulty, we
                  contribute \emph{randomized task perturbation}
                  (RTP), an enhancement to action transfer that makes
                  it robust to unrepresentative source tasks. We
                  motivate RTP action transfer with a detailed
                  theoretical analysis featuring a formalism of
                  related tasks and a bound on the suboptimality of
                  action transfer.  The empirical results in this
                  paper show the potential of RTP action transfer to
                  substantially expand the applicability of RL to
                  problems with large action sets.
                 },
        wwwnote={<a href="http://www.aaai.org/Conferences/National/2005/aaai05.html">AAAI 2005</a>},
)
