Peter Stone's Selected Publications



ProtoCRL: Prototype-based Network for Continual Reinforcement Learning

ProtoCRL: Prototype-based Network for Continual Reinforcement Learning.
Michela Proietti, Peter R. Wurman, Peter Stone, and Roberto Capobianco.
In Reinforcement Learning Conference, August 2025.

Download

[PDF] 2.1MB

Abstract

The purpose of continual reinforcement learning is to train an agent on a sequence of tasks such that it learns the ones that appear later in the sequence while retaining the ability to perform the tasks that appeared earlier. Experience replay is a popular method used to make the agent remember previous tasks, but its effectiveness strongly relies on the selection of experiences to store. Kompella et al. (2023) proposed organizing the experience replay buffer into partitions, each storing transitions leading to a rare but crucial event, such that these key experiences get revisited more often during training. However, the method is sensitive to the manual selection of event states. To address this issue, we introduce ProtoCRL, a prototype-based architecture leveraging a variational Gaussian mixture model to automatically discover effective event states and build the associated partitions in the experience replay buffer. The proposed approach is tested on a sequence of MiniGrid environments, demonstrating the agent's ability to adapt and learn new skills incrementally.
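
To make the partitioned-replay idea concrete, below is a minimal Python sketch of a replay buffer whose partitions are keyed by clusters from a variational Gaussian mixture (scikit-learn's BayesianGaussianMixture). It is an illustrative reconstruction of the mechanism described in the abstract, not the authors' ProtoCRL implementation; the class and method names are hypothetical.

# Minimal sketch of a partitioned experience replay buffer in the spirit of
# the abstract above: a variational Gaussian mixture clusters visited states
# into candidate "event" groups, and each partition stores transitions that
# lead into one group. This is an illustrative reconstruction, not the
# authors' ProtoCRL code; all names here are hypothetical.
import random
from collections import defaultdict, deque

import numpy as np
from sklearn.mixture import BayesianGaussianMixture  # variational GMM


class PartitionedReplayBuffer:
    def __init__(self, n_components=8, capacity_per_partition=10_000):
        self.gmm = BayesianGaussianMixture(
            n_components=n_components,
            weight_concentration_prior_type="dirichlet_process",
        )
        self.capacity = capacity_per_partition
        self.partitions = defaultdict(lambda: deque(maxlen=self.capacity))
        self.fitted = False

    def fit_event_states(self, states):
        # Discover event-state clusters from a batch of visited states
        # (array of shape [n_states, state_dim]).
        self.gmm.fit(np.asarray(states))
        self.fitted = True

    def add(self, state, action, reward, next_state, done):
        # Route each transition to the partition of the state it leads to;
        # before any clusters exist, everything goes to a default partition.
        key = (int(self.gmm.predict(np.asarray([next_state]))[0])
               if self.fitted else -1)
        self.partitions[key].append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Pick a non-empty partition uniformly, then a transition within it.
        keys = [k for k, p in self.partitions.items() if p]
        return [random.choice(self.partitions[random.choice(keys)])
                for _ in range(batch_size)]

Sampling partitions uniformly (rather than transitions uniformly) is what boosts the replay frequency of transitions leading to rare events, mirroring the motivation behind the partitioned buffer of Kompella et al. (2023).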

BibTeX Entry

@InProceedings{protocrl_rlc2025,
  author    = {Michela Proietti and Peter R. Wurman and Peter Stone and Roberto Capobianco},
  title     = {ProtoCRL: Prototype-based Network for Continual Reinforcement Learning},
  booktitle = {Reinforcement Learning Conference},
  year      = {2025},
  month     = {August},
  location  = {Edmonton, Canada},
  abstract  = {The purpose of continual reinforcement learning is to train an agent on a
sequence of tasks such that it learns the ones that appear later in the sequence
while retaining the ability to perform the tasks that appeared earlier.
Experience replay is a popular method used to make the agent remember previous
tasks, but its effectiveness strongly relies on the selection of experiences to
store. Kompella et al. (2023) proposed organizing the experience replay buffer
into partitions, each storing transitions leading to a rare but crucial event,
such that these key experiences get revisited more often during training.
However, the method is sensitive to the manual selection of event states. To
address this issue, we introduce ProtoCRL, a prototype-based architecture
leveraging a variational Gaussian mixture model to automatically discover
effective event states and build the associated partitions in the experience
replay buffer. The proposed approach is tested on a sequence of MiniGrid
environments, demonstrating the agent's ability to adapt and learn new skills
incrementally.
  },
}

Generated by bib2html.pl (written by Patrick Riley) on Thu Oct 02, 2025 22:46:25