Peter Stone's Selected Publications

Classified by TopicClassified by Publication TypeSorted by DateSorted by First Author Last NameClassified by Funding Source


Relaxed Exploration Constrained Reinforcement Learning

Relaxed Exploration Constrained Reinforcement Learning.
Shahaf S. Shperberg, Bo Liu, and Peter Stone.
In Conference on Autonomous Agents and Multiagent Systems, May 2024.

Download

[PDF]3.4MB  

Abstract

This research introduces a novel setting for reinforcement learning with constraints, termed Relaxed Exploration Constrained Reinforcement Learning (RECRL). Similar to standard constrained reinforcement learning (CRL), the objective in RECRL is to discover a policy that maximizes the environmental return while adhering to a predefined set of constraints. However, in some real-world settings, it is possible to train the agent in a setting that does not require strict adherence to the constraints, as long as the agent adheres to them once deployed. To model such settings, we introduce RECRL, which explicitly incorporates an initial training phase where the constraints are relaxed, enabling the agent to explore the environment more freely. Subsequently, during deployment, the agent is obligated to fully satisfy all constraints. To address RECRL problems, we introduce a curriculum-based approach called CLiC, designed to enhance the exploration of existing CRL algorithms during the training phase and facilitate convergence towards a policy that satisfies the full set of constraints by the end of training. Empirical evaluations demonstrate that CLiC yields policies with significantly higher returns during deployment compared to training solely under the strict set of constraints.

BibTeX Entry

@InProceedings{shahaf_shperberg_AAMAS_2024,
  author   = {Shahaf S. Shperberg and Bo Liu and Peter Stone},
  title    = {Relaxed Exploration Constrained Reinforcement Learning},
  booktitle = {Conference on Autonomous Agents and Multiagent Systems},
  year     = {2024},
  month    = {May},
  location = {Auckland, New Zealand},
  abstract = {This research introduces a novel setting for reinforcement learning with constraints, termed Relaxed Exploration Constrained Reinforcement Learning (RECRL). Similar to standard constrained reinforcement learning (CRL), the objective in RECRL is to discover a policy that maximizes the environmental return while adhering to a predefined set of constraints. However, in some real-world settings, it is possible to train the agent in a setting that does not require strict adherence to the constraints, as long as the agent adheres to them once deployed. To model such settings, we introduce RECRL, which explicitly incorporates an initial training phase where the constraints are relaxed, enabling the agent to explore the environment more freely. Subsequently, during deployment, the agent is obligated to fully satisfy all constraints. To address RECRL problems, we introduce a curriculum-based approach called CLiC, designed to enhance the exploration of existing CRL algorithms during the training phase and facilitate convergence towards a policy that satisfies the full set of constraints by the end of training. Empirical evaluations demonstrate that CLiC yields policies with significantly higher returns during deployment compared to training solely under the strict set of constraints.},
}

Generated by bib2html.pl (written by Patrick Riley ) on Wed Jun 10, 2026 15:26:44