Peter Stone's Selected Publications



Relaxed Exploration Constrained Reinforcement Learning

Relaxed Exploration Constrained Reinforcement Learning.
Shahaf S. Shperberg, Bo Liu, and Peter Stone.
In International Conference on Autonomous Agents and Multiagent Systems, May 2023.

Download

[PDF] (959.3kB)

Abstract

This extended abstract introduces a novel setting of reinforcement learning with constraints, called Relaxed Exploration Constrained Reinforcement Learning (RECRL). As in standard constrained reinforcement learning (CRL), the aim is to find a policy that maximizes environmental return subject to a set of constraints. However, in RECRL there is an initial training phase in which the constraints are relaxed, thus the agent can explore the environment more freely. When training is done, the agent is deployed in the environment and is required to fully satisfy all constraints. As an initial approach to RECRL problems, we introduce a curriculum-based approach, named CLiC, that can be applied to existing CRL algorithms to improve their exploration during the training phase while allowing them to gradually converge to a policy that satisfies the full set of constraints. Empirical evaluation shows that CLiC produces policies with a higher return during deployment than policies learned when training is done using only the strict set of constraints.
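The abstract describes CLiC only at a high level. The toy Python sketch below (not the authors' code) illustrates one plausible reading of the setting: run a Lagrangian-style CRL update loop while a relaxed cost threshold tightens toward the strict deployment threshold. The environment stand-ins, the linear tightening schedule, and all names (current_threshold, train, lam_lr) are hypothetical.

import random

def current_threshold(step, total, strict, relaxed):
    # Anneal the allowed cost linearly from the relaxed value to the strict one.
    frac = min(step / max(total - 1, 1), 1.0)
    return relaxed + frac * (strict - relaxed)

def train(total_steps=1000, strict_limit=0.1, relaxed_limit=1.0, lam_lr=0.05):
    # Toy Lagrangian-style CRL loop under a tightening constraint threshold.
    lam = 0.0  # Lagrange multiplier penalizing constraint violations
    for step in range(total_steps):
        limit = current_threshold(step, total_steps, strict_limit, relaxed_limit)
        reward = random.random()  # stand-in for environmental return
        cost = random.random()    # stand-in for constraint cost
        violation = cost - limit
        lam = max(0.0, lam + lam_lr * violation)  # dual ascent on the multiplier
        penalized = reward - lam * max(violation, 0.0)
        # a real CRL algorithm would update its policy using `penalized` here
    return lam

if __name__ == "__main__":
    print("final multiplier:", train())

The key point mirrors the paper's setting: early training tolerates larger constraint violations, and the tolerance shrinks so that the deployed policy is held to the full, strict constraints.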

BibTeX Entry

@InProceedings{shahaf_shperberg_AAMAS_2023,
  author    = {Shahaf S. Shperberg and Bo Liu and Peter Stone},
  title     = {Relaxed Exploration Constrained Reinforcement Learning},
  booktitle = {International Conference on Autonomous Agents and Multiagent Systems},
  year      = {2023},
  month     = {May},
  location  = {London, United Kingdom},
  abstract  = {This extended abstract introduces a novel setting of reinforcement learning with
constraints, called Relaxed Exploration Constrained Reinforcement Learning
(RECRL). As in standard constrained reinforcement learning (CRL), the aim is to
find a policy that maximizes environmental return subject to a set of
constraints. However, in RECRL there is an initial training phase in which the
constraints are relaxed, thus the agent can explore the environment more freely.
When training is done, the agent is deployed in the environment and is required
to fully satisfy all constraints. As an initial approach to RECRL problems, we
introduce a curriculum-based approach, named CLiC, that can be applied to
existing CRL algorithms to improve their exploration during the training phase
while allowing them to gradually converge to a policy that satisfies the full set
of constraints. Empirical evaluation shows that CLiC produces policies with a
higher return during deployment than policies learned when training is done using
only the strict set of constraints.
  },
}

Generated by bib2html.pl (written by Patrick Riley) on Wed Apr 17, 2024 18:42:49