Peter Stone's Selected Publications


State Abstraction Synthesis for Discrete Models of Continuous Domains

Jacob Menashe and Peter Stone. State Abstraction Synthesis for Discrete Models of Continuous Domains. In Data Efficient Reinforcement Learning Workshop at AAAI Spring Symposium, March 2018.

Abstract

Reinforcement Learning (RL) is a paradigm for enabling autonomous learning wherein rewards are used to influence an agent's action choices in various states. As the number of states and actions available to an agent increases, so it becomes increasingly difficult for the agent to quickly learn the optimal action for any given state. One approach to mitigating the detrimental effects of large state spaces is to represent collections of states together as encompassing "abstract states". State abstraction itself leads to a host of new challenges for an agent. One such challenge is that of automatically identifying new abstractions that balance generality and specificity; the agent must identify both the similarities and the differences between states that are relevant to its goals, while ignoring unnecessary details that would otherwise hinder the agent's progress. We call this problem of identifying useful abstract states the Abstraction Synthesis Problem (ASP). State abstractions can provide a significant benefit to model-based agents by simplifying their models. T-UCT, a hierarchical model-learning algorithm for discrete, factored domains, is one such method that leverages state abstractions to quickly learn and control an agent's environment. Such abstractions play a pivotal role in the success of T-UCT; however, T-UCT's solution to ASP requires a fully discrete state space. In this work we develop and compare enhancements to T-UCT that relax its assumption of discreteness. We focus on solving ASP in domains with multidimensional, continuous state factors, using only the T-UCT agent's limited experience histories and minimal knowledge of the domain's structure. Finally, we present a new abstraction synthesis algorithm, RCAST, and compare this algorithm to existing approaches in the literature. We provide the algorithmic details of RCAST and its subroutines, and we show that RCAST outperforms earlier approaches to ASP by enabling T-UCT to accumulate significantly greater total reward with minimal expert configuration and processing time.
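
As a rough illustration of what "representing collections of states together" can mean for continuous state factors, the sketch below maps each continuous state to a grid cell, so that nearby states share one abstract state. This is a plain uniform-binning baseline written for illustration only; it is not RCAST or T-UCT, whose details are given in the paper. The function name `abstract_state` and its parameters (`lows`, `highs`, `bins`) are hypothetical.

```python
from typing import Sequence, Tuple


def abstract_state(
    state: Sequence[float],
    lows: Sequence[float],
    highs: Sequence[float],
    bins: int,
) -> Tuple[int, ...]:
    """Map a continuous state to a tuple of bin indices (one abstract state).

    Assumption: every state factor is bounded by [low, high] and is split
    into the same number of equal-width bins. A learned abstraction would
    choose these boundaries from experience rather than fix them up front.
    """
    if not (len(state) == len(lows) == len(highs)):
        raise ValueError("state, lows, and highs must have the same length")
    if bins < 1:
        raise ValueError("bins must be at least 1")

    cell = []
    for x, lo, hi in zip(state, lows, highs):
        if hi <= lo:
            raise ValueError("each high bound must exceed its low bound")
        # Fraction of the way from lo to hi, then scaled to a bin index.
        idx = int((x - lo) / (hi - lo) * bins)
        # Clamp so out-of-range values and x == hi fall in the edge bins.
        cell.append(min(max(idx, 0), bins - 1))
    return tuple(cell)


# Two nearby continuous states collapse into the same abstract state,
# while a distant state lands in a different one.
a = abstract_state((0.12, 0.97), lows=(0.0, 0.0), highs=(1.0, 1.0), bins=4)
b = abstract_state((0.20, 0.99), lows=(0.0, 0.0), highs=(1.0, 1.0), bins=4)
c = abstract_state((0.60, 0.10), lows=(0.0, 0.0), highs=(1.0, 1.0), bins=4)
```

A fixed grid like this illustrates the generality/specificity trade-off the abstract describes: too few bins merge states that differ in ways relevant to the goal, while too many bins bring back the large state space the abstraction was meant to avoid.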

BibTeX Entry

@inproceedings{DERL18-Menashe,
	author = {Jacob Menashe and Peter Stone},
	title = {State Abstraction Synthesis for Discrete Models of Continuous Domains},
	booktitle = {Data Efficient Reinforcement Learning Workshop at AAAI Spring Symposium},
	location = {Stanford, CA, USA},
	month = {March},
	year = {2018},
	keywords = {Hierarchical Reinforcement Learning; Model-based Reinforcement Learning; State Space Abstractions for Reinforcement Learning; Bayesian Networks},
	abstract = {
    Reinforcement Learning (RL) is a paradigm for enabling autonomous
    learning wherein rewards are used to influence an
    agent's action choices in various states. As the number of states
    and actions available to an agent increases, so it becomes increasingly
    difficult for the agent to quickly learn the optimal
    action for any given state. One approach to mitigating the detrimental
    effects of large state spaces is to represent collections
    of states together as encompassing "abstract states".
    State abstraction itself leads to a host of new challenges for an
    agent. One such challenge is that of automatically identifying
    new abstractions that balance generality and specificity; the
    agent must identify both the similarities and the differences
    between states that are relevant to its goals, while ignoring
    unnecessary details that would otherwise hinder the agent's
    progress. We call this problem of identifying useful abstract
    states the Abstraction Synthesis Problem (ASP).
    State abstractions can provide a significant benefit to model-based
    agents by simplifying their models. T-UCT, a hierarchical
    model-learning algorithm for discrete, factored domains,
    is one such method that leverages state abstractions to quickly
    learn and control an agent's environment. Such abstractions
    play a pivotal role in the success of T-UCT; however, T-UCT's
    solution to ASP requires a fully discrete state space.
    In this work we develop and compare enhancements to T-UCT
    that relax its assumption of discreteness. We focus on solving
    ASP in domains with multidimensional, continuous state factors,
    using only the T-UCT agent's limited experience histories
    and minimal knowledge of the domain's structure. Finally, we
    present a new abstraction synthesis algorithm, RCAST, and
    compare this algorithm to existing approaches in the literature.
    We provide the algorithmic details of RCAST and its
    subroutines, and we show that RCAST outperforms earlier
    approaches to ASP by enabling T-UCT to accumulate significantly
    greater total reward with minimal expert configuration
    and processing time.
  },
	url = {https://www.aaai.org/ocs/index.php/SSS/SSS18/paper/view/17576},
}
