# Peter Stone's Selected Publications


## Bayesian Models of Nonstationary Markov Decision Problems

Nicholas K. Jong and Peter Stone. **Bayesian Models of Nonstationary Markov Decision Problems**. In *IJCAI 2005 workshop on Planning and Learning in A Priori Unknown or Dynamic Domains*, August 2005.

Workshop webpage: http://www-rcf.usc.edu/~skoenig/workshop.html

### Download

[PDF] 42.4kB

### Abstract

Standard reinforcement learning algorithms generate policies that optimize expected
future rewards in a priori unknown domains, but they assume that the domain does not change
over time. Prior work cast the reinforcement learning problem as a Bayesian estimation
problem, using experience data to condition a probability distribution over domains. In
this paper we propose an elaboration of the typical Bayesian model that accounts for the
possibility that some aspect of the domain changes spontaneously during learning. We develop
a reinforcement learning algorithm based on this model that we expect to react more intelligently
to sudden changes in the behavior of the environment.
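
As a rough illustration of the idea in the abstract, the sketch below keeps a Dirichlet posterior over next-state probabilities for a single state-action pair and allows for the possibility that the domain changed before each observation. It is not the algorithm from the paper: the class name `ChangeAwareDirichlet`, the per-step `change_prob`, and the rule that shrinks pseudo-counts toward the prior in proportion to the posterior probability of a change are all assumptions made for this example, and the shrinking step is a heuristic collapse of the exact two-hypothesis mixture.

```python
from dataclasses import dataclass, field


@dataclass
class ChangeAwareDirichlet:
    """Hypothetical posterior over next-state outcomes for one (state, action) pair."""
    n_outcomes: int
    prior_count: float = 1.0   # symmetric Dirichlet pseudo-count (assumed)
    change_prob: float = 0.05  # assumed prior probability of a change per step
    counts: list = field(init=False)

    def __post_init__(self):
        self.counts = [self.prior_count] * self.n_outcomes

    def predictive(self, outcome):
        """Posterior predictive probability of observing `outcome` next."""
        return self.counts[outcome] / sum(self.counts)

    def update(self, outcome):
        """Fold in one observation; return the posterior probability of a change."""
        # Likelihood of the observation if nothing changed.
        p_same = self.predictive(outcome)
        # Likelihood if the domain changed and we are back at the symmetric prior.
        p_fresh = 1.0 / self.n_outcomes
        num = self.change_prob * p_fresh
        w = num / (num + (1.0 - self.change_prob) * p_same)
        # Heuristic: shrink accumulated evidence toward the prior by weight w.
        self.counts = [
            self.prior_count + (1.0 - w) * (c - self.prior_count)
            for c in self.counts
        ]
        self.counts[outcome] += 1.0
        return w


def run(model, observations):
    """Feed a sequence of outcomes to a model; return the change probabilities."""
    return [model.update(o) for o in observations]


# Outcome 0 is observed 50 times, then the domain switches to outcome 1.
history = [0] * 50 + [1] * 5
stationary = ChangeAwareDirichlet(n_outcomes=2, change_prob=0.0)
aware = ChangeAwareDirichlet(n_outcomes=2, change_prob=0.05)
run(stationary, history)
change_probs = run(aware, history)
```

With `change_prob=0.0` the update reduces to ordinary Dirichlet counting, which corresponds to the stationarity assumption the abstract attributes to standard approaches. With a nonzero `change_prob`, a surprising observation raises the inferred probability of a change and discounts old evidence, so `aware.predictive(1)` recovers faster after the switch than `stationary.predictive(1)`.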

### BibTeX Entry

@inproceedings(IJCAI05ws,
author="Nicholas K.\ Jong and Peter Stone",
title="Bayesian Models of Nonstationary Markov Decision Problems",
booktitle="{IJCAI} 2005 workshop on Planning and Learning in A Priori Unknown or Dynamic Domains",
month="August",year="2005",
abstract={
Standard reinforcement learning algorithms generate
policies that optimize expected future rewards
in a priori unknown domains, but they assume that
the domain does not change over time. Prior work
cast the reinforcement learning problem as a
Bayesian estimation problem, using experience data
to condition a probability distribution over
domains. In this paper we propose an elaboration of
the typical Bayesian model that accounts for the
possibility that some aspect of the domain
changes spontaneously during learning. We develop
a reinforcement learning algorithm based on this
model that we expect to react more intelligently to
sudden changes in the behavior of the environment.
},
wwwnote={<a href="http://www-rcf.usc.edu/~skoenig/workshop.html">Workshop webpage</a>.},
)
