UTCS Artificial Intelligence
Real Time Targeted Exploration in Large Domains (2010)
Todd Hester and Peter Stone
A developing agent needs to explore to learn about the world and to learn good behaviors. In many real-world tasks, this exploration can take far too long, and the agent must decide which states to explore and which to leave unexplored. Bayesian methods attempt to address this problem, but take too much computation time to run in reasonably sized domains. In this paper, we present TEXPLORE, the first algorithm to perform targeted exploration in real time in large domains. The algorithm learns multiple possible models of the domain that generalize action effects across states. We experiment with possible ways of adding intrinsic motivation to the agent to drive exploration. TEXPLORE is fully implemented and tested in a novel domain called Fuel World, which is designed to reflect the type of targeted exploration needed in the real world. We show that our algorithm significantly outperforms representative examples of both model-free and model-based RL algorithms from the literature and is able to quickly learn to perform well in a large world in real time.
Citation:
In Proceedings of the Ninth International Conference on Development and Learning (ICDL 2010), August 2010.
Bibtex:
@InProceedings{hester:icdl10,
  title={Real Time Targeted Exploration in Large Domains},
  author={Todd Hester and Peter Stone},
  booktitle={Proceedings of the Ninth International Conference on Development and Learning (ICDL 2010)},
  month={August},
  year={2010},
  url="http://www.cs.utexas.edu/users/ai-lab?hester:icdl10"
}
People
Todd Hester
Postdoctoral Alumni
todd [at] cs utexas edu
Peter Stone
Faculty
pstone [at] cs utexas edu
Projects
TEXPLORE: Real-Time Sample Efficient Reinforcement Learning
2009 - Present
Areas of Interest
Machine Learning
Reinforcement Learning
Demos
TEXPLORE: Real-Time Sample Efficient Reinforcement Learning
Todd Hester
2012
Labs
Learning Agents