UTCS Artificial Intelligence
On the Analysis of Complex Backup Strategies in Monte Carlo Tree Search (2016)
Piyush Khandelwal, Elad Liebman, Scott Niekum, and Peter Stone
Over the past decade, Monte Carlo Tree Search (MCTS) and specifically Upper Confidence Bound in Trees (UCT) have proven to be quite effective in large probabilistic planning domains. In this paper, we focus on how values are backpropagated in the MCTS tree, and apply complex return strategies from the Reinforcement Learning (RL) literature to MCTS, producing four new MCTS variants. We demonstrate that in some probabilistic planning benchmarks from the International Planning Competition (IPC), selecting an MCTS variant with a backup strategy different from Monte Carlo averaging can lead to substantially better results. We also propose a hypothesis for why different backup strategies lead to different performance in particular environments, and manipulate a carefully structured grid-world domain to provide empirical evidence supporting our hypothesis.
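As a rough illustration of what "a backup strategy different from Monte Carlo averaging" can mean, the sketch below contrasts the plain Monte Carlo return with the λ-return, a standard complex return from the RL literature. It is a minimal sketch under stated assumptions, not the paper's algorithm: the function names, the example trajectory, and the value estimates in `next_values` are hypothetical and chosen only for illustration.

```python
from typing import List


def monte_carlo_returns(rewards: List[float], gamma: float = 1.0) -> List[float]:
    """Discounted Monte Carlo return from each step of one trajectory."""
    returns = [0.0] * len(rewards)
    g = 0.0
    for t in reversed(range(len(rewards))):
        g = rewards[t] + gamma * g
        returns[t] = g
    return returns


def lambda_returns(
    rewards: List[float],
    next_values: List[float],
    lam: float,
    gamma: float = 1.0,
) -> List[float]:
    """Lambda-return from each step, computed by backward recursion.

    next_values[t] is the current value estimate of the state reached
    after step t (assumed to be 0.0 for a terminal state).
    Recursion: G_t = r_t + gamma * ((1 - lam) * V(s_{t+1}) + lam * G_{t+1}).
    """
    returns = [0.0] * len(rewards)
    g = 0.0  # return beyond the end of the trajectory
    for t in reversed(range(len(rewards))):
        g = rewards[t] + gamma * ((1.0 - lam) * next_values[t] + lam * g)
        returns[t] = g
    return returns


# Hypothetical three-step trajectory with a reward only at the end.
rewards = [0.0, 0.0, 1.0]
next_values = [0.5, 0.2, 0.0]  # illustrative estimates; last state is terminal

mc = monte_carlo_returns(rewards)
blended = lambda_returns(rewards, next_values, lam=0.5)
```

With `lam=1.0` the recursion reduces to the Monte Carlo return, and with `lam=0.0` it reduces to a one-step bootstrapped target, so intermediate values of `lam` interpolate between the two. In a tree search, the value backed up to each node along the simulated path would be the corresponding entry of the chosen return.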
View: PDF, HTML
Citation: In Proceedings of The 33rd International Conference on Machine Learning, pp. 1319--1328, New York City, NY, USA, June 2016.
Bibtex:
@inproceedings{ICML16-khandelwal,
  title={On the Analysis of Complex Backup Strategies in Monte Carlo Tree Search},
  author={Khandelwal, Piyush and Liebman, Elad and Niekum, Scott and Stone, Peter},
  booktitle={Proceedings of The 33rd International Conference on Machine Learning},
  month={June},
  address={New York City, NY, USA},
  pages={1319--1328},
  url="http://www.cs.utexas.edu/users/ai-lab?khandelwal:icml16",
  year={2016}
}
Presentation: Slides (PDF)
People
Piyush Khandelwal
Ph.D. Alumni
piyushk [at] cs utexas edu
Elad Liebman
Ph.D. Student
eladlieb [at] cs utexas edu
Peter Stone
Faculty
pstone [at] cs utexas edu
Areas of Interest
Markov Decision Processes
Labs
Learning Agents