Structure Learning in Ergodic Factored MDPs without Knowledge of the Transition Function's In-Degree (2011)
This paper introduces Learn Structure and Exploit RMax (LSE-RMax), a novel model-based structure learning algorithm for ergodic factored-state MDPs. Given a planning horizon that satisfies a condition, LSE-RMax provably guarantees a return very close to the optimal return with high certainty, without requiring prior knowledge of the in-degree of the transition function as input. LSE-RMax is fully implemented, and we provide a thorough analysis of its sample complexity. We also present empirical results demonstrating its effectiveness compared to prior approaches to the problem.
In Proceedings of the Twenty-Eighth International Conference on Machine Learning (ICML'11), June 2011.

Doran Chakraborty Ph.D. Alumni chakrado [at] cs utexas edu
Peter Stone Faculty pstone [at] cs utexas edu