Nicholas K. Jong and Peter Stone. Towards Learning to Ignore Irrelevant State Variables. In The AAAI-2004 Workshop on Learning and Planning in Markov Processes -- Advances and Challenges, July 2004.
Hierarchical methods have attracted much recent attention as a means for scaling reinforcement learning algorithms to increasingly complex, real-world tasks. These methods provide two important kinds of abstraction that facilitate learning. First, hierarchies organize actions into temporally abstract high-level tasks. Second, they facilitate task-dependent state abstractions that allow each high-level task to restrict attention to relevant state variables only. In most approaches to date, the user must supply suitable task decompositions and state abstractions to the learner. How to discover these hierarchies automatically remains a challenging open problem. As a first step towards solving this problem, we introduce a general method for determining the validity of potential state abstractions that might form the basis of reusable tasks. We build a probabilistic model of the underlying Markov decision problem and then statistically test the applicability of the state abstraction. We demonstrate the ability of our procedure to discriminate between safe and unsafe state abstractions in the familiar Taxi domain.
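To make the idea of a state abstraction that ignores a variable concrete, here is a minimal sketch. It is not the procedure from the paper: it is a simplified, deterministic check with no probabilistic model and no statistical test. It assumes a small tabular task in which the set of optimal actions for every state is already known. The function name `variable_is_ignorable`, the `optimal_actions` table, and the toy data are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def variable_is_ignorable(optimal_actions, var_index):
    """Check whether dropping one state variable keeps a common optimal action.

    optimal_actions maps each state (a tuple of variable values) to the set
    of actions assumed optimal in that state. This is an illustrative,
    deterministic stand-in, not the statistical test described in the paper.
    """
    groups = defaultdict(list)
    for state, actions in optimal_actions.items():
        # Project out the candidate variable to obtain the abstract state.
        abstract_state = state[:var_index] + state[var_index + 1:]
        groups[abstract_state].append(set(actions))
    # In this sketch the abstraction counts as safe only if all states merged
    # into the same abstract state share at least one optimal action.
    return all(set.intersection(*sets) for sets in groups.values())


# Hypothetical toy task: state = (position, flag).
toy = {
    (0, 0): {"right"},
    (0, 1): {"right"},
    (1, 0): {"left", "right"},
    (1, 1): {"left"},
}

ignore_flag = variable_is_ignorable(toy, 1)      # drop the flag variable
ignore_position = variable_is_ignorable(toy, 0)  # drop the position variable
```

In the toy table, dropping the flag leaves a shared optimal action in every merged group, while dropping the position merges states whose optimal actions conflict. With real, noisy experience the optimal actions would be estimated rather than known, which is where a statistical test such as the one the abstract describes would come in.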
@InProceedings(AAAI04ws-nick,
  author="Nicholas K.\ Jong and Peter Stone",
  title="Towards Learning to Ignore Irrelevant State Variables",
  booktitle="The {AAAI}-2004 Workshop on Learning and Planning in Markov Processes -- Advances and Challenges",
  month="July",
  year="2004",
  abstract={ Hierarchical methods have attracted much recent attention as a means for scaling reinforcement learning algorithms to increasingly complex, real-world tasks. These methods provide two important kinds of abstraction that facilitate learning. First, hierarchies organize actions into temporally abstract high-level tasks. Second, they facilitate task dependent state abstractions that allow each high-level task to restrict attention only to relevant state variables. In most approaches to date, the user must supply suitable task decompositions and state abstractions to the learner. How to discover these hierarchies automatically remains a challenging open problem. As a first step towards solving this problem, we introduce a general method for determining the validity of potential state abstractions that might form the basis of reusable tasks. We build a probabilistic model of the underlying Markov decision problem and then statistically test the applicability of the state abstraction. We demonstrate the ability of our procedure to discriminate among safe and unsafe state abstractions in the familiar Taxi domain. },
  bit2html_ignore=1
)
Generated by bib2html.pl (written by Patrick Riley) on Thu Oct 05, 2006 11:07:45