Peter Stone's Selected Publications

Critical Factors in the Empirical Performance of Temporal Difference and Evolutionary Methods for Reinforcement Learning

Critical Factors in the Empirical Performance of Temporal Difference and Evolutionary Methods for Reinforcement Learning.
Shimon Whiteson, Matthew E. Taylor, and Peter Stone.
Journal of Autonomous Agents and Multi-Agent Systems, 21(1):1–27, 2010.

Abstract

Temporal difference and evolutionary methods are two of the most common approaches to solving reinforcement learning problems. However, there is little consensus on their relative merits and there have been few empirical studies that directly compare their performance. This article aims to address this shortcoming by presenting results of empirical comparisons between Sarsa and NEAT, two representative methods, in mountain car and keepaway, two benchmark reinforcement learning tasks. In each task, the methods are evaluated in combination with both linear and nonlinear representations to determine their best configurations. In addition, this article tests two specific hypotheses about the critical factors contributing to these methods' relative performance: 1) that sensor noise reduces the final performance of Sarsa more than that of NEAT, because Sarsa's learning updates are not reliable in the absence of the Markov property and 2) that stochasticity, by introducing noise in fitness estimates, reduces the learning speed of NEAT more than that of Sarsa. Experiments in variations of mountain car and keepaway designed to isolate these factors confirm both these hypotheses.
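For readers unfamiliar with Sarsa, the following is a minimal sketch of the standard tabular Sarsa update, not the authors' implementation (the article evaluates Sarsa with linear and nonlinear function approximators rather than a table). The function name `sarsa_update`, the state and action labels, and the values of `alpha` and `gamma` are assumptions chosen for illustration. The update bootstraps from the value of the next observed state-action pair, which is why the first hypothesis expects it to become unreliable when sensor noise breaks the Markov property.

```python
from collections import defaultdict


def sarsa_update(q, state, action, reward, next_state, next_action,
                 alpha=0.1, gamma=1.0, terminal=False):
    """Apply one on-policy Sarsa update to the table q in place.

    Returns the temporal difference error used for the update.
    """
    # Bootstrap from the action the policy actually takes next (on-policy).
    next_value = 0.0 if terminal else q[(next_state, next_action)]
    td_error = reward + gamma * next_value - q[(state, action)]
    q[(state, action)] += alpha * td_error
    return td_error


# Hypothetical two-step episode fragment with a reward of -1 per step.
q = defaultdict(float)
first_error = sarsa_update(q, "s0", "left", -1.0, "s1", "right", alpha=0.5)
second_error = sarsa_update(q, "s1", "right", -1.0, "s0", "left", alpha=0.5)
```

If the observed `state` is corrupted by noise, the bootstrapped term `q[(next_state, next_action)]` may belong to a different underlying state than the one the agent is actually in, so the update target itself is affected rather than only the sampled reward.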

BibTeX Entry

@Article{JAAMAS09-Whiteson,
  author="Shimon Whiteson and Matthew E.\ Taylor and Peter Stone",
  title="Critical Factors in the Empirical Performance of Temporal Difference and Evolutionary Methods for Reinforcement Learning",
  journal="Journal of Autonomous Agents and Multi-Agent Systems",
  volume="21",
  number="1",
  pages="1--27",
  year="2010",
  abstract="Temporal difference and evolutionary methods are two
    of the most common approaches to solving reinforcement
    learning problems. However, there is little consensus on their
    relative merits and there have been few empirical studies that
    directly compare their performance. This article aims to
    address this shortcoming by presenting results of empirical
    comparisons between Sarsa and NEAT, two representative
    methods, in mountain car and keepaway, two benchmark
    reinforcement learning tasks. In each task, the methods are
    evaluated in combination with both linear and nonlinear
    representations to determine their best configurations. In
    addition, this article tests two specific hypotheses about the
    critical factors contributing to these methods' relative
    performance: 1) that sensor noise reduces the final
    performance of Sarsa more than that of NEAT, because Sarsa's
    learning updates are not reliable in the absence of the Markov
    property and 2) that stochasticity, by introducing noise in
    fitness estimates, reduces the learning speed of NEAT more
    than that of Sarsa. Experiments in variations of mountain car
    and keepaway designed to isolate these factors confirm both
    these hypotheses.",
}
