On-Policy vs. Off-Policy Updates for Deep Reinforcement Learning.
Matthew Hausknecht and Peter Stone.
In Deep Reinforcement Learning: Frontiers and Challenges, IJCAI Workshop, July 2016.
Temporal-difference-based deep-reinforcement learning methods have typically been driven by off-policy, bootstrap Q-Learning updates. In this paper, we investigate the effects of using on-policy, Monte Carlo updates. Our empirical results show that for the DDPG algorithm in a continuous action space, mixing on-policy and off-policy update targets exhibits superior performance and stability compared to using exclusively one or the other. The same technique applied to DQN in a discrete action space drastically slows down learning. Our findings raise questions about the nature of on-policy and off-policy bootstrap and Monte Carlo updates and their relationship to deep reinforcement learning methods.
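As a rough illustration of what "mixing on-policy and off-policy update targets" could look like, the sketch below blends a Monte Carlo return with a one-step bootstrap target. The abstract does not give the mixing rule, so the convex combination, the weight `beta`, and the function names here are assumptions made for illustration, not the authors' implementation.

```python
def monte_carlo_returns(rewards, gamma):
    """Discounted return from each step to the end of the episode.

    This is the on-policy, Monte Carlo target: it uses only rewards
    actually observed along the trajectory.
    """
    returns = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def mixed_targets(rewards, next_q_values, gamma, beta):
    """Blend Monte Carlo and one-step bootstrap targets (assumed form).

    next_q_values[t] is a critic's value estimate for the state reached
    after step t (use 0.0 for a terminal state). beta is a hypothetical
    mixing weight: beta=1.0 gives pure Monte Carlo targets, beta=0.0
    gives pure bootstrap (Q-Learning-style) targets.
    """
    mc = monte_carlo_returns(rewards, gamma)
    return [
        beta * g + (1.0 - beta) * (r + gamma * q)
        for g, r, q in zip(mc, rewards, next_q_values)
    ]
```

For example, `mixed_targets([1.0, 1.0, 1.0], [2.0, 2.0, 0.0], gamma=0.5, beta=0.5)` averages the two target types at every step of a three-step episode.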
@InProceedings{DeepRL16-hausknecht,
author = {Matthew Hausknecht and Peter Stone},
title = {On-Policy vs. Off-Policy Updates for Deep Reinforcement Learning},
booktitle = {Deep Reinforcement Learning: Frontiers and Challenges, IJCAI Workshop},
location = {New York},
month = {July},
year = {2016},
abstract = {Temporal-difference-based deep-reinforcement learning methods have typically been driven by off-policy, bootstrap Q-Learning updates. In this paper, we investigate the effects of using on-policy, Monte Carlo updates. Our empirical results show that for the DDPG algorithm in a continuous action space, mixing on-policy and off-policy update targets exhibits superior performance and stability compared to using exclusively one or the other. The same technique applied to DQN in a discrete action space drastically slows down learning. Our findings raise questions about the nature of on-policy and off-policy bootstrap and Monte Carlo updates and their relationship to deep reinforcement learning methods.},
}