Shivaram's Reading List

Bandits

Almost Optimal Exploration in Multi-Armed Bandits
Zohar Karnin, Tomer Koren, and Oren Somekh, 2013

Information Complexity in Bandit Subset Selection
Emilie Kaufmann and Shivaram Kalyanakrishnan, 2013

Best Arm Identification: A Unified Approach to Fixed Budget and Fixed Confidence
Victor Gabillon, Mohammad Ghavamzadeh, and Alessandro Lazaric, 2012

Planning in Reward-Rich Domains via PAC Bandits
Sergiu Goschin, Ari Weinstein, Michael L. Littman, and Erick Chastain, 2012

Best Arm Identification in Multi-Armed Bandits
Jean-Yves Audibert, Sébastien Bubeck, and Rémi Munos, 2010

UCB Revisited: Improved Regret Bounds for the Stochastic Multi-Armed Bandit Problem
Peter Auer and Ronald Ortner, 2010

Simulation optimization using the cross-entropy method with optimal computing budget allocation
Donghai He, Loo Hay Lee, Chun-Hung Chen, Michael C. Fu, and Segev Wasserkrug, 2010

An Asymptotically Optimal Bandit Algorithm for Bounded Support Models
Junya Honda and Akimichi Takemura, 2010

Non-Stochastic Bandit Slate Problems
Satyen Kale, Lev Reyzin, and Robert E. Schapire, 2010

Efficient Selection of Multiple Bandit Arms: Theory and Practice
Shivaram Kalyanakrishnan and Peter Stone, 2010

Regret bounds for sleeping experts and bandits
Robert Kleinberg, Alexandru Niculescu-Mizil, and Yogeshwer Sharma, 2010

A contextual-bandit approach to personalized news article recommendation
Lihong Li, Wei Chu, John Langford, and Robert E. Schapire, 2010

$\epsilon$-First Policies for Budget-Limited Multi-Armed Bandits
Long Tran-Thanh, Archie Chapman, Enrique Munoz de Cote, Alex Rogers, and Nicholas R. Jennings, 2010

Exploration-exploitation tradeoff using variance estimates in multi-armed bandits
Jean-Yves Audibert, Rémi Munos, and Csaba Szepesvári, 2009

Pure Exploration in Multi-armed Bandits Problems
Sébastien Bubeck, Rémi Munos, and Gilles Stoltz, 2009

Combinatorial Bandits
Nicolò Cesa-Bianchi and Gábor Lugosi, 2009

Efficient Simulation Budget Allocation for Selecting an Optimal Subset
Chun-Hung Chen, Donghai He, Michael Fu, and Loo Hay Lee, 2008

Multi-armed bandits in metric spaces
Robert Kleinberg, Aleksandrs Slivkins, and Eli Upfal, 2008

Empirical Bernstein stopping
Volodymyr Mnih, Csaba Szepesvári, and Jean-Yves Audibert, 2008

Tuning Bandit Algorithms in Stochastic Environments
Jean-Yves Audibert, Rémi Munos, and Csaba Szepesvári, 2007

Approximation Algorithms for Budgeted Learning Problems
Sudipto Guha and Kamesh Munagala, 2007

Recent advances in ranking and selection
Seong-Hee Kim and Barry L. Nelson, 2007

Action Elimination and Stopping Conditions for the Multi-Armed Bandit and Reinforcement Learning Problems
Eyal Even-Dar, Shie Mannor, and Yishay Mansour, 2006

Bandit Based Monte-Carlo Planning
Levente Kocsis and Csaba Szepesvári, 2006

Active Model Selection
Omid Madani, Daniel J. Lizotte, and Russell Greiner, 2004

The Sample Complexity of Exploration in the Multi-Armed Bandit Problem
Shie Mannor and John N. Tsitsiklis, 2004

Using Ranking and Selection to “Clean Up” after Simulation Optimization
Justin Boesel, Barry L. Nelson, and Seong-Hee Kim, 2003

Lower Bounds on the Sample Complexity of Exploration in the Multi-armed Bandit Problem
Shie Mannor and John N. Tsitsiklis, 2003

Finite-time Analysis of the Multiarmed Bandit Problem
Peter Auer, Nicolò Cesa-Bianchi, and Paul Fischer, 2002

The Nonstochastic Multiarmed Bandit Problem
Peter Auer, Nicolò Cesa-Bianchi, Yoav Freund, and Robert E. Schapire, 2002

PAC Bounds for Multi-armed Bandit and Markov Decision Processes
Eyal Even-Dar, Shie Mannor, and Yishay Mansour, 2002

Multiple Decision Procedures: Theory and Methodology of Selecting and Ranking Populations
Shanti S. Gupta and S. Panchapakesan, 2002

Mining complex models from arbitrarily large databases in constant time
Geoff Hulten and Pedro Domingos, 2002

A fully sequential procedure for indifference-zone selection in simulation
Seong-Hee Kim and Barry L. Nelson, 2001

Mining high-speed data streams
Pedro Domingos and Geoff Hulten, 2000

Selecting and Ordering Populations: A New Statistical Methodology
Jean Dickinson Gibbons, Ingram Olkin, and Milton Sobel, 1999

An empirical evaluation of several methods to select the best system
Koichiro Inoue, Stephen E. Chick, and Chun-Hung Chen, 1999

Design and analysis of experiments for statistical selection, screening, and multiple comparisons
Robert E. Bechhofer, Thomas J. Santner, and David M. Goldsman, 1995

Sequential PAC Learning
Dale Schuurmans and Russell Greiner, 1995

Restricted Subset Selection Procedures for Simulation
David W. Sullivan and James R. Wilson, 1989

Bandit problems
Donald A. Berry and Bert Fristedt, 1985

A procedure for selecting a subset of size $m$ containing the $l$ best of $k$ independent normal populations, with applications to simulation
Lloyd W. Koenig and Averill M. Law, 1985

Asymptotically Efficient Adaptive Allocation Rules
T. L. Lai and Herbert Robbins, 1985

Determining Sample Size for Pretesting Comparative Effectiveness of Advertising Copies
Siddhartha R. Dalal and V. Srinivasan, 1977

Sequential models for clinical trials
Herman Chernoff, 1967

A Sequential Procedure for Selecting the Population with the Largest Mean from $k$ Normal Populations
Edward Paulson, 1964

Probability Inequalities for Sums of Bounded Random Variables
Wassily Hoeffding, 1963

Comparing entries in random sample tests
W. A. Becker, 1961

A Sequential Multiple-Decision Procedure for Selecting the Best One of Several Normal Populations with a Common Unknown Variance, and Its Use with Various Experimental Designs
Robert E. Bechhofer, 1958

Some aspects of the sequential design of experiments
Herbert Robbins, 1952

Sequential Analysis
Abraham Wald, 1947

Contributions to the Theory of Sequential Analysis. I
M. A. Girshick, 1946

Contributions to the Theory of Sequential Analysis, II, III
M. A. Girshick, 1946