RPE: Algorithms for Optimal and Exploitive Play in Incomplete-Information Games

Room: GDC 3.516
Contact Name: 
Wesley Tansey
May 15, 2013 11:00am - 12:00pm



Discovering strong strategies under uncertainty is a challenging task for machine learning algorithms acting in multi-agent environments. Incomplete-information games, such as poker, provide a useful testing ground for comparing such algorithms against different forms of opponents. Generally, research into poker has been divided into learning one of two types of strategies: optimal or exploitive. Optimal agents operate under the assumption that their opponents will act rationally, and thus optimal agents seek to minimize their worst-case expected losses by computing and playing a Nash equilibrium strategy. Conversely, exploitive agents assume their opponents' strategies may be flawed, thus exploitive agents seek to maximize their expected gains against their opponent, at the risk of opening themselves up to exploitation due to model error.

In this talk, I'll cover algorithms for creating optimal and exploitive agents. I will give a general introduction to game theory and provide a broad overview of a variety of techniques in poker. I'll discuss in-depth the current state-of-the-art techniques, namely Counterfactual Regret (CFR) minimization and online implicit opponent modeling. I'll also present my own preliminary research results on improving sample efficiency in opponent modeling and highlight some research opportunities I plan to pursue.