Ad Hoc Teamwork Modeled with Multi-armed Bandits: An Extension to Discounted Infinite Rewards (2011)
Before deployment, agents designed for multiagent team settings are commonly developed together or given standardized communication and coordination protocols. However, in many cases this pre-coordination is not possible because the agents do not know which agents they will encounter, resulting in ad hoc team settings. In these problems, the agents must learn to adapt to and cooperate with each other on the fly. We extend existing research on ad hoc teams, providing theoretical results for handling cooperative multi-armed bandit problems with infinite discounted rewards.
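To give a flavor of the setting, the following is a minimal, deterministic sketch of the kind of teacher/learner bandit tradeoff analyzed in this line of work. All specifics here are illustrative assumptions, not taken from the paper: the arm payoffs, the rule that the learner can only pull arms 1 and 2 while the teacher can also pull arm 0, and the function names `simulate` and `discounted_return`. The point it illustrates is that "teaching" (pulling a suboptimal arm once so the teammate observes its payoff) sacrifices immediate reward but can yield a higher discounted infinite-horizon return when the discount factor is large enough.

```python
def discounted_return(rewards, gamma):
    """Discounted return: sum of gamma^t * r_t over the episode."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))

def simulate(teach, gamma, horizon=100):
    # Deterministic arm payoffs (stand-ins for true arm means).
    # Assumed access model: the teacher may pull any arm; the learner
    # may only pull arms 1 and 2, and starts out having seen only arm 1.
    payoff = {0: 1.0, 1: 0.6, 2: 0.8}
    learner_seen = {1: 0.6}
    rewards = []
    for t in range(horizon):
        if teach and t == 0:
            teacher_arm = 2  # demonstrate the learner's better arm (costs 0.2 now)
        else:
            teacher_arm = 0  # exploit the teacher's own best arm
        # Greedy learner: pull the best arm it has observed so far.
        learner_arm = max(learner_seen, key=learner_seen.get)
        # The learner observes every pull of an arm it is able to use.
        for arm in (teacher_arm, learner_arm):
            if arm in (1, 2):
                learner_seen[arm] = payoff[arm]
        rewards.append(payoff[teacher_arm] + payoff[learner_arm])
    return discounted_return(rewards, gamma)

# With a high discount factor, the long-run gain from teaching outweighs
# the one-step cost; with a low discount factor it does not.
high = simulate(teach=True, gamma=0.95), simulate(teach=False, gamma=0.95)
low = simulate(teach=True, gamma=0.30), simulate(teach=False, gamma=0.30)
```

Under these assumed payoffs, teaching changes the team reward stream from a constant 1.6 per step to 1.4 on the first step and 1.8 thereafter, so whether teaching pays depends entirely on how heavily the future is discounted.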
View:
PDF, PS, HTML
Citation:
In Tenth International Conference on Autonomous Agents and Multiagent Systems - Adaptive Learning Agents Workshop (AAMAS - ALA), May 2011.
Bibtex:

Samuel Barrett Ph.D. Student sbarrett [at] cs utexas edu
Peter Stone Faculty pstone [at] cs utexas edu