Peter Stone's Selected Publications


Building Self-Play Curricula Online by Playing with Expert Agents in Adversarial Games

Building Self-Play Curricula Online by Playing with Expert Agents in Adversarial Games.
Felipe Leno Da Silva, Anna Helena Reali Costa, and Peter Stone.
In Proceedings of the 8th Brazilian Conference on Intelligent Systems (BRACIS), October 2019.

Abstract

Multiagent reinforcement learning algorithms are designed to enable an autonomous agent to adapt to an opponent's strategy based on experience. However, most such algorithms require a relatively large amount of experience to perform well. This requirement is problematic when opponent interactions are expensive, for example, when the agent has limited access to the opponent during training. In order to make good use of the opponent as a resource to support learning, we propose SElf-PLay by Expert Modeling (SEPLEM), an algorithm that models the opponent policy in a few episodes, and uses it to train in a simulated environment where it is cheaper to perform learning steps than in the real environment. Our empirical evaluation indicates that SEPLEM, by iteratively building a Curriculum of simulated tasks, achieves better performance than both only playing against the expert and using pure Self-Play techniques. SEPLEM is a promising technique to accelerate learning in multiagent adversarial tasks.
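The loop described in the abstract (observe the expert for a few episodes, fit a model of its policy, then train against that model in cheap simulation, and repeat) can be illustrated with a toy sketch. This is not the SEPLEM implementation from the paper: the rock-paper-scissors game, the rock-biased expert, the frequency-count opponent model, the sample-average value learner, and every function name below are assumptions chosen only to keep the example small and self-contained.

```python
import random
from collections import Counter

# Hypothetical toy game (not from the paper): rock-paper-scissors.
ACTIONS = ("rock", "paper", "scissors")
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}  # key beats value


def payoff(mine, theirs):
    """Return +1 for a win, -1 for a loss, 0 for a draw."""
    if mine == theirs:
        return 0
    return 1 if BEATS[mine] == theirs else -1


def expert_action(rng):
    """Hypothetical expert opponent: a fixed policy biased toward rock."""
    return rng.choices(ACTIONS, weights=(0.6, 0.2, 0.2))[0]


def model_opponent(observed):
    """Estimate the opponent policy as Laplace-smoothed action frequencies."""
    counts = Counter(observed)
    total = len(observed) + len(ACTIONS)
    return {a: (counts[a] + 1) / total for a in ACTIONS}


def train_in_simulation(model, steps, rng):
    """Learn sample-average action values against the modeled opponent."""
    q = {a: 0.0 for a in ACTIONS}
    visits = {a: 0 for a in ACTIONS}
    weights = [model[a] for a in ACTIONS]
    for _ in range(steps):
        mine = rng.choice(ACTIONS)  # uniform exploration; simulated steps are cheap
        theirs = rng.choices(ACTIONS, weights=weights)[0]
        visits[mine] += 1
        q[mine] += (payoff(mine, theirs) - q[mine]) / visits[mine]
    return q


def run(rounds=3, real_episodes=20, sim_steps=5000, seed=0):
    """Alternate a few real episodes with many simulated learning steps."""
    rng = random.Random(seed)
    observed = []
    model = model_opponent(observed)
    q = {a: 0.0 for a in ACTIONS}
    for _ in range(rounds):
        # Expensive part: a small number of episodes against the real expert.
        observed += [expert_action(rng) for _ in range(real_episodes)]
        # Cheap part: refit the opponent model and train against it in simulation.
        model = model_opponent(observed)
        q = train_in_simulation(model, sim_steps, rng)
    return model, q


if __name__ == "__main__":
    model, q = run()
    print("estimated opponent policy:", model)
    print("learned action values:", q)
```

Each round in `run` re-estimates the opponent from all observations so far and restarts learning against the updated model, which loosely mirrors the idea of building a sequence of simulated tasks. How the paper actually models the expert and constructs its curriculum is not specified in the abstract, so those parts of the sketch should be read as placeholders.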

BibTeX Entry

@InProceedings{BRACIS2019-Leno,
  author={Felipe Leno Da Silva and Anna Helena Reali Costa and Peter Stone},
  title={Building Self-Play Curricula Online by Playing with Expert Agents in Adversarial Games},
  booktitle={Proceedings of the 8th Brazilian Conference on Intelligent Systems (BRACIS)},
  location={Salvador, Bahia, Brazil},
  year={2019},
  abstract={Multiagent reinforcement learning algorithms are designed to enable an
    autonomous agent to adapt to an opponent's strategy based on experience.
    However, most such algorithms require a relatively large amount of
    experience to perform well. This requirement is problematic when opponent
    interactions are expensive, for example, when the agent has limited access
    to the opponent during training. In order to make good use of the opponent
    as a resource to support learning, we propose SElf-PLay by Expert Modeling
    (SEPLEM), an algorithm that models the opponent policy in a few episodes,
    and uses it to train in a simulated environment where it is cheaper to
    perform learning steps than in the real environment. Our empirical
    evaluation indicates that SEPLEM, by iteratively building a Curriculum of
    simulated tasks, achieves better performance than both only playing against
    the expert and using pure Self-Play techniques. SEPLEM is a promising
    technique to accelerate learning in multiagent adversarial tasks.
  },
  month={October}
}
