Peter Stone's Selected Publications



VGC-Bench: Towards Mastering Diverse Team Strategies in Competitive Pokémon

VGC-Bench: Towards Mastering Diverse Team Strategies in Competitive Pokémon.
Cameron Angliss, Jiaxun Cui, Jiaheng Hu, Arrasy Rahman, and Peter Stone.
In International Conference on Autonomous Agents and Multiagent Systems, May 2026.

Abstract

Developing AI agents that can robustly adapt to varying strategic landscapes without retraining is a central challenge in multi-agent learning. Pokémon Video Game Championships (VGC) is a domain with a vast space of approximately 10^139 team configurations, far larger than those of other games such as Chess, Go, Poker, StarCraft, or Dota. The combinatorial nature of team building in Pokémon VGC causes optimal strategies to vary substantially depending on both the controlled team and the opponent's team, making generalization uniquely challenging. To advance research on this problem, we introduce VGC-Bench: a benchmark that provides critical infrastructure, standardizes evaluation protocols, and supplies a human-play dataset of over 700,000 battle logs and a range of baseline agents based on heuristics, large language models, behavior cloning, and multi-agent reinforcement learning with empirical game-theoretic methods such as self-play, fictitious play, and double oracle. In the restricted setting where an agent is trained and evaluated in a mirror match with a single team configuration, our methods can win against a professional VGC competitor. We repeat this training and evaluation with progressively larger team sets and find that as the number of teams increases, the best-performing algorithm in the single-team setting has worse performance and is more exploitable, but has improved generalization to unseen teams. Our code and dataset are open-sourced at https://github.com/cameronangliss/vgc-bench and https://huggingface.co/datasets/cameronangliss/vgc-battle-logs.

BibTeX Entry

@InProceedings{angliss2026vgc,
  author   = {Cameron Angliss and Jiaxun Cui and Jiaheng Hu and Arrasy Rahman and Peter Stone},
  title    = {VGC-Bench: Towards Mastering Diverse Team Strategies in Competitive Pokémon},
  booktitle = {International Conference on Autonomous Agents and Multiagent Systems},
  year     = {2026},
  month    = {May},
  location = {Paphos, Cyprus},
  abstract = {Developing AI agents that can robustly adapt to varying strategic landscapes
without retraining is a central challenge in multi-agent learning. Pokémon Video
Game Championships (VGC) is a domain with a vast space of approximately 10^139
team configurations, far larger than those of other games such as Chess, Go,
Poker, StarCraft, or Dota. The combinatorial nature of team building in Pokémon
VGC causes optimal strategies to vary substantially depending on both the
controlled team and the opponent's team, making generalization uniquely
challenging. To advance research on this problem, we introduce VGC-Bench: a
benchmark that provides critical infrastructure, standardizes evaluation
protocols, and supplies a human-play dataset of over 700,000 battle logs and a
range of baseline agents based on heuristics, large language models, behavior
cloning, and multi-agent reinforcement learning with empirical game-theoretic
methods such as self-play, fictitious play, and double oracle. In the restricted
setting where an agent is trained and evaluated in a mirror match with a single
team configuration, our methods can win against a professional VGC competitor. We
repeat this training and evaluation with progressively larger team sets and find
that as the number of teams increases, the best-performing algorithm in the
single-team setting has worse performance and is more exploitable, but has
improved generalization to unseen teams. Our code and dataset are open-sourced at
https://github.com/cameronangliss/vgc-bench and
https://huggingface.co/datasets/cameronangliss/vgc-battle-logs.
  },
}
