The Role of Reward Structure, Coordination Mechanism and Net Return in the Evolution of Cooperation (2011)

Authors: Padmini Rajagopalan, Aditya Rawal

The following videos show the effect of three different factors on the evolution of cooperation in a team of predators hunting prey: reward structure, coordination mechanism and net return.

Three predators were coevolved using the Multi-Component ESP architecture, in which each predator agent consisted of multiple neural networks that sensed the other agents on the field. The outputs of these networks were fed to a combiner network that decided the agent's next move. The weights of all these networks were evolved, and each predator's fitness was distributed equally among its component networks. The goal of the predators was to capture as many prey as possible within the simulation time limit. The reward from a prey capture was either shared among all three predators or given only to the predator that caught the prey. The coordination mechanism could also be varied: the predators either could see one another (direct communication) or could not (stigmergic coordination).
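The component-plus-combiner structure described above can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the class names `Network` and `Predator`, the layer sizes, the tanh activation, the use of (dx, dy) offsets as sensor inputs, and the argmax move selection are all assumptions, and random weights stand in for the evolved ones.

```python
import math
import random

MOVES = ("east", "west", "north", "south")


def make_layer(n_in, n_out, rng):
    # One weight row per output unit; the extra column is a bias weight.
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_in + 1)] for _ in range(n_out)]


def forward(layer, inputs):
    # Fully connected layer with tanh activation (activation is an assumption).
    return [math.tanh(sum(w * x for w, x in zip(row, inputs + [1.0]))) for row in layer]


class Network:
    """Small feed-forward network; random weights stand in for evolved ones."""

    def __init__(self, sizes, rng):
        self.layers = [make_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]

    def __call__(self, inputs):
        activations = list(inputs)
        for layer in self.layers:
            activations = forward(layer, activations)
        return activations


class Predator:
    """Hypothetical agent: one component network per sensed agent, plus a combiner."""

    def __init__(self, n_sensed_agents, rng, hidden=4, component_out=4):
        self.components = [
            Network([2, hidden, component_out], rng) for _ in range(n_sensed_agents)
        ]
        self.combiner = Network(
            [n_sensed_agents * component_out, hidden, len(MOVES)], rng
        )

    def choose_move(self, offsets):
        # Each component network sees the (dx, dy) offset to one other agent.
        features = []
        for net, (dx, dy) in zip(self.components, offsets):
            features.extend(net([dx, dy]))
        # The combiner turns all component outputs into one score per move.
        scores = self.combiner(features)
        return MOVES[scores.index(max(scores))]


predator = Predator(n_sensed_agents=3, rng=random.Random(0))
print(predator.choose_move([(1, 0), (-2, 3), (0, -1)]))
```

In an evolutionary run, the weight lists inside every `Network` would be the genomes being evolved, and the fitness earned by a `Predator` would be credited to each of its component networks.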

The prey in these experiments were scripted: each always moved directly away from the nearest predator. There were two kinds of prey: zebras and gazelles. The zebras were as fast as the predators and thus difficult to catch, but they gave a higher reward on capture. The gazelles were slower than the predators, so a single predator could catch a gazelle without any help, but the reward for capturing a gazelle was low.
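A scripted flee rule of this kind might look like the sketch below. It assumes the wrap-around (toroidal) grid used in these experiments, Manhattan distance for "nearest", and that "directly away" means stepping along the axis of greater separation; the function names, the coordinate convention (north is +y) and the tie-breaking are illustrative choices, not details from the page.

```python
def wrapped_delta(a, b, size):
    """Shortest signed offset from a to b along one axis of a torus of length size."""
    d = (b - a) % size
    return d - size if d > size // 2 else d


def torus_distance(p, q, size):
    """Manhattan distance between two cells on a size x size toroidal grid."""
    return abs(wrapped_delta(p[0], q[0], size)) + abs(wrapped_delta(p[1], q[1], size))


def flee_move(prey, predators, size):
    """Return the move that takes the prey away from its nearest predator."""
    nearest = min(predators, key=lambda pred: torus_distance(prey, pred, size))
    # Offset from the predator to the prey; moving further along it means fleeing.
    dx = wrapped_delta(nearest[0], prey[0], size)
    dy = wrapped_delta(nearest[1], prey[1], size)
    if abs(dx) >= abs(dy):
        return "east" if dx >= 0 else "west"
    return "north" if dy >= 0 else "south"


# The predator at (3, 5) is nearest and lies to the west, so the prey heads east.
print(flee_move((5, 5), [(3, 5), (9, 9)], size=20))
```

Because offsets are wrapped, a predator at x = 19 counts as immediately west of a prey at x = 0 on a 20-cell grid, so the prey still flees east.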

The predators were evaluated in a toroidal grid world. Each predator or prey agent could move in four directions (east, west, north, south), and all agents moved simultaneously, one move per time step. In the first four experiments, the only prey were four zebras, which no single predator could catch on its own: the predators needed to surround a zebra from different directions to catch it. Thus, any prey capture in the first four experiments was considered a cooperative move by the predators.
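One way to realize simultaneous moves on a toroidal grid is a two-phase update: every agent first picks a move from the same snapshot of the world, and only then are all moves applied, with coordinates wrapped at the edges. The sketch below assumes a square grid; the `step` function, the agent names and the policy callables are hypothetical, and collision and capture handling are left out because the page does not specify them.

```python
MOVES = {"east": (1, 0), "west": (-1, 0), "north": (0, 1), "south": (0, -1)}


def step(positions, policies, size):
    """Advance every agent by one simultaneous move on a size x size torus.

    positions: dict mapping agent name -> (x, y)
    policies:  dict mapping agent name -> callable(positions) returning a move name
    """
    # Phase 1: all agents decide based on the same, unmodified snapshot.
    chosen = {name: policies[name](positions) for name in positions}
    # Phase 2: apply all moves together, wrapping around the grid edges.
    new_positions = {}
    for name, (x, y) in positions.items():
        dx, dy = MOVES[chosen[name]]
        new_positions[name] = ((x + dx) % size, (y + dy) % size)
    return new_positions


positions = {"predator": (19, 0), "zebra": (0, 0)}
policies = {"predator": lambda state: "east", "zebra": lambda state: "south"}
print(step(positions, policies, size=20))
# The predator wraps from x = 19 to x = 0; the zebra wraps from y = 0 to y = 19.
```

Separating the decision phase from the update phase keeps any agent from reacting to a move that another agent makes in the same time step.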

In the videos below, the colored cubes are predators, the black-and-white spheres are zebras and the brown spheres are gazelles.

Experiment 1: Individual Fitness, Stigmergic Coordination

When the predators neither communicated nor shared fitness, they initially did not evolve to cooperate to catch the prey, and the prey easily eluded individual predators.

Experiment 2: Shared Rewards, Stigmergic Coordination

When prey-capture rewards are shared, the predators have a direct incentive to collaborate, and they quickly evolve specific roles for cooperating to catch the prey.

Experiments 3 and 4: Individual/Shared Fitness, Direct Communication

Communicating predators have more flexible behaviors, i.e. they can change roles in the middle of the hunt. In this video, the green predator sometimes acts as a blocker and sometimes as an attacker.

In the following two experiments (5 and 6), two kinds of prey are on the field at the same time: one zebra and four gazelles. The zebra and the gazelles differ in both the difficulty of capture and the reward gained from it. The zebra can be caught only if the predators cooperate, and its capture gives a reward of either 150 or 450 to all three predators. A gazelle can be caught by a single predator on its own, and only that predator receives the reward of 100.
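The reward scheme for experiments 5 and 6 can be written down as a small function. This is a sketch under one loudly stated assumption: "to all three predators" is read here as each predator being credited the full zebra reward rather than a one-third share, which the page does not settle. The function name and signature are illustrative.

```python
GAZELLE_REWARD = 100  # value given on the page


def allocate_rewards(prey_kind, capturer, zebra_reward=150, n_predators=3):
    """Return the reward credited to each predator for one capture.

    prey_kind:    "zebra" or "gazelle"
    capturer:     index of the predator that made the capture
    zebra_reward: 150 in experiment 5, 450 in experiment 6
    """
    if prey_kind == "zebra":
        # Assumption: every predator is credited the full zebra reward.
        return [zebra_reward] * n_predators
    if prey_kind == "gazelle":
        # Only the capturing predator is rewarded.
        rewards = [0] * n_predators
        rewards[capturer] = GAZELLE_REWARD
        return rewards
    raise ValueError(f"unknown prey kind: {prey_kind!r}")


print(allocate_rewards("zebra", capturer=0, zebra_reward=450))
print(allocate_rewards("gazelle", capturer=1))
```

If the zebra reward were instead split among the three predators, only the zebra branch would change, to `[zebra_reward / n_predators] * n_predators`.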

Experiment 5: Zebra capture gives reward of 150

With two different types of prey (zebras and gazelles), whether cooperation evolves or not depends on the value of the prey relative to the difficulty of catching it. When the zebra reward was not much higher than the gazelle reward, the predators did not evolve cooperation, preferring to catch gazelles on their own.

Experiment 6: Zebra capture gives reward of 450

When the reward for catching the zebra is much higher, the predators evolve to cooperate to catch it first. Once the zebra is caught, the predators return to hunting the gazelles individually.

People

Kay E. Holekamp	Formerly affiliated Collaborator	holekamp [at] msu edu
Padmini Rajagopalan	Postdoctoral Alumni	padminir [at] utexas edu
Aditya Rawal	Ph.D. Alumni	aditya [at] cs utexas edu

Projects

Coevolution of Competitive and Cooperative Agent Behavior	2009 - Present
Learning Strategic Behavior in Sequential Decision Tasks	2009 - 2014
The Role of Emotion and Communication in Cooperative Behavior	2013 - 2016

Publications

Neuroevolution Insights Into Biological Neural Computation	2025
Risto Miikkulainen, Science, Vol. 387 (2025), pp. eadp7478.
Neuroevolution: Harnessing Creativity in AI Model Design	2025
Sebastian Risi, David Ha, Yujin Tang, Risto Miikkulainen, MIT Press, Cambridge, MA, 2025.
IJCNN-2013 Tutorial on Evolution of Neural Networks	2013
Risto Miikkulainen, unpublished tutorial slides.
The Role of Reward Structure, Coordination Mechanism and Net Return in the Evolution of Cooperation	2011
Padmini Rajagopalan, Aditya Rawal, Risto Miikkulainen, Marc A. Wiseman and Kay E. Holekamp, In Proceedings of the IEEE Conference on Computational Intelligence and Games (CIG 2011), Seoul, South Korea, 2011.
Coevolution of Role-Based Cooperation in Multi-Agent Systems	2010
Chern Han Yong and Risto Miikkulainen, IEEE Transactions on Autonomous Mental Development, Vol. 1 (2010), pp. 170--186.

Related Areas

Artificial Life

Labs

Neural Networks