There’s a saying, albeit a cliché, that two heads are better than one. Unsurprisingly, the idiom extends to artificial agents. In the field of AI, researchers have been working to understand how to make independent agents, which may have different goals, work together in an environment to complete a shared task. A group of researchers in Texas Computer Science (TXCS) comprising Ishan Durugkar, Elad Liebman, and TXCS professor Peter Stone has been working to solve this problem. In “Balancing Individual Preferences and Shared Objectives in Multiagent Reinforcement Learning,” included in the proceedings of the 29th International Joint Conference on Artificial Intelligence (IJCAI) in July, the team explores frameworks in which agents’ preferences are balanced to allow for more efficient task completion.

Artificial agents, much like humans, may need to work together to solve a problem. Unlike humans, however, artificial agents don’t have an innate ability to reason or cooperate. Teams of researchers, scientists, and engineers must develop methods for programming an agent to understand its own goals and those of others so that it can become a better team member. While agents assigned to a task may share the same end goal, such as organizing a room, the subtasks they are programmed for and their methods of reaching that goal can vary drastically. This can lead to a group of agents that work against, rather than with, each other. To make the process as efficient as possible, each agent must understand and work within the parameters of the other agents’ goals.

Durugkar illustrated the issue in terms of a band. “Consider a group of musicians,” he said. “Each of them might have a preference on which type of song they would like to perform, but ultimately they want to entertain their audience.” That’s where the research team’s work steps in: they examine “how to enable agents to cooperate in such a scenario by balancing their preferences with the shared task.”

The team taught the artificial agents “using the paradigm of reinforcement learning.” In a scenario where agents share a task but each may have its own preference for how to complete it, the researchers studied the behavior of these agents “with varying degrees of selfishness when they tried to collaborate on a task.” Selfishness, in this context, means an agent’s tendency to follow its individual preference rather than acquiescing to the preferences of the other agents. Unexpectedly, they found that in these scenarios, agents “being a little selfish and attempting to satisfy their individual preferences while solving the shared task actually leads to faster learning and coordination.” In addition, they validated through experiments a way to “find how much each agent should focus on its own preference as opposed to the shared task.”
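To give a rough sense of how such a balance might be expressed in code, the sketch below blends a shared task reward with an agent’s individual preference reward using a per-agent “selfishness” weight. This is only an illustration of the general idea, not the authors’ implementation; the coefficient alpha and the function and variable names are hypothetical.

```python
def mixed_reward(shared_reward: float, preference_reward: float, alpha: float) -> float:
    """Blend the shared objective with an agent's individual preference.

    alpha = 0.0 -> purely cooperative (learns from the shared task reward only)
    alpha = 1.0 -> purely selfish (learns from its own preference reward only)
    """
    return (1.0 - alpha) * shared_reward + alpha * preference_reward


# Illustrative example: two agents receive the same shared reward but have
# different individual preferences and different degrees of selfishness.
shared = 1.0              # reward for progress on the shared task (assumed value)
preferences = [0.2, 0.8]  # each agent's preference-based reward (assumed values)
alphas = [0.1, 0.3]       # per-agent selfishness weights (assumed values)

blended = [mixed_reward(shared, p, a) for p, a in zip(preferences, alphas)]
print(blended)  # each agent would learn from its own blended reward signal
```

In this reading, the experimental question of “how much each agent should focus on its own preference” amounts to choosing a small but nonzero alpha for each agent.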

Read more about “Balancing Individual Preferences and Shared Objectives in Multiagent Reinforcement Learning” on Peter Stone's selected publications page.