2011 ALIHT Accepted Papers (to be presented at the workshop):

Katrien Beuls and Joris Bleys. Game-based Language Tutoring.  In 2011 IJCAI Workshop on Agents Learning Interactively from Human Teachers (ALIHT), July 2011. PDF

Abstract: As computational agents are increasingly deployed outside research labs, serious effort must be put into their ability to learn new skills and adapt to changing environments. These skills include language processing and language learning. How do we teach natural language to agents so that their linguistic knowledge is robust enough to deal with new situations? We present a case study in which a software agent must acquire the English colour prototypes and their corresponding terms, purely from interactions with a human tutor. To build such a system, we have extended the existing Babel framework [Steels and Loetzsch, 2009], which was designed to model grounded communication and cognitive learning processes in a multi-agent population, so that a human can take up the role of tutor and provide input to the agent learner through a GUI. In this way, the agent's language skills are continually shaped by the variation present in the situations it encounters and the corresponding linguistic input it receives from the tutor.

Santiago Ontañón, Jose L. Montaña and Avelino Gonzalez. Towards a Unified Framework for Learning from Observation. In 2011 IJCAI Workshop on Agents Learning Interactively from Human Teachers (ALIHT), July 2011. PDF

Abstract: This paper discusses recent trends in machine learning towards learning from observation (LfO). These reflect a growing interest in having computers learn as humans do: by observing and thereafter imitating the performance of a task or an action. We discuss the basic foundations of this field and the early research in this area. We then characterize the types of tasks that can be learned from observation and how to evaluate an agent created in this manner. The main contribution of this paper is a joint framework that unifies all previous formalizations of LfO.

Raquel Torres Peralta, Tasneem Kaochar, Ian Fasel, Clay Morrison, Tom Walsh and Paul Cohen. Challenges to Decoding the Intention Behind Natural Instruction (Extended Abstract). In 2011 IJCAI Workshop on Agents Learning Interactively from Human Teachers (ALIHT), July 2011. PDF

Abstract: Currently, most systems for human-robot teaching allow only one mode of teacher-student interaction (e.g., teaching by demonstration or feedback), and teaching episodes have to be carefully set up by an expert. To understand how we might integrate multiple, interleaved forms of human instruction into a robot learner, we performed a behavioral study in which 44 untrained humans were allowed to freely mix interaction modes to teach a simulated robot (secretly controlled by a human) a complex task. Analysis of transcripts showed that human teachers often give instructions that require considerable interpretation and are not easily translated into a form usable by machine learning algorithms. In particular, humans often use implicit instructions, fail to clearly indicate the boundaries of procedures, and tightly interleave testing, feedback, and new instruction. In this paper, we detail these teaching patterns and discuss the challenges they pose to automatic teaching interpretation as well as to the machine-learning algorithms that must ultimately process these instructions. We highlight these challenges by demonstrating the difficulties faced by an initial automatic teacher interpretation system.

Leonel Rozo, Pablo Jimenez and Carme Torras. Robot Learning from Demonstration in the Force Domain. In 2011 IJCAI Workshop on Agents Learning Interactively from Human Teachers (ALIHT), July 2011. PDF

Abstract: Researchers are becoming aware of the importance of information sources besides visual data in robot learning by demonstration (LbD). Force-based perceptions have been shown to convey highly relevant information, missed by visual and position sensors, for learning specific tasks. In this paper, we review some recent works using forces as input data in LbD and human-robot interaction (HRI) scenarios, and propose a complete learning framework for teaching force-based manipulation skills to a robot through a haptic device. We suggest using haptic interfaces not only as a demonstration tool but also as a communication channel between the human and the robot, getting the teacher more involved in the teaching process by letting them experience the force signals sensed by the robot. Within the proposed framework, we provide solutions for treating force signals, extracting relevant information about the task, encoding the training data, and generalizing to perform successfully under unknown conditions.

Edouard Klein, Matthieu Geist and Olivier Pietquin. Batch, Off-policy and Model-Free Apprenticeship Learning. In 2011 IJCAI Workshop on Agents Learning Interactively from Human Teachers (ALIHT), July 2011. PDF

Abstract: This paper addresses the problem of apprenticeship learning, that is, learning control policies from demonstrations by an expert. An efficient framework for it is inverse reinforcement learning (IRL). Based on the assumption that the expert maximizes a utility function, IRL aims at learning the underlying reward from example trajectories. Many IRL algorithms assume that the reward function is linearly parameterized and rely on the computation of some associated feature expectations, which is done through Monte Carlo simulation. However, this assumes access to full trajectories for the expert policy as well as at least a generative model for intermediate policies. In this paper, we introduce a temporal difference method, namely LSTD-μ, to compute these feature expectations. This allows extending apprenticeship learning to a batch and off-policy setting.

Amit Kumar Pandey and Rachid Alami. Towards Task Understanding through Multi-State Visuo-Spatial Perspective Taking for Human-Robot Interaction. In 2011 IJCAI Workshop on Agents Learning Interactively from Human Teachers (ALIHT), July 2011. PDF

Abstract: For a lifelong learning robot, in the context of task understanding, it is important to distinguish the 'meaning' of a task from the 'means' to achieve it. In this paper we select a set of tasks from a typical human-robot interaction scenario, such as show, hide, and make accessible, and illustrate that visuo-spatial perspective taking can be effectively used to understand such tasks' semantics in terms of 'effect'. The idea is that, to understand the 'effects', the robot analyzes the reachability and visibility of an agent not only from the current state of the agent but also from a set of virtual states, which the agent might attain with different levels of effort from its current state. We show that such symbolic understandings of tasks can be generalized to new situations or spatial arrangements, as well as facilitate 'transfer of understanding' among heterogeneous robots. The robot begins to understand the semantics of the task from the first demonstration and continuously refines its understanding with further examples.

Manuel Lopes, Thomas Cederborg and Pierre-Yves Oudeyer. Simultaneous Acquisition of Task and Feedback Models. In 2011 IJCAI Workshop on Agents Learning Interactively from Human Teachers (ALIHT), July 2011. PDF

Abstract: We present a system that learns task representations from ambiguous feedback. We consider an inverse reinforcement learner that receives feedback from a user with an unknown and noisy protocol. The system needs to estimate simultaneously what the task is and how the user is providing the feedback. We further explore the problem of ambiguous protocols by considering that the words used by the teacher have an unknown relation to the action and meaning expected by the robot. This allows the system to start with a set of known symbols and learn the meaning of new ones. We present computational results showing that it is possible to learn the task under noisy and ambiguous feedback. Using an active learning approach, the system is able to reduce the length of the training period.

Kaushik Subramanian, Charles L. Isbell and Andrea L. Thomaz. Learning Options through Human Interaction. In 2011 IJCAI Workshop on Agents Learning Interactively from Human Teachers (ALIHT), July 2011. PDF

Abstract: Hierarchical reinforcement learning solves problems by decomposing them into a set of sub-tasks, or options. In this paper we develop a method of soliciting options from everyday people. We show how humans design actions that naturally decompose the problem, making them compatible with the options framework. We instantiate our approach in the Taxi and Pac-Man domains and show that the human-derived options outperform automated methods of option extraction in terms of both optimality and computation time. Our experiments show that human-derived options generalize across the state space of the problem. Further analysis shows that decompositions given by people often do not have a clear termination state. As such, we discuss the potential use of modular reinforcement learning to approach the problem.

Raquel Fernández, Staffan Larsson, Robin Cooper, Jonathan Ginzburg and David Schlangen. Reciprocal Learning via Dialogue Interaction: Challenges and Prospects. In 2011 IJCAI Workshop on Agents Learning Interactively from Human Teachers (ALIHT), July 2011. PDF

Abstract: Humans learn to communicate with each other by engaging in "language coordination" during dialogue. In this position paper, we present the main ideas behind the challenge of language coordination in human-machine interaction. We review relevant empirical evidence and current approaches to learning conversational agents, and identify some of the problems that must be overcome to meet this challenge.

Sao Mai Nguyen, Adrien Baranes and Pierre-Yves Oudeyer. Constraining the Size Growth of the Task Space with Socially Guided Intrinsic Motivation using Demonstrations. In 2011 IJCAI Workshop on Agents Learning Interactively from Human Teachers (ALIHT), July 2011. PDF

Abstract: This paper presents an algorithm for learning a highly redundant inverse model in continuous, non-preset environments. Our Socially Guided Intrinsic Motivation by Demonstrations (SGIM-D) algorithm combines the advantages of social learning and intrinsic motivation to specialise in a wide range of skills while lessening its dependence on the teacher. SGIM-D is evaluated on a fishing skill learning experiment.

Keith Sullivan and Sean Luke. Multiagent Supervised Training with Agent Hierarchies and Manual Behavior Decomposition. In 2011 IJCAI Workshop on Agents Learning Interactively from Human Teachers (ALIHT), July 2011. PDF

Abstract: We present a supervised learning from demonstration system capable of training stateful and recurrent behaviors, in both the single-agent and multiagent case. Behavior complexity due to statefulness and multiple agents can result in a high-dimensional learning space, which can require many samples to learn properly. However, in real-time behavior training of this sort, samples are potentially expensive to collect. Our approach, which relies heavily on both per-agent behavior decomposition and structuring the multiple agents into a tree hierarchy, can significantly reduce the number of samples and make such training feasible. We demonstrate our system in a simulated collective foraging task where all the agents execute the same behavior set. We also discuss how to extend our approach to a heterogeneous case, where different subgroups of agents perform different behaviors.

Igor Karpov, Vinod Valsalam and Risto Miikkulainen. Assisting Machine Learning Through Shaping, Advice and Examples. In 2011 IJCAI Workshop on Agents Learning Interactively from Human Teachers (ALIHT), July 2011. PDF

Abstract: Many different methods for combining human expertise with machine learning in general, and evolutionary computation in particular, are possible. Which of these methods work best, and do they outperform human design and machine design alone? To answer this question, a human-subject experiment for comparing human-assisted machine learning methods was conducted. Three different approaches (advice, shaping, and demonstration) were employed to assist a powerful machine learning technique (neuroevolution) on a collection of agent training tasks, and contrasted with both a completely manual approach (scripting) and a completely hands-off one (neuroevolution alone). The results show that (1) human-assisted evolution outperforms a manual scripting approach, (2) unassisted evolution performs consistently well across domains, and (3) different methods of assisting neuroevolution outperform unassisted evolution on different tasks. If done right, human-assisted neuroevolution can therefore be a powerful technique for constructing intelligent agents.

Jeremy Ludwig, Alex Davis, Mark Abrams and Jon Curtis. A Hidden Domain for Human and Electronic Students. In 2011 IJCAI Workshop on Agents Learning Interactively from Human Teachers (ALIHT), July 2011. PDF

Abstract: One objective of DARPA's Bootstrapped Learning (BL) program is to develop a general electronic student (eStudent) that uses machine learning algorithms to learn from the kind of focused instruction typically provided by a human teacher. Over the course of the program, such a student was developed and trained against a number of learning challenges codified in electronic curricula spanning disparate domains of interest. To test the capabilities of this eStudent, a "hidden domain" curriculum was created. An electronic version of this curriculum was used to test the eStudent by an independent research group. Additionally, university undergraduate and graduate students were trained and tested using a human-accessible version of this same "hidden domain" curriculum, allowing researchers to gauge the performance of humans on the curriculum and use that performance as a benchmark against which to evaluate the eStudent. This paper provides details on the hidden domain curriculum, which teaches a procedure for the diagnosis and repair of a satellite ground control station. We also present a summary of the available human and eStudent results.

W. Bradley Knox, Matthew Taylor and Peter Stone. Understanding Human Teaching Modalities in Reinforcement Learning Environments: A Preliminary Report. In 2011 IJCAI Workshop on Agents Learning Interactively from Human Teachers (ALIHT), July 2011. PDF

Abstract: While traditional agent-based learning techniques have enjoyed considerable success, in recent years there has been growing interest in improving such learning by leveraging humans as teachers. These human-in-the-loop methods have demonstrated substantial improvements by using human subjects in a variety of interaction modalities. Unfortunately, there are few, if any, guidelines about when one teaching modality is more appropriate than another. In addition to highlighting this important gap in the current literature, this paper presents a pilot study that compares two specific teaching modalities, learning by feedback and learning by demonstration, and proposes a set of hypotheses about their relative performance.