Ian Fasel, Michael Quinlan, Peter Stone. "A General Purpose Task Specification Language for Bootstrap Learning". The University of Texas at Austin, Department of Computer Sciences. Report# AI08-1 (ai technical report). May 9, 2008. 6 pages.
Reinforcement learning (RL) is an effective framework for online learning by autonomous agents. Most RL research focuses on domain-independent learning algorithms, requiring an expert human to define the environment (state and action representation) and task to be performed (e.g., start state and reward function) on a case-by-case basis. In this paper, we describe a general language for a teacher to specify sequential decision making tasks to RL agents. The teacher may communicate properties such as start states, reward functions, termination conditions, successful execution traces, task decompositions, and other advice. The learner may then practice and learn the task on its own using any RL algorithm. We demonstrate our language in a simple GridWorld example and on the RoboCup soccer keepaway benchmark problem. The language forms the basis of a larger "Bootstrap Learning" model for machine learning, a paradigm for incremental development of complete systems through integration of multiple machine learning techniques.
Sudheendra Vijayanarasimhan and Kristen Grauman. "Multi-Level Active Prediction of Useful Image Annotations for Recognition". The University of Texas at Austin, Department of Computer Sciences. Report# AI08-2 (ai technical report). May 15, 2008.
We introduce a framework for actively learning visual categories from a mixture of weakly and strongly labeled image examples. In order to alleviate the burden of intensive supervision, we propose to allow the category-learner to strategically choose what annotations it receives, based not only on the expected reduction in uncertainty, but also on the predicted tradeoff in the relative costs of obtaining each annotation. We construct a multiple-instance discriminative classifier based on the initial training data. Then all remaining unlabeled and weakly labeled examples are surveyed to actively determine which annotation ought to be requested next. Once the chosen annotation is requested, the user's response is used to incrementally update the current classifier before the next active selection is made. Unlike previous work, our approach accounts for the fact that the optimal use of manual annotation may call for a combination of annotations at multiple levels (e.g., a full segmentation on some images and a present/absent flag on others). We develop a decision-theoretic value function that weighs the expected information gain against the cost of obtaining each label. As a result, it is possible to learn more accurate category models with a lower total expenditure of manual annotation effort.
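The selection rule in this abstract (weigh expected information gain against annotation cost, then request the best annotation) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `Candidate` type, the `net_value` and `select_next` names, the numbers, and the premise that gain and cost are already expressed in comparable units. It is not the value function from the report.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One possible annotation request (hypothetical structure)."""
    image_id: str
    annotation_type: str
    expected_gain: float  # assumed estimate of information gain
    cost: float           # assumed annotation cost, in the same units


def net_value(candidate: Candidate) -> float:
    # Weigh the expected gain against the cost of obtaining the label.
    return candidate.expected_gain - candidate.cost


def select_next(candidates: list[Candidate]) -> Candidate:
    # Request the annotation with the highest net value.
    return max(candidates, key=net_value)


# Illustrative numbers only.
candidates = [
    Candidate("img_01", "present/absent flag", expected_gain=0.30, cost=0.05),
    Candidate("img_01", "full segmentation", expected_gain=0.90, cost=0.80),
    Candidate("img_02", "full segmentation", expected_gain=1.20, cost=0.70),
]
best = select_next(candidates)
print(best.image_id, best.annotation_type)
```

In this toy setting a cheap flag on one image can outrank an expensive segmentation of the same image, while a segmentation of another image may still be worth its cost, which is the multi-level behavior the abstract describes.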
Yong Jae Lee and Kristen Grauman. "Foreground Focus: Finding Meaningful Features in Unlabeled Images". The University of Texas at Austin, Department of Computer Sciences. Report# AI08-3 (ai technical report). May 21, 2008. 10 pages.
We present a method to automatically discover meaningful features in unlabeled image collections. Each image is decomposed into semi-local features that describe neighborhood appearance and geometry. The goal is to determine for each image which of these parts are most relevant, given the image content in the remainder of the collection. Our method first computes an initial image-level grouping based on feature correspondences, and then iteratively refines cluster assignments based on the evolving intra-cluster pattern of local matches. As a result, the significance attributed to each feature influences an image's cluster membership, while related images in a cluster affect the estimated significance of their features. We show that this mutual reinforcement of object-level and feature-level similarity improves unsupervised image clustering, and apply the technique to automatically discover categories and foreground regions in images from benchmark datasets.
Yong Jae Lee and Kristen Grauman. "Discovering Multi-Aspect Structure to Learn From Loosely Labeled Image Collections". The University of Texas at Austin, Department of Computer Sciences. Report# AI08-4 (ai technical report). May 26, 2008. 13 pages.
Tremendous amounts of images readily available on the Web have simple annotations and tags. However, at best these tags can be considered as "loose labels", since the richness of the visual information in the images far exceeds the description a few words can provide. While the coarse tags do not translate to a unified visual concept---making it difficult to directly use tagged images to train standard classifiers---methods capable of reproducing such high-level annotations on similar images would be quite useful. We propose a new technique to deal with the high intra-class variability common among such collections. We develop a spectral clustering approach that detects the natural sub-classes present in a pool of loosely labeled images based on their consistency in appearance and local spatial layout. These automatically discovered sub-classes are then used to train multiple finer-scale classifiers, which makes it possible to learn the different local properties of the high-level visual category that could perplex a single global learner trained from the entire pool. We compare our method with relevant baselines on multiple datasets, and demonstrate the value of learning to separately discriminate the multi-aspect structures.
Matthew E. Taylor. "Autonomous Inter-Task Transfer in Reinforcement Learning Domains". The University of Texas at Austin, Department of Computer Sciences. Report# AI08-5 (ai dissertation). July 10, 2008. 319 pages.
Reinforcement learning (RL) methods have become popular in recent years because of their ability to solve complex tasks with minimal feedback. While these methods have had experimental successes and have been shown to exhibit some desirable properties in theory, the basic learning algorithms have often been found slow in practice. Therefore, much of the current RL research focuses on speeding up learning by taking advantage of domain knowledge, or by better utilizing agents' experience. The ambitious goal of transfer learning, when applied to RL tasks, is to accelerate learning on some target task after training on a different, but related, source task. This dissertation demonstrates that transfer learning methods can successfully improve learning in RL tasks via experience from previously learned tasks. Transfer learning can increase RL's applicability to difficult tasks by allowing agents to generalize their experience across learning problems. This dissertation presents inter-task mappings, the first transfer mechanism in this area to successfully enable transfer between tasks with different state variables and actions. Inter-task mappings have subsequently been used by a number of transfer researchers. A set of six transfer learning algorithms are then introduced. While these transfer methods differ in terms of what base RL algorithms they are compatible with, what type of knowledge they transfer, and what their strengths are, all utilize the same inter-task mapping mechanism. These transfer methods can all successfully use mappings constructed by a human from domain knowledge, but there may be situations in which domain knowledge is unavailable, or insufficient, to describe how two given tasks are related. We therefore also study how inter-task mappings can be learned autonomously by leveraging existing machine learning algorithms. 
Our methods use classification and regression techniques to successfully discover similarities between data gathered in pairs of tasks, culminating in what is currently one of the most robust mapping-learning algorithms for RL transfer. Combining transfer methods with these similarity-learning algorithms allows us to empirically demonstrate the plausibility of autonomous transfer. We fully implement these methods in four domains (each with different salient characteristics), show that transfer can significantly improve an agent's ability to learn in each domain, and explore the limits of transfer's applicability.
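The idea of an inter-task mapping can be sketched loosely as follows: a target-task state and action are translated into source-task terms so that values learned in the source task can be reused as a starting point. This is a simplified illustration under stated assumptions, not one of the dissertation's algorithms. The function `transfer_q`, the dictionary-based Q-table, and all variable and action names are hypothetical.

```python
def transfer_q(source_q, variable_map, action_map, default=0.0):
    """Build a target-task value lookup from a source-task Q-table.

    source_q:     dict mapping (source_state_tuple, source_action) -> value,
                  with state tuples ordered by sorted source variable name
                  (an assumption of this sketch).
    variable_map: dict mapping each source variable to the target variable
                  whose value fills it.
    action_map:   dict mapping each target action to a source action.
    """
    ordered = sorted(variable_map.items())

    def target_q(target_state, target_action):
        # Translate the target state and action into source-task terms.
        source_state = tuple(target_state[tvar] for _, tvar in ordered)
        source_action = action_map[target_action]
        return source_q.get((source_state, source_action), default)

    return target_q


# Hypothetical source task: one state variable, two actions.
source_q = {
    (("near",), "hold"): 0.2,
    (("near",), "pass"): 0.8,
}
# Hypothetical target task: more state variables and more actions.
variable_map = {"dist_to_taker": "dist_to_taker_1"}
action_map = {"hold": "hold", "pass_near": "pass", "pass_far": "pass"}

q = transfer_q(source_q, variable_map, action_map)
state = {"dist_to_taker_1": "near", "dist_to_taker_2": "far"}
print(q(state, "pass_far"))
```

Several target actions may map to the same source action, which is what lets a larger target task borrow values from a smaller source task in this sketch.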
Patrick Beeson, Joseph Modayil, and Benjamin Kuipers. "Factoring the mapping problem: Mobile robot map-building in the Hybrid Spatial Semantic Hierarchy". The University of Texas at Austin, Department of Computer Sciences. Report# AI08-6 (ai tech report). October 24, 2008. 57 pages.
We propose a factored approach to mobile robot map-building that handles qualitatively different types of uncertainty by combining the strengths of topological and metrical approaches. Our framework is based on a computational model of the human cognitive map; thus it allows robust navigation and communication within several different spatial ontologies. This paper focuses exclusively on the issue of map-building using the framework. Our approach factors the mapping problem into natural sub-goals: building a metrical representation for local small-scale spaces; finding a topological map that represents the qualitative structure of large-scale space; and (when necessary) constructing a metrical representation for large-scale space using the skeleton provided by the topological map. We describe how to abstract a symbolic description of the robot's immediate surround from local metrical models, how these local symbolic models are combined to build global symbolic models, and how to create a globally consistent metrical map from a topological skeleton by connecting local frames of reference.
Patrick Beeson. "Creating and Utilizing Symbolic Representations of Spatial Knowledge using Mobile Robots". The University of Texas at Austin, Department of Computer Sciences. Report# AI08-7 (ai dissertation). October 24, 2008. 248 pages.
A map is a description of an environment allowing an agent---a human, or in our case a mobile robot---to plan and perform effective actions. From a single location, an agent's sensors cannot observe the whole structure of a complex, large environment. For this reason, the agent must build a map from observations gathered over time and space. We distinguish between large-scale space, with spatial structure larger than the agent's sensory horizon, and small-scale space, with structure within the sensory horizon. We propose a factored approach to mobile robot map-building that handles qualitatively different types of uncertainty by combining the strengths of topological and metrical approaches. Our framework is based on a computational model of the human cognitive map; thus it allows robust navigation and communication within several different spatial ontologies. Our approach factors the mapping problem into natural sub-goals: building a metrical representation for local small-scale spaces; finding a topological map that represents the qualitative structure of large-scale space; and (when necessary) constructing a metrical representation for large-scale space using the skeleton provided by the topological map. The core contributions of this thesis are a formal description of the Hybrid Spatial Semantic Hierarchy (HSSH), a framework for both small-scale and large-scale representations of space, and an implementation of the HSSH that allows a robot to ground the large-scale concepts of place and path in a metrical model of the local surround. Given metrical models of the robot's local surround, we argue that places at decision points in the world can be grounded by the use of a primitive called a gateway. Gateways separate different regions in space and have a natural description at intersections and in doorways.
We provide an algorithmic definition of gateways, a theory of how they contribute to the description of paths and places, and practical uses of gateways in spatial mapping and learning.
Todd Hester, Michael Quinlan, and Peter Stone. "UT Austin Villa 2008: Standing On Two Legs". The University of Texas at Austin, Department of Computer Sciences. Report# AI08-8 (ai tech report). October 30, 2008.
In 2008, UT Austin Villa entered a team in the first Nao competition of the Standard Platform League of the RoboCup competition. The team had previous experience in RoboCup in the Aibo leagues. Using this past experience, the team developed an entirely new codebase for the Nao. Development took place from December 2007 until the competition in July of 2008. This technical report describes the algorithms and code developed by the team for the 2008 RoboCup competition in Suzhou, China. A major development was a software architecture designed for easy use, extensibility, and debuggability. On top of this architecture, the team developed modules for vision, localization, motion, and behaviors. These developments provide a strong foundation for our team to compete successfully in the Standard Platform League in future RoboCup competitions.
Jacob Schrum. "Competition Between Reinforcement Learning Methods in a Predator-Prey Grid World". The University of Texas at Austin, Department of Computer Sciences. Report# AI08-9 (ai tech report). November 12, 2008. 14 pages.
Tabular and linear function approximation based variants of Monte Carlo, temporal difference, and eligibility trace based learning methods are compared in a simple predator-prey grid world from which the prey is able to escape. These methods are compared both in terms of how well they lead a prey agent to escape randomly moving predators, and in terms of how well they do in competition with each other when one agent controls the prey and each of the predators is controlled by a different type of agent. Results show that tabular methods, which must use a partial state representation due to the size of the full state space, actually do surprisingly well against linear function approximation methods, which can make use of a full state representation and generalize their behavior across states.
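The contrast between tabular and linear function approximation methods can be made concrete with a minimal sketch of a one-step temporal difference, TD(0), update in each form. This is a generic textbook-style illustration, not the report's implementation. The function names, the step size `alpha`, the discount `gamma`, and the toy states and features are all assumptions.

```python
def td0_tabular_update(values, state, reward, next_state, alpha=0.1, gamma=0.9):
    """Tabular TD(0): one stored value per state; unseen states default to 0."""
    td_error = reward + gamma * values.get(next_state, 0.0) - values.get(state, 0.0)
    values[state] = values.get(state, 0.0) + alpha * td_error
    return td_error


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def td0_linear_update(weights, features, reward, next_features, alpha=0.1, gamma=0.9):
    """Linear TD(0): the value is a weighted sum of features, so an update
    to one state also shifts the values of states with shared features."""
    td_error = reward + gamma * dot(weights, next_features) - dot(weights, features)
    for i, f in enumerate(features):
        weights[i] += alpha * td_error * f
    return td_error


# Toy usage with made-up states and features.
values = {}
td0_tabular_update(values, "A", reward=1.0, next_state="B")

weights = [0.0, 0.0]
td0_linear_update(weights, [1.0, 0.0], reward=1.0, next_features=[0.0, 1.0])
print(values, weights)
```

The tabular update touches only the visited state, while the linear update changes weights shared across states, which is the generalization property the abstract attributes to linear function approximation.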
Yong Jae Lee and Kristen Grauman. "Shape Discovery from Unlabeled Image Collections". The University of Texas at Austin, Department of Computer Sciences. Report# AI08-10 (ai tech report). November 21, 2008.
Can we discover common object shapes within unlabeled multi-category collections of images? While often a critical cue at the category-level, contour matches can be difficult to isolate reliably from edge clutter---even within labeled images from a known class. We propose a shape discovery method in which local appearance (patch) matches serve to anchor the surrounding edge fragments, yielding a more reliable affinity function for images that accounts for both shape and appearance. Spectral clustering from the initial affinities provides candidate object clusters. Then, we compute the within-cluster match patterns to discern foreground edges from clutter, attributing higher weight to edges more likely to belong to a common object. In addition to discovering the object contours in each image, we show how to summarize what is found with prototypical shapes. Our results on benchmark datasets demonstrate the approach can successfully discover shapes from unlabeled images.
Jaechul Kim, Kristen Grauman. "Observe Locally, Infer Globally: a Space-Time MRF for Detecting Abnormal Activities with Incremental Updates". The University of Texas at Austin, Department of Computer Sciences. Report# AI08-11 (ai tech report). November 21, 2008.
NO ABSTRACT
Marshall R. Mayberry, III and Risto Miikkulainen. "Incremental Nonmonotonic Sentence Interpretation through Semantic Self-Organization". The University of Texas at Austin, Department of Computer Sciences. Report# AI08-12 (ai tech report). November 24, 2008. 60 pages.
Subsymbolic systems have been successfully used to model several aspects of human language processing. Yet, it has proven difficult to scale them up to realistic language. They have limited memory capacity, long training times, and difficulty representing the wealth of linguistic structure. In this paper, a new connectionist model, INSOMNET, is presented that scales up by utilizing semantic self-organization. INSOMNET was trained on semantic dependency graph representations from the Redwoods Treebank of sentences from the VerbMobil project. The results show that INSOMNET learns to represent these semantic dependencies accurately and generalizes to novel structures. Further evaluation of INSOMNET on the original spoken language transcripts shows that it can also process noisy input robustly, and its performance degrades gracefully when noise is added to the network weights, underscoring how INSOMNET tolerates damage. It interprets sentences nonmonotonically, i.e., it generates expectations and revises them, primes future inputs based on semantics, and coactivates multiple interpretations in the output. In other words, while scaling up it still retains the cognitively valid behavior typical of subsymbolic systems.
Sudheendra Vijayanarasimhan and Kristen Grauman. "What's It Going to Cost You?: Predicting Effort vs. Informativeness for Multi-Label Image Annotations". The University of Texas at Austin, Department of Computer Sciences. Report# AI08-13 (ai tech report). November 20, 2008.
Active learning strategies can be useful when manual labeling effort is scarce, as they select the most informative examples to be annotated first. However, for visual category learning, the active selection problem is particularly complex: a single image will typically contain multiple object labels, and an annotator could provide multiple types of annotation (e.g., class labels, bounding boxes, segmentations), any of which would incur a variable amount of manual effort. We present an active learning framework that predicts the tradeoff between the effort and information gain associated with a candidate image annotation, thereby ranking unlabeled and partially labeled images according to their expected "net worth" to an object recognition system. We develop a multi-label multiple-instance approach that accommodates multi-object images and a mixture of strong and weak labels. Since the annotation cost can vary depending on an image's complexity, we show how to improve the active selection by directly predicting the time required to segment an unlabeled image. Given a small initial pool of labeled data, the proposed method actively improves the category models with minimal manual intervention.
Aravind Gowrisankar. "Evolving Controllers for Simulated Car Racing Using Neuroevolution". The University of Texas at Austin, Department of Computer Sciences. Report# AI08-14 (ai tech report). December 25, 2008. 85 pages.
Neuroevolution has been successfully used in developing controllers for physical simulation domains. However, the ability to strategize in such domains has not been studied from an evolutionary perspective. This thesis makes the following three contributions. First, it implements Neuroevolution using NEAT with a goal of evolving strategic controllers for the challenging physical simulation domain of car-racing. Second, three different evolutionary approaches are studied and analyzed on their ability to evolve advanced skills and strategy. Though these approaches are found to be good at evolving controllers with advanced skills, discovering high-level strategy proves to be hard. Third, a modular approach is proposed to evolve high-level strategy using Neuroevolution. Given such a suitable task decomposition, Neuroevolution succeeds in evolving controllers capable of strategy by using a modular approach. The simplerace car-racing simulation\cite{togelius:simplerace} is used as a testbed for this study. The results obtained in the car-racing domain suggest that the modular approach can be applied to evolve strategic behavior in other physical simulation domains and tasks.