CS 395T: Intelligent Robotics
- Spring 2005. TTh, 3:30 - 5:00 pm
- TAY 3.144
- Unique #53130.
- Professor Benjamin Kuipers
- Office hours: Thursdays, 2:00 - 3:15 and by appointment.
Syllabus
For robots to be intelligent in the way people are intelligent,
they will have to learn about their world, and their own ability to
interact with it, much like people do. This research seminar will
investigate new research directions in robot learning.
Traditionally, robots have been useful in manufacturing by moving
blindly but precisely in totally controlled workcells. Traditionally,
symbolic AI systems have given the appearance of intelligence by
applying logical inference algorithms to symbol structures whose
primitive elements are specified by human programmers. This has left
AI systems open to Searle's famous "Chinese Room" critique, arguing
that they only mimic intelligence: they are merely "faking it".
To answer this philosophical challenge, and to be useful in a host of
real-world applications on Earth and in space, AI systems need to be
robots, with sensors and effectors embedded in the physical world.
Not only that, but these robots must learn the nature of their own
sensorimotor interaction with the environment, and must create their
own symbols, grounded in their own experience.
Robots are being created with ever more complex and richly structured
sensors. The sensorimotor system evolves over time, sometimes
deteriorating, but sometimes being augmented with new "plug-and-play"
sensors. Humans are astonishingly adaptable to sensorimotor changes,
and children do an amazing job of learning to use their sensors and
effectors in a few short years after birth. We can learn important
things about robots from research on children. And robot models may
help us create better theories of child development.
We will focus on robot learning of the "foundational domains" that
underlie commonsense knowledge: space, time, actions, objects,
properties and affordances, causality, and so on. We will consider
the foundations of these higher-level theories in low-level perception,
action, and the control laws that bind them together.
Assignments
This is a research seminar, intended first to bring you to the state
of the art, and then to help you do a project and paper of publishable
quality. There will be a significant amount of reading and discussion
of recent research papers that will be handed out.
The requirements of the course will be:
- (35%) Class presentation on one or more papers.
- (15%) Class participation in discussions.
- (50%) Term project, presentation, and paper.
Class Presentations
Each class member will select a topic and present the material to the
class. Each topic will have an associated reading that the entire
class will read, but the presenter is responsible for finding and
reading additional material, becoming an expert in the area, creating
an illuminating example to present, and leading a discussion.
Be prepared to give a 45-minute presentation, followed by specific
questions and more general discussion of the value and importance of
the material presented. If you send me a copy of your slides a couple
of days before your presentation, I will give you feedback as quickly
as I can.
Here is a thematic outline. You don't need to cover the points in
exactly this order, but try to address these needs for your audience.
- What is the problem? Why is it important? Why should the reader care?
- What assumptions are being made?
- How does this method work? Provide an intuition to guide the hearer
through the technical details. Then provide a more detailed example to
show what the intuitions mean.
- What are the strengths and limitations of this approach?
- How can you evaluate the benefits?
- What are open problems in this area?
- How does this help us? Where is the gold?
Prepare PowerPoint or other slides for your presentation. Hand out
copies of your slides to the class before your presentation.
Pick a presentation topic that works well with your term project
topic. The papers will be accessible online through the UT Library,
or via link here. In some cases, you will need to review several
related papers by the authors.
Presentation Topics
Each student will pick one of the sub-bullets, and will be responsible
for presenting and discussing that paper (or papers). The fourth
sub-bullet in each category can only be chosen after the first three
in all other categories have been taken.
- Introduction (2 weeks: Jan 18-27) [Kuipers]
- Visual tracking and symbol anchoring (2 weeks: Feb 1-10)
- O'Regan and Noe, A sensorimotor account of vision and
visual consciousness.
Behavioral and Brain Sciences 24: 939-1031, 2001.
[PDF]
- Ballard, Hayhoe, Pook, and Rao,
Deictic codes for the embodiment of cognition.
Behavioral and Brain Sciences 20: 723-767, 1997.
[PDF]
- Shanahan, Perception as abduction: turning sensor data
into meaningful representation.
Cognitive Science, 2005, to appear.
[PDF]
- Coradeschi and Saffiotti,
An introduction to the anchoring problem.
Robotics and Autonomous Systems 43: 85-96, 2003.
and other papers.
[PDF]
- Language and symbol learning (2 weeks: Feb 15-24)
- Yu and Ballard, On the integration of grounding language
and learning objects. AAAI, 2004.
[PDF]
Yu, Ballard and Aslin, The role of embodied intention
in early lexical acquisition. Cog. Sci. Conf., 2003.
[PDF]
and other papers.
[PDF]
- Roy and Pentland, Learning words from sights and sounds: a computational model.
Cognitive Science 26: 113-146, 2002.
and other papers.
[PDF]
- Steels, The origins of syntax in visually grounded robotic agents.
Artificial Intelligence 103: 133-156, 1998.
and other papers.
[PDF]
- Siskind, Grounding the lexical semantics of verbs in visual
perception using force dynamics and event logic.
J. Artificial Intelligence Research 15: 31-90, 2001.
and other papers.
[Journal]
- Learning sensor organization (2 weeks: Mar 1-10)
- Sensory substitution: artificial vision via tactile sensing.
- P. Bach-y-Rita, et al., Vision substitution by tactile image projection.
Nature 221: 963-964, 1969.
[PDF]
- P. Bach-y-Rita, Tactile vision substitution: past and future.
Int. J. Neuroscience 19: 29-36, 1983.
[PDF]
- P. Bach-y-Rita, The relationship between motor processes and cognition
in tactile vision substitution.
In W. Prinz and A. F. Sanders (Eds.), Cognition and Motor Processes,
Springer-Verlag, 1984, pages 150-160.
[PDF]
- P. Bach-y-Rita, M. E. Tyler, and K. A. Kaczmarek, Seeing with the brain.
Int. J. Human-Computer Interaction 15(2): 285-295, 2003.
[PDF]
- Clustering methods: Kohonen, Fritzke, k-means, EM, ...
- Alsabti, Ranka, Singh, An efficient k-means clustering algorithm.
IPPS/SPDP Workshop on High Performance Data Mining, 1998.
[PDF]
- Concise description of SOMs, plus a huge bibliography.
- Fritzke, A growing neural gas network learns topologies.
NIPS 1994.
[PDF]
- Linaker and Niklasson, Sensory-flow segmentation using a resource
allocating vector quantizer.
Advances in statistical, structural and
syntactical pattern recognition: Proceedings of Joint IAPR
International Workshops on Syntactical and Structural Pattern
Recognition, 2000.
[PDF]
- Dimensionality reduction: regression, PCA, ICA, and ...
- Hyvarinen, Survey on independent component analysis.
Neural Computing Surveys 2: 94-128, 1999.
[PDF]
- Steyvers, Multidimensional Scaling.
Encyclopedia of Cognitive Science, 2002.
[PDF]
- Tenenbaum, de Silva and Langford,
A global geometric framework for nonlinear dimensionality reduction.
Science 290: 2319-2323, 2000.
[PDF]
- Roweis and Saul,
Nonlinear dimensionality reduction by locally linear embedding.
Science 290: 2323-2326, 2000.
[PDF]
- Learning and using hidden units as meaningful features:
Caruana, Multitask learning.
In Thrun and Pratt (Eds.), Learning To Learn, 1998.
and other papers.
[PDF]
- Abstraction from pixels to objects (2 weeks: Mar 22-31)
- Learning object recognition:
Pietro Perona and students at Caltech,
CVPR'00, ECCV'00, CVPR'03, GMBV'04, CVPR'04.
- Drescher-style schema learning:
Harold Chaput, Constructivist Learning Architecture
from UT Austin.
- Hierarchical reinforcement learning:
Barto and Mahadevan,
Recent Advances in Hierarchical Reinforcement Learning,
Discrete Event Dynamic Systems 13(4):41-77, 2003.
[PDF]
Amy McGovern at U. Oklahoma (formerly U. Mass Amherst).
Mark Ring, classic PhD dissertation from UT Austin.
Also Ring, Machine Learning, 1997.
[PDF]
- Juergen Schmidhuber, IDSIA in Switzerland.
- Insights from child development (2 weeks: Apr 5-14)
- Johnson, Object perception and object knowledge in young infants.
[PDF]
Slater, Innate organization and early learning in infant visual perception.
[PDF]
Hainline, The development of basic visual abilities.
[PDF]
All in Slater (Ed.), Perceptual Development, 1998.
- Baillargeon, How do infants learn about the physical world?
Current Directions in Psychological Science 3: 133-140, 1994.
[PDF]
Spelke, Initial knowledge: six suggestions.
Cognition 50: 431-445, 1994.
Spelke, Principles of object perception.
Cognitive Science, 14: 29-56, 1990.
- L. B. Cohen, et al., The development of infant causal perception.
In Slater (Ed.), Perceptual Development, 1998.
[PDF]
L. B. Cohen, et al., new review.
- P. R. Cohen, Oates, Beal and Adams, Contentful mental states for robot baby, AAAI, 2002.
[PDF]
P. R. Cohen, Atkin, Oates and Beal,
Neo: Learning conceptual knowledge by sensorimotor interaction
with an environment, Agents'97.
[PDF]
and other papers.
- Short presentations of term project results (3 weeks: Apr 19 - May 5)
Term Projects
Each class member will do a term project. You can apply a method we
are learning about to a robot learning problem. Or you can extend an
existing method or develop a new method to solve a problem. Ideally,
your term project will extend the state of the art, and will be
suitable for submission to AAAI, ICRA, IROS or some other major
conference.
You are encouraged to select a topic that fits well with your other
research interests.
Possible Project Topics
The following is a partial list of topics for investigation. More can
be added, and you can propose ideas of your own.
- Sensor organization. Suppose a robot wakes up in an
unknown world with an unknown sense vector and unknown motor vector.
How does it learn from experience how its sensors are structured, and
how its motor commands affect its sensory input? As robots get more
complex and long-lived, it will become essential for them to learn the
properties of their own sensorimotor system. There is a growing
literature of techniques for nonlinear dimensionality reduction that
provides tools for tackling this problem.
- Self-calibration of sensors and actions. How are the
intrinsic units of laser range-finders related to the intrinsic units
of sonars, or of the robot's odometry? How do we recognize that all
of these are related to distance? Can we learn that a bump sensor
recognizes the presence of an obstacle at a certain close range, even
if it is too close for the range sensors? In general, how can a robot
calibrate its sensors and its actions against each other, without a
given external reference?
- Learning foveated vision. Suppose we have visual
input from a retina where receptors are much denser in a certain
region (the fovea) and much less dense towards the periphery.
Can we learn the structure of the visual sensor, as well as an
undistorted model of the space being sensed? What if we also include
the biological fact that cones (the color sensors) are very rich in
the fovea, while almost absent in the periphery?
- Learning stereo. Suppose we have two foveated eyes,
whose individual structures we have learned. Can we learn to track
individual visual features or distinctive objects with both eyes,
maximizing the correspondence between the two visual images? Can we
then learn to identify the discrepancies between the two maximally
corresponding images, and recognize that those discrepancies are
the key to learning disparity, the attribute of corresponding
features in paired images that supports the stereo range inference?
- Control laws, sensory features, and actions. Given a set
of low-level sensor inputs and motor commands, how do we learn a
tractable set of control laws coupling them together? How can we
abstract the low-level ("pixel-level") observations and motor commands
into higher-level features and actions?
- Local metrical maps. Given low-level egocentric
range-sensor input, and simple motor commands, how can a robot
discover that its observations can be explained by localizing itself
in a local, world-centered frame of reference and building an
occupancy grid model of its surroundings? Essentially, we are asking
the robot to discover world-centered rather than egocentric frames of
reference, the concept of localization within that frame of reference,
and a solution to the SLAM (simultaneous localization and mapping)
problem, used as an abductive inference.
- Places. When exploring a continuous environment, how do
we abstract the continuous space into a discrete set of places
linked by path segments? A place is not just any location,
even considering the neighborhood around it. The basic SSH defined
places as sets of closely linked distinctive states, which are
local optima of local hill-climbing control laws. The hybrid SSH
defines a place as a neighborhood where unambiguous localization is
relatively easy. It also assumes that most or all places are decision
points among alternate paths. We propose a criterion for abstracting
places based on the Extended Voronoi Graph in
[Beeson, et al., ICRA, 2005]. Once we have defined a set of places
and the connections among them, it is useful to learn how to recognize
a place from its sensory image. We presented a nice bootstrap learning
method for learning place recognition in
[Kuipers and Beeson, AAAI, 2002].
- Objects and actions. Given low-level sensor input and
motor commands, how does an agent learn to explain aspects of its
world in the higher-level terms of objects, and actions
that can be applied to those objects? Objects can be initially
proposed based on clustering of sensor returns in space, and tracking
the clusters in time
[Modayil and Kuipers, IROS, 2004]. The next step is to learn
higher-level actions that can be applied to those objects.
- Prerequisites and consequences of reliable actions.
Gary Drescher's schema mechanism can be seen as a method for searching
for the prerequisites and consequences of actions, evaluating schemas
for their reliability. Can we apply these ideas to our action-learning
problems, when dealing with control laws, or with objects?
- Explaining consciousness. Can we give a computational
account of consciousness, such that it is meaningful for a computer to
be conscious? One aspect of consciousness, well described in Marvin
Minsky's famous essay, "Matter, Mind, and Models", is for the agent to
have a model of its own cognitive processes, so that it can store,
recall, and reason about its own mental operations. This no longer
seems problematical for a computational system such as a robot. The
other key problem of consciousness is qualia, the feeling of
perceptions such as "red". There has been a lot of recent work on
the problem of consciousness, and we will discuss some of that work,
along with some new ideas.
- Boundaries between evolutionary, developmental, and individual
learning. Much of what we call learning takes place when an
individual confronts a problem and learns to solve it better. But
there is also learning that takes place over evolutionary time, as the
species learns to solve certain problems better through natural
selection. And there is an intermediate state, where learning takes
place during the developmental process of an infant or child,
sometimes even before birth. Are these qualitatively distinct kinds of
learning, or are they all instances of the same process, just
accomplished at different points in history? Can we look at the
evidence of evolutionary change and developmental learning, and
draw useful inferences about these processes?
- Representing continuous quantities. In very low-level
learning and cognition, we face the foundational problem of how to
represent basic directly-perceived quantities, ranging from distance
and angle, to color or texture, to the magnitude of a sensation such
as weight or loudness. What is stored? How can it be retrieved?
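As a small illustration of the sensor organization topic above, the
following Python sketch shows one way a learner might recover the layout
of an unknown sense vector from experience alone. Everything in it is an
assumption made for illustration: the one-dimensional `world` function,
the fixed `SCRAMBLE` permutation, the mean-absolute-difference distance
between sensor streams, and the greedy `chain_order` procedure. It is not
the method of any paper listed here, only a minimal sketch of the idea
that physically nearby sensors tend to produce similar readings.

```python
import math

# Hypothetical setup: 8 sensors spaced evenly along a line, delivered to
# the learner in a scrambled, unlabeled order.
SCRAMBLE = [3, 0, 6, 1, 7, 4, 2, 5]  # column c carries physical sensor SCRAMBLE[c]


def world(x):
    """Assumed smooth 1-D environment profile sampled by the sensors."""
    return math.sin(0.15 * x) + 0.5 * math.sin(0.04 * x + 1.0)


def simulate(steps=200):
    """Return one scrambled sense vector per time step as the robot drifts."""
    readings = []
    for t in range(steps):
        robot_pos = 0.7 * t
        readings.append([world(robot_pos + k) for k in SCRAMBLE])
    return readings


def sensor_distances(readings):
    """Mean absolute difference between every pair of sensor streams."""
    n = len(readings[0])
    steps = len(readings)
    return [[sum(abs(r[i] - r[j]) for r in readings) / steps
             for j in range(n)] for i in range(n)]


def chain_order(dist):
    """Greedy 1-D layout: start at the most isolated sensor, then
    repeatedly step to the nearest sensor not yet placed."""
    n = len(dist)
    start = max(range(n), key=lambda i: sum(dist[i]))
    order = [start]
    remaining = set(range(n)) - {start}
    while remaining:
        last = order[-1]
        nearest = min(remaining, key=lambda j: dist[last][j])
        order.append(nearest)
        remaining.remove(nearest)
    return order


columns = chain_order(sensor_distances(simulate()))
# Map columns back to physical positions; used only to check the result.
recovered = [SCRAMBLE[c] for c in columns]
print(recovered)
```

The recovered order matches the physical layout only up to reversal,
since similarity of readings alone cannot distinguish left from right.
The nonlinear dimensionality reduction methods in the reading list
address the same question for richer sensor geometries.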
Textbooks
Our own work on these problems starts with
[Pierce and Kuipers, AIJ, 1997], which learns the foundations
for the Spatial Semantic Hierarchy
[Kuipers, AIJ, 2000]. It would be helpful to read these in advance.
The following two books are required reading for the course.
They both contain important ideas that we will be discussing.
Both are good to start reading in advance.
- Gary L. Drescher. 1991.
Made-Up Minds: A Constructivist Approach
to Artificial Intelligence.
Cambridge, MA: MIT Press.
- George Lakoff and Mark Johnson. 2003.
Metaphors We Live By, second edition.
University of Chicago Press.
Valuable books for your library
The following are some useful books that you should have in your
professional library, and that are related to this course. I will
assume that you have immediate access to material in these books.
- Duda, Hart and Stork. 2001.
Pattern Classification, Second Edition.
NY: John Wiley and Sons.
- Tom Mitchell. 1997. Machine Learning.
Boston: McGraw-Hill.
(This book is a useful reference, and is the required text for
Ray Mooney's Machine Learning course.)
If you do not already have a background in Artificial Intelligence,
the following excellent textbook would be another valuable addition to
your library, and is undoubtedly available used.
- Stuart Russell and Peter Norvig. Artificial
Intelligence: A Modern Approach. Prentice-Hall.
Some assignments and projects may be best done in a high-level
programming environment such as R, MATLAB, or LabVIEW.
Make sure you have any documentation you need.
The Computer Science Department has a Code of Conduct that describes
the obligations of faculty and students. Read it at
http://www.cs.utexas.edu/users/ear/CodeOfConduct.html.
BJK