We are working to build real-time vision systems for embodied agents. In particular, we want to use vision to acquire useful knowledge about space for robot navigation, map-building, and manipulation tasks.
Our approach to visual search and object recognition combines model-guided image exploration with reactive control of imaging parameters. A perceptual schema is a type of object model: essentially a graph of primitive features, such as lines and corners, and their two-dimensional spatial relationships. The image above shows a snapshot of a simple perceptual schema in operation, which recognizes rectangles as two horizontal lines and two vertical lines in the correct configuration.
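The schema idea above can be sketched in code. This is a minimal illustration, not the group's actual implementation: the types `Segment` and `Relation` and the tolerance value are assumptions made up for the example. Nodes of the graph are oriented line segments, edges constrain the bearing and relative orientation between two segments, and a schema "recognizes" its object when every relation holds.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

static const double PI = 3.14159265358979323846;

// Hypothetical primitive feature: an oriented line segment.
struct Segment {
    double cx, cy;  // segment midpoint (image coordinates, y up)
    double angle;   // orientation in radians
};

// One edge of the schema graph: feature `a` must lie in direction
// `bearing` from feature `b`, with relative orientation `rel_angle`.
struct Relation {
    int a, b;
    double bearing;    // expected direction from b to a (radians)
    double rel_angle;  // expected orientation difference (radians)
};

// Absolute angular distance, wrapped into [0, PI].
static double angdiff(double x, double y) {
    double d = std::fmod(x - y + PI, 2 * PI);
    if (d < 0) d += 2 * PI;
    return std::fabs(d - PI);
}

// The schema matches when every spatial relation holds within tolerance.
bool schema_matches(const std::vector<Segment>& f,
                    const std::vector<Relation>& rels,
                    double tol = 0.2) {
    for (size_t i = 0; i < rels.size(); ++i) {
        const Relation& r = rels[i];
        const Segment& a = f[r.a];
        const Segment& b = f[r.b];
        double dir = std::atan2(a.cy - b.cy, a.cx - b.cx);
        if (angdiff(dir, r.bearing) > tol) return false;
        if (angdiff(a.angle - b.angle, r.rel_angle) > tol) return false;
    }
    return true;
}
```

A rectangle schema, in these terms, is four segment nodes (two horizontal, two vertical) with relations saying the top edge lies above the bottom edge and the right edge lies to the right of the left edge, each pair sharing the same orientation.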
The image shows a group of four primitive feature trackers following the line segments that make up a rectangle in the image. Each primitive feature tracker is a software agent that operates only on the image data outlined by its rectangular box. A reactive control law adjusts the window's shape, size, and position to keep the segment feature in view. A collection of these trackers can be distributed across a network of processors; each one requires at most about 1K pixels per frame, so the total bandwidth usage is small.
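One plausible form of such a reactive control law can be sketched as follows. This is an assumed, simplified version, not the actual ARGUS control law: the proportional gain, margin, and pixel budget are made-up parameters. Each frame, the window is re-centered toward the measured segment, resized to cover its length plus a margin, and capped so the tracker never touches more than its per-frame pixel budget.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical tracker window and per-frame measurement types.
struct Window {
    double cx, cy;  // window center (pixels)
    double w, h;    // window width and height (pixels)
};

struct Measurement {
    double cx, cy;    // measured segment midpoint
    double len;       // measured segment length
    bool horizontal;  // segment orientation class
};

// Proportional control law: move a fraction `gain` of the position
// error each frame, and size the window to the segment length plus a
// margin, narrow in the direction across the segment.
void update_window(Window& win, const Measurement& m,
                   double gain = 0.5, double margin = 8.0,
                   double max_area = 1024.0) {
    win.cx += gain * (m.cx - win.cx);
    win.cy += gain * (m.cy - win.cy);
    double along = m.len + 2 * margin;  // extent along the segment
    double across = 2 * margin;         // narrow across the segment
    win.w = m.horizontal ? along : across;
    win.h = m.horizontal ? across : along;
    // Cap the window area at the per-frame pixel budget (~1K pixels).
    double area = win.w * win.h;
    if (area > max_area) {
        double s = std::sqrt(max_area / area);
        win.w *= s;
        win.h *= s;
    }
}
```

Keeping the window narrow across the segment is what holds the per-tracker cost near the 1K-pixel budget even for long segments.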
Current work is integrating biological models of attention direction into schemas to allow efficient visual search.
We have developed a distributed framework for visual behaviors called ARGUS (here's a technical overview of the system). ARGUS addresses the problem of visual search and tracking with an uncalibrated camera and limited computational resources.
Current directions of research with ARGUS include:
Most of the lower levels of ARGUS are written in C++ using POSIX threads and the Standard Template Library (STL). The higher-level code is being developed in RScheme, a threaded, object-oriented variant of Scheme under development here in the OOPS Research Group.
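The lower-level implementation style described above can be illustrated with a small sketch. This is not ARGUS code: the `Tracker` struct, the thread body, and the result values are stand-ins invented for the example. It shows the pattern of running each feature tracker in its own POSIX thread and collecting results into an STL container guarded by a mutex.

```cpp
#include <pthread.h>
#include <cassert>
#include <vector>

// Hypothetical per-tracker state.
struct Tracker {
    int id;
    int result;  // placeholder for the tracker's output
};

static pthread_mutex_t results_lock = PTHREAD_MUTEX_INITIALIZER;
static std::vector<int> results;

// Thread body: stand-in for one tracker's per-frame loop.
void* run_tracker(void* arg) {
    Tracker* t = static_cast<Tracker*>(arg);
    t->result = t->id * 10;  // placeholder for real tracking work
    pthread_mutex_lock(&results_lock);
    results.push_back(t->result);  // publish into the shared STL vector
    pthread_mutex_unlock(&results_lock);
    return 0;
}

// Spawn n tracker threads, wait for all of them, report result count.
int run_all(int n) {
    std::vector<pthread_t> threads(n);
    std::vector<Tracker> trackers(n);
    for (int i = 0; i < n; ++i) {
        trackers[i].id = i;
        pthread_create(&threads[i], 0, run_tracker, &trackers[i]);
    }
    for (int i = 0; i < n; ++i)
        pthread_join(threads[i], 0);
    return static_cast<int>(results.size());
}
```

Because each tracker works only on its own small window, threads in this style share almost no state beyond the results container, which keeps the locking cheap.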