Model-Based Vision on a Legged Robot

This page discusses two research contributions regarding vision and localization on a legged robot. In the first, the robot's visual processing is dramatically sped up by the use of selective visual attention. In the second, we compare two methods for vision and localization on a legged robot: one based on the robot's expectations, and the other based on object detection and Monte Carlo localization.

Selective Visual Attention for Object Detection on a Legged Robot

Autonomous robots can use a variety of sensors, such as sonar, laser range finders, and bump sensors, to sense their environments. Visual information from an onboard camera can provide particularly rich sensor data. However, processing all the pixels in every image, even with simple operations, can be computationally taxing for robots equipped with cameras of reasonable resolution and frame rate. This paper presents a novel method for a legged robot equipped with a camera to use selective visual attention to efficiently recognize objects in its environment. The resulting attention-based approach is fully implemented and validated on an Aibo ERS-7. It effectively processes incoming images 50 times faster than a baseline approach, with no significant difference in the efficacy of its object detection.
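The core idea of processing images selectively rather than exhaustively can be illustrated with a simple sketch. The following is a hypothetical two-pass scheme, not the paper's exact attention mechanism: a coarse pass samples only a sparse grid of pixels, and dense processing runs only in small windows around grid cells that match a target color. The function names, grid step, and thresholds are all illustrative assumptions.

```python
import numpy as np

def detect_with_attention(image, is_target_color, grid_step=8, window=12):
    """Scan a sparse pixel grid; run dense processing only in small
    windows around grid hits. Hypothetical sketch of attention-style
    detection, not the paper's implementation."""
    h, w, _ = image.shape
    regions = []
    # Coarse pass: examine roughly 1 / grid_step**2 of the pixels.
    for y in range(0, h, grid_step):
        for x in range(0, w, grid_step):
            if is_target_color(image[y, x]):
                # Fine pass: dense color check in a window around the hit.
                y0, y1 = max(0, y - window), min(h, y + window)
                x0, x1 = max(0, x - window), min(w, x + window)
                patch = image[y0:y1, x0:x1]
                mask = np.apply_along_axis(is_target_color, 2, patch)
                if mask.sum() > 20:  # enough matching pixels for an object
                    regions.append((x0, y0, x1, y1))
    return regions
```

In practice overlapping windows would be merged into a single detection, but even this naive version only touches a small fraction of the image when no target colors are present, which is where the large constant-factor speedup comes from.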

Performing selective visual attention on a legged robot is particularly difficult because of the jagged motion in the camera image caused by walking. This motion is illustrated in the following video:

[Video: Jagged Motion]

Full details of our approach are available in the following paper:

A Comparison of Two Approaches for Vision and Self-Localization

This work considers two approaches to the problem of vision and self-localization on a mobile robot. In the first approach, the perceptual processing is primarily bottom-up, with visual object recognition entirely preceding localization. In the second, significant top-down information is incorporated, with vision and localization being intertwined. That is, the processing of vision is highly dependent on the robot's estimate of its location. The two approaches are implemented and tested on a Sony Aibo ERS-7 robot, localizing as it walks through a color-coded test-bed domain. This paper's contributions are an exposition of two different approaches to vision and localization on a mobile robot, an empirical comparison of the two methods, and a discussion of the relative advantages of each method.
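The bottom-up approach relies on standard Monte Carlo localization: a particle filter that moves a population of pose hypotheses by odometry, weights them by how well each one explains an observation of a known landmark, and resamples. The sketch below is a generic 2-D particle-filter step under assumed Gaussian noise, not the paper's implementation; the range-to-landmark observation model and noise parameter are illustrative.

```python
import math
import random

def mcl_step(particles, motion, observed_range, landmark, noise=0.1):
    """One Monte Carlo localization update over (x, y) particles:
    predict with noisy odometry, weight by the likelihood of an
    observed range to a known landmark, then resample."""
    dx, dy = motion
    # Prediction: apply odometry plus Gaussian motion noise.
    moved = [(x + dx + random.gauss(0, noise),
              y + dy + random.gauss(0, noise)) for x, y in particles]
    lx, ly = landmark
    # Correction: Gaussian likelihood of the observed range per particle.
    weights = []
    for x, y in moved:
        expected = math.hypot(lx - x, ly - y)
        weights.append(math.exp(-(observed_range - expected) ** 2
                                / (2 * noise ** 2)))
    total = sum(weights) or 1.0
    weights = [w / total for w in weights]
    # Importance resampling proportional to weight.
    return random.choices(moved, weights=weights, k=len(particles))
```

Repeated applications of this step concentrate the particles on poses consistent with the observations, which is what makes the fully bottom-up pipeline possible once objects have been recognized in the image.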

The following video demonstrates the second approach for vision and localization, as the robot steps in place and scans its head from side to side. In each frame, the orange lines correspond to the final camera pose estimate. The dark blue lines (primarily visible in the background) depict the observed edges detected in the image.

[Animated GIF: final camera pose estimate (orange) and observed edges (dark blue)]
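The top-down approach hinges on predicting where known features of the environment should appear in the image, given the current camera pose estimate, so that edge detection can be focused there and matched against expectations. A minimal version of that prediction is a pinhole projection of a known world point into the image. The sketch below assumes a camera with no roll or pitch and made-up intrinsics (focal length 200, image center 160x120); it is an illustration of the idea, not the robot's camera model.

```python
import math

def project_point(world_pt, cam_pose, focal=200.0, cx=160.0, cy=120.0):
    """Project a 3-D world point into image coordinates given a camera
    pose (x, y, z, yaw). Assumed pinhole model with no roll or pitch,
    used to predict where a known field edge should appear."""
    wx, wy, wz = world_pt
    x, y, z, yaw = cam_pose
    dx, dy, dz = wx - x, wy - y, wz - z
    # Rotate the world offset by -yaw into the camera frame
    # (camera looks along +x; +y is to its left).
    cam_x = math.cos(yaw) * dx + math.sin(yaw) * dy   # forward distance
    cam_y = -math.sin(yaw) * dx + math.cos(yaw) * dy  # leftward offset
    if cam_x <= 0:
        return None  # point is behind the camera
    u = cx - focal * cam_y / cam_x  # left in the world -> left in the image
    v = cy - focal * dz / cam_x     # up in the world -> up in the image
    return u, v
```

Projecting the endpoints of each known field line this way yields the expected edge segments; comparing them against the edges actually detected nearby is what couples vision to the localization estimate in the second approach.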

This work is described in more detail in the following paper.
