General computer vision and robot vision issues are too broad to deal with here. Finding and tracking independently moving objects (the ball, players, referees) and estimating their motion parameters (2-D, and eventually 3-D) against a complicated background (field lines, goals, corner poles, flags waved by supporters in the stadium) is too difficult for current computer and robot vision technology to perform completely in real time.
In order to focus on skill acquisition, visual image processing should be drastically simplified. Discrimination by color information, such as a red ball, a blue goal, or a yellow opponent, makes it easy to find and track objects in real time [Asada et al. 1996b]. Nevertheless, robust color discrimination is a hard problem because the digitized signals are highly sensitive to slight changes in lighting conditions. In the case of remote (wireless) processing, noise added by environmental factors can cause fatal errors in image processing. Currently, human programmers adjust the key parameters used to discriminate colored objects on site; self-calibration methods should be developed, which would also expand the general scope of image processing applications.
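The color discrimination described above can be sketched as simple per-channel thresholding followed by a centroid computation. This is a minimal illustration, not the authors' implementation; the function names and the RGB bounds are assumptions, standing in for the key parameters that, as the text notes, a human programmer would tune on site.

```python
import numpy as np

def color_mask(rgb_image, lower, upper):
    """Boolean mask of pixels whose R, G, B values all fall inside the
    per-channel [lower, upper] bounds (inclusive). Bounds are the
    hand-tuned parameters mentioned in the text."""
    lower = np.asarray(lower)
    upper = np.asarray(upper)
    return np.all((rgb_image >= lower) & (rgb_image <= upper), axis=-1)

def find_object(rgb_image, lower, upper):
    """Centroid (row, col) of all pixels matching the color bounds,
    or None if no pixel matched. A real system would also reject
    small noise blobs; this sketch omits that step."""
    ys, xs = np.nonzero(color_mask(rgb_image, lower, upper))
    if ys.size == 0:
        return None
    return float(ys.mean()), float(xs.mean())
```

For example, a synthetic "red ball" patch in an otherwise dark frame is located by its centroid; slight lighting changes correspond to pixels drifting outside the fixed bounds, which is exactly the fragility the text describes.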
Visual tracking hardware based on image-intensity correlation inside a window region can find and track objects against a complicated background once the initial windows are set [Inoue et al. 1992]. A color tracking version is now commercially available. As long as the initialized color pattern inside each window does not change much, tracking is largely successful. Coping with pattern changes due to lighting conditions and occlusions is one of the central issues in applying this type of vision hardware.
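The correlation-window idea can be sketched in software as normalized cross-correlation of a stored template against a small search neighborhood around the previous window position. This is an illustrative approximation of what such hardware computes, not its actual design; the function name and search radius are assumptions.

```python
import numpy as np

def track_window(frame, template, prev_rc, search=8):
    """Search a neighborhood of `search` pixels around the previous
    top-left window position `prev_rc` for the placement maximizing
    normalized cross-correlation with the stored grayscale template.
    Returns (best_position, best_score); score is in [-1, 1]."""
    th, tw = template.shape
    t = template - template.mean()
    tn = np.sqrt((t * t).sum()) + 1e-9
    best_score, best_rc = -2.0, prev_rc
    r0, c0 = prev_rc
    for r in range(max(0, r0 - search), min(frame.shape[0] - th, r0 + search) + 1):
        for c in range(max(0, c0 - search), min(frame.shape[1] - tw, c0 + search) + 1):
            patch = frame[r:r + th, c:c + tw]
            p = patch - patch.mean()
            score = (p * t).sum() / (np.sqrt((p * p).sum()) * tn + 1e-9)
            if score > best_score:
                best_score, best_rc = score, (r, c)
    return best_rc, best_score
```

The failure mode the text highlights is visible here: when lighting changes or occlusion alters the pattern inside the window, the best correlation score drops, and a real system must decide whether to re-initialize the template or declare the object lost.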
As long as the vision system can cope with the above issues and capture images of both the specified area (the target) and the ball, there should be no fundamental difficulty [Nakamura and Asada 1995, Nakamura and Asada 1996]. To prevent the agent from losing the target and/or the ball (and, in Levels II and III, obstacles as well), an active vision system with panning and tilting motions seems preferable, but it makes the control system more complicated and introduces the problem of organizing spatial memory to keep track of lost objects. A more practical alternative is a wider-angle lens; the extreme case is an omni-directional vision system that captures the image all around the agent. Such a lens seems very useful not only for acquiring basic skills but also for realizing cooperative behaviors in multi-agent environments. This type of lens is currently commercially available in spherical and hyperboloidal forms [Ishiguro 1996].
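One property that makes radially symmetric omni-directional mirrors convenient for multi-agent play is that azimuth is preserved: the bearing of the ball or a teammate relative to the robot can be read directly as the polar angle of its pixel about the image center, with no need for panning or spatial memory. A minimal sketch under assumed coordinate conventions (image x to the robot's right, y forward, mirror axis projecting to the center):

```python
import math

def bearing_from_omni(px, py, cx, cy):
    """Bearing in radians (counter-clockwise from the assumed forward
    axis) of an object seen at pixel (px, py) in an omni-directional
    image whose mirror axis projects to center (cx, cy). With a
    radially symmetric (e.g. hyperboloidal) mirror, azimuth is
    preserved, so the bearing is just the pixel's polar angle."""
    return math.atan2(py - cy, px - cx)
```

Estimating distance, by contrast, depends on the specific mirror profile and calibration, which is one reason these sensors suit bearing-based cooperative behaviors well.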