Representing Visual Schemas in Neural Networks for Scene Analysis (1993)
Using object recognition in simple scenes as the task, this research focuses on two fundamental problems in neural network systems: (1) processing large amounts of input with limited resources, and (2) representing and using structured knowledge. The first problem arises because no practical neural network can process all the visual input simultaneously and efficiently. The solution is to process a small amount of the input in parallel, and to focus successively on other parts of the input. This strategy requires that the system maintain structured knowledge for describing and interpreting successively gathered information. The proposed system, VISOR, consists of two main modules. The Low-Level Visual Module (simulated using procedural programs) extracts featural and positional information from the visual input. The Schema Module (implemented with neural networks) encodes structured knowledge about possible objects, and provides top-down information that lets the Low-Level Visual Module focus attention on different parts of the scene. Working cooperatively with the Low-Level Visual Module, the Schema Module builds a globally consistent interpretation of successively gathered visual information.
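The cooperation between the two modules can be illustrated with a small Python sketch. It is not VISOR's implementation: the Schema Module in the paper is a neural network, whereas the `Schema` class here is a hand-written lookup table, and every name (`Schema`, `low_level_module`, `interpret`), the grid-based scene, and the fraction-of-matches score are assumptions made only for illustration. The sketch shows the general loop the abstract describes: a schema asks for one location at a time, the low-level routine reports only what is found there, and the evidence gathered across steps is kept in the schema.

```python
from dataclasses import dataclass, field

# Hypothetical toy scene: maps a grid position to the feature found there.
Scene = dict[tuple[int, int], str]


@dataclass
class Schema:
    """Toy stand-in for one object schema (not a neural network)."""
    name: str
    parts: dict[tuple[int, int], str]  # offset from object origin -> expected feature
    evidence: dict[tuple[int, int], bool] = field(default_factory=dict)

    def next_offset(self):
        """Top-down request: an expected part that has not been inspected yet."""
        for offset in self.parts:
            if offset not in self.evidence:
                return offset
        return None

    def update(self, offset, feature):
        """Store whether the feature seen at this offset matches the expectation."""
        self.evidence[offset] = (feature == self.parts[offset])

    def score(self):
        """Fraction of expected parts confirmed so far (an assumed measure)."""
        return sum(self.evidence.values()) / len(self.parts)


def low_level_module(scene, position):
    """Procedural stand-in: report the feature at the attended position only."""
    return scene.get(position)


def interpret(scene, schemas, origin=(0, 0)):
    """On each pass, every unfinished schema directs attention to one location."""
    while True:
        progressed = False
        for schema in schemas:
            offset = schema.next_offset()
            if offset is None:
                continue  # this schema has already checked all of its parts
            position = (origin[0] + offset[0], origin[1] + offset[1])
            schema.update(offset, low_level_module(scene, position))
            progressed = True
        if not progressed:
            break
    # The best-supported schema is taken as the interpretation of the scene.
    return max(schemas, key=Schema.score)


scene = {(0, 0): "roof", (0, 1): "wall", (1, 1): "door"}
house = Schema("house", {(0, 0): "roof", (0, 1): "wall", (1, 1): "door"})
tree = Schema("tree", {(0, 0): "crown", (0, 1): "trunk"})
best = interpret(scene, [house, tree])
print(best.name, best.score())
```

In this toy version each schema keeps its own record of what has been seen, which stands in for the structured knowledge the abstract says is needed to interpret information gathered over successive fixations.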
In Proceedings of the IEEE International Conference on Neural Networks (ICNN-93), pp. 1612-1617, San Francisco, CA, 1993. Piscataway, NJ: IEEE.

Wee Kheng Leow Ph.D. Alumni leowwk [at] comp nus edu sg
Risto Miikkulainen Faculty risto [at] cs utexas edu