Next: The System Up: Face Recognition by Dynamic Previous: Face Recognition by Dynamic

Introduction

The intracortical wiring pattern is a fascinating scientific subject, as it seems to hold the key to the function of the brain, or at least of the part of it that we are accustomed to take most seriously. That wiring pattern is unnervingly close to being all-to-all. It has been speculated that signals from any cell in cortex can reach any other by crossing just three synapses. Although this seems to make sense for a system in which any two data items may have to contact each other, nearly complete wiring seems to leave little room for all the specific structure that, according to our present view of the brain, resides in its connections. The experimental techniques of anatomy and neurophysiology are much too limited to give us more than the gross principles of the cortical wiring pattern. These principles are to a very large extent summarized by speaking about receptive field structures, columnar organization, regular local interactions of the general difference-of-Gaussians type, and topographical connection patterns between areas. Beyond that we are on a dark continent, which may, for all we know, be dominated by randomness. More likely, however, it is structured by intricate learned patterns that are too variable from individual to individual and from place to place ever to become a possible subject of experimental enquiry. All we can hope to learn are the principles of organization by which these patterns are formed.

We present here a model for invariant object recognition, together with tests on human face recognition from a large gallery. The model may be relevant to the discussion at hand since it makes minimal assumptions about genetically generated connection patterns (certainly none that go beyond the principles enumerated above) and relies largely on rapid, reversible synaptic self-organization during the recognition process to create the much more specific connections required for a concrete recognition act. The model relies on Dynamic Link Matching (DLM), the qualitative principle of which has been described before [7,8,16,17]. The model described here goes beyond previously published versions in being more complete in its dynamic formulation: it includes mechanisms for autonomous activity blob dynamics, for attention dynamics, and for dynamic interaction between the stored models, which implements the actual decision process during recognition.

A few words are in order to relate the jargon used in the description of our model to the biological background (the reader may want to come back to this "dictionary" while reading the System section). The term image refers to a cortical image domain, which corresponds to the primary visual cortex V1 and possibly also to other areas up to perhaps V4. The image, or image domain, has the form of a graph. The nodes of the graph correspond to hypercolumns, that is, to collections of those feature-specific neurons that are activated from one retinal point. In our system we formalize the activity of the sets of feature cells within hypercolumns as jets. As features we use Gabor-based wavelets. The links of the image graph correspond to lateral connections between nodes. An image on the retina selects a subset of the feature cells in the image domain. The selected neurons are then stochastically activated (these fluctuations are not driven by the visual signal). It is important that this stochastic activity is characterized by temporal correlations of short spatial range. These correlations express the neighborhood relations of visual features in the image and are produced by the lateral connections within the image domain. In our specific system the stochastic signal in the image domain (and also in the model domain) has the form of a local running blob of activity that is confined to an attention window. Apart from these local correlations, however, the details of the activity process are not important.
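To make the notion of a jet concrete, the following is a minimal sketch of how a vector of Gabor response magnitudes could be computed at a single image point. It is an illustration only, not the formulation used in the System section: the function names `gabor_response` and `jet`, the choice of two wavelengths and four orientations, and the Gaussian width equal to the wavelength are all assumptions made for this example, and the kernel is a plain complex Gabor without the correction that would make it insensitive to the mean gray level.

```python
import cmath
import math


def gabor_response(image, x0, y0, wavelength, theta, sigma, radius):
    """Magnitude of a complex Gabor filter response centered on pixel (x0, y0).

    `image` is a list of rows of gray values. Pixels outside the image are
    skipped. All parameters are hypothetical choices for this sketch.
    """
    # Wave vector of the carrier: direction theta, spatial period `wavelength`.
    kx = 2.0 * math.pi / wavelength * math.cos(theta)
    ky = 2.0 * math.pi / wavelength * math.sin(theta)
    height, width = len(image), len(image[0])
    total = 0j
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            x, y = x0 + dx, y0 + dy
            if 0 <= x < width and 0 <= y < height:
                envelope = math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))
                carrier = cmath.exp(1j * (kx * dx + ky * dy))
                total += image[y][x] * envelope * carrier
    return abs(total)


def jet(image, x0, y0, wavelengths=(4.0, 8.0), n_orientations=4):
    """Collect response magnitudes for all wavelengths and orientations.

    The result is ordered wavelength-major: the first `n_orientations`
    entries belong to the first wavelength, and so on.
    """
    responses = []
    for wavelength in wavelengths:
        for k in range(n_orientations):
            theta = math.pi * k / n_orientations
            responses.append(
                gabor_response(
                    image, x0, y0, wavelength, theta,
                    sigma=wavelength, radius=int(2 * wavelength),
                )
            )
    return responses


# Usage: a 32x32 grating that varies along x with period 4 pixels.
grating = [[math.cos(2.0 * math.pi * x / 4.0) for x in range(32)] for _ in range(32)]
center_jet = jet(grating, 16, 16)
```

In this toy setting the jet at the center of the grating has its largest short-wavelength component at orientation index 0, the filter whose carrier varies along x, which is the sense in which a jet labels a node with the local features present at one retinal point.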

The models (see right side of Figure 1) collectively form the model domain. We imagine this to be identified with some part of inferotemporal cortex. The nodes of the models are again labeled with jets, that is, they are collections of neurons carrying feature labels. They are laterally connected much like nodes in the image domain. In our system the different models are completely disjoint. In the biological case models are likely to overlap partially, sharing single nodes or even partial networks. The stochastic activity process in the models is similar to that in the image domain, except for the interactions between models, which take the form of local cooperation (correlating activity between structurally corresponding points) and global competition between entire models.

The image domain and the model domain are bidirectionally connected by dynamic links. These correspond to connections between primary visual cortex and inferotemporal cortex. The connections are assumed to be plastic on a fast time scale, changing radically during a single recognition event, and this plasticity is assumed to be reversible. The strength of a connection between a node in the image and a node in a model is controlled by the jet similarity between the two, which roughly corresponds to the number of features that the two nodes have in common.
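As a rough illustration of how jet similarity could set link strengths, the sketch below uses the normalized dot product (cosine similarity) of two jets, treated as vectors of non-negative feature magnitudes. This particular similarity measure and the names `jet_similarity` and `similarity_matrix` are assumptions made for the example; the definition actually used by the model belongs to the System section.

```python
import math


def jet_similarity(jet_a, jet_b):
    """Cosine similarity between two jets of equal length.

    For non-negative magnitudes the value lies between 0 (no shared
    features) and 1 (identical feature profile up to overall scale).
    """
    if len(jet_a) != len(jet_b):
        raise ValueError("jets must have the same length")
    dot = sum(a * b for a, b in zip(jet_a, jet_b))
    norm_a = math.sqrt(sum(a * a for a in jet_a))
    norm_b = math.sqrt(sum(b * b for b in jet_b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty jet shares no features with anything
    return dot / (norm_a * norm_b)


def similarity_matrix(model_jets, image_jets):
    """Entry [m][i] is the similarity between model node m and image node i.

    In this sketch the matrix stands in for the feature-driven part of the
    link strengths; it does not model the fast dynamics of the links.
    """
    return [[jet_similarity(mj, ij) for ij in image_jets] for mj in model_jets]


# Usage: two model nodes compared with three image nodes (made-up jets).
model_jets = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]
image_jets = [[2.0, 0.0, 0.0], [0.0, 3.0, 3.0], [1.0, 1.0, 1.0]]
links = similarity_matrix(model_jets, image_jets)
best_match = [max(range(len(row)), key=lambda i: row[i]) for row in links]
```

Here each model node is most similar to the image node with the same feature profile, regardless of overall response strength, so `best_match` pairs model node 0 with image node 0 and model node 1 with image node 1.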

