The Spatial Semantic Hierarchy
The cognitive map is the body of knowledge a human or robot has about its large-scale spatial environment. We have developed a computational theory of the human and robotic cognitive map [Kuipers, 1978, 1982; Kuipers and Levitt, 1988; Kuipers and Byun, 1988, 1991]. Our theory of the cognitive map is motivated by observations of human spatial reasoning skills and the characteristic stages of child development [Lynch, 1960; Piaget & Inhelder, 1967; Hart & Moore, 1973]. These studies provide two fundamental insights. First, that a topological description of the environment is central to the cognitive map, and is logically prior to the metrical description. Second, that the spatial representation is grounded in the sensorimotor interaction between the agent and the environment.

Our theory of the cognitive map is based on a hierarchy of representations for spatial knowledge that we call the Spatial Semantic Hierarchy (SSH).

[sensorimotor <-> control] -> procedures -> topology -> geometry.

Each level of the hierarchy has its own ontology (the set of objects and relations is uses for describing the world) and its own set of inference and problem-solving methods. The objects, relations, and assumptions required by each level are provided by those below it.

  • At the control level of the hierarchy, the ontology is an egocentric sensorimotor one, without knowledge of fixed objects or places in an external environment. A distinctive state is defined as the local maximum found by a hill-climbing control strategy, climbing the gradient of a selected sensory feature, or distinctiveness measure. Trajectory-following control laws take the robot from one distinctive state to the neighborhood of the next, where hill-climbing can find a local maximum, reducing position error and preventing its accumulation.

  • At the procedural level of the hierarchy, the ontology consists of views, which describe the sensory images at distinctive states, and actions, which represent trajectories of control laws allowing the robot to move from one view to another. Sets of associations [V,A,V'] among views, actions, and resulting views represent both declarative and imperative knowledge of routes or action procedures.

  • At the topological level of the hierarchy, the ontology consists of places, paths, and regions, with connectivity and containment relations. Relations among the distinctive states and trajectories defined by the control level, and among their summaries as views and actions at the procedural level, are effectively described by the topological network. This network can be used to guide exploration of new environments and to solve new route-finding problems. Using the network representation, navigation is not dependent on the accuracy, or even the existence, of metrical knowledge of the environment.

  • At the geometrical level of the hierarchy, the ontology for places, paths, and sensory features is extended to include metrical properties such as distance, direction, shape, etc. Only at this level are sensory features considered to be transduced properties of the environment. Geometrical features are extracted from sensory input, and represented as annotations on the places and paths of the topological network.

The structure of the Spatial Semantic Hierarchy provides robust performance. The control level definition of states and trajectories grounds the topological description of places and paths, which in turn supports navigation while more expensive sensor fusion methods accumulate metrical information. When metrical information is available, it can be used to optimize travel plans or to disambiguate apparently identical places, but when it is absent navigation and exploration remain possible.

The critical transition from continuous interaction to discrete symbols is provided by the behavior of local control laws.

Within a neighborhood in the environment, if a ``distinctiveness measure'' exists that provides an isolated local maximum for a hill-climbing control law, then the final state of the behavior resulting from following that control law can be considered a ``distinctive state'', which is described by a view at the procedural level, and as a place at the topological level.

Exploration and mapping can be viewed as search for suitable neighborhoods and distinctiveness measures, while accumulating their descriptions.

The Spatial Semantic Hierarchy approach contrasts with more traditional methods, which place geometrical sensor intepretation (the most expensive and error-prone step) on the critical path prior to creation of the topological map [Chatila and Laumond, 1985; Moravec and Elfes, 1985]. Our spatial representation hierarchy is consistent with levels 2 and 3 of Brooks' [1986] subsumption architecture. Research on reactive behavior languages [Brooks, 1990; Gat, 1991] can be viewed as exploring the nature of the control level.

Our fundamental claim is that the Spatial Semantic Hierarchy describes the structure of an agent's spatial knowledge, in a way that is relatively independent of its sensorimotor apparatus and the environment within which it moves. More specifically, an effective symbolic representation of an agent's large-scale spatial environment can be built within the SSH approach, requiring only that the agent's sensorimotor system and its environment satisfy a few conditions. Informally, these conditions are:

  • The agent's sensory inputs must change continuously with its actions, except at isolated discontinuities.

  • It must be possible to define features (distinctiveness measures) over the sensory inputs with sufficiently steep gradients and sufficiently isolated local maxima to support hill-climbing and trajectory-following control laws within local regions.

  • The result of a trajectory-following control law must be a state from which a hill-climbing control law will reach a distinctive state (though occasional failures can be tolerated).

These conditions have been adequate to support implementation of the SSH mapping framework on several different simulated and physical mobile robots.

Progress and Extensions

The Spatial Semantic Hierarchy has supported several productive lines of research.

Improved Formalization (in progress)

First, we are completing a new formalization of the SSH that clarifies its dependence on the properties of the environment and the sensorimotor system of the agent [Kuipers, manuscript]. Since the SSH is an ontological hierarchy, different formalisms are required for the different levels. The control level is formalized using the continuous mathematics of control theory. The procedural and topological levels are formalized in predicate logic. The geometrical level is formalized in terms of state estimators such as Kalman filters. The objects and relations defined at one level depend on assumptions about the results of behaviors at the lower levels. This formalization provides a rigorous description of a sophisticated and important body of human knowledge.

Heterogeneous Control

Second, in order to provide the control level with the power and flexibility it needs, we developed a method called heterogeneous control, for creating complex control laws by composing simpler ones and for proving properties of the resulting laws [Kuipers & Astrom 1991, 1994]. This work has helped us understand the practical success of fuzzy control. Our approach provides an improved way to specify and apply such laws, with a much clearer relationship to traditional control theory, and methods for combining traditional and qualitative analysis to prove properties of heterogeneous controllers. (This work has led to an award under the NSF Intelligent Control initiative.)

High-Speed Motion

Third, we have developed methods for extending the low-speed quasi-static trajectory control laws used by the SSH control level to high-speed dynamic control laws for skilled motion along complex routes [Froom, 1991, 1992]. These methods use the incomplete spatial knowledge represented by the topological and metrical maps to formulate initial trajectory targets for locally-optimal dynamic control. With subsequent practice in the environment, the targets are incrementally refined to improve performance. This method strikes a balance between two unrealistic poles: globally optimal trajectory planning based on perfect knowledge, and overly conservative planning based only on visible obstacles.

Hyper-Redundant Manipulator Planning

Fourth, we took an excursion into path planning for manipulators with many degrees of freedom. This problem is intractable to configuration-space methods, which are exponential in the number of degrees of freedom. Using a continuous model of a many-jointed manipulator, we developed an approximation-based path planning method with complexity polynomial in the complexity of the workspace, and for which the number of joints is a benefit (in the quality of the approximation) rather than an exponential cost [Hayashi & Kuipers, 1991, 1992].

Learning an Unknown Sensorimotor System

Fifth, we are developing methods for learning the sensory features (i.e. distinctiveness measures) and control laws required to support the SSH framework, starting with an uninterpreted sensorimotor system gathering experience in an unknown environment. We have developed a lattice of learning methods based on modest, domain-independent processing assumptions, that solve this problem for a significant class of robots [Pierce & Kuipers, 1990; Pierce, 1991a, 1991b, forthcoming doctoral dissertation]. With these methods, the robot's cognitive map is much more thoroughly grounded in its own sensorimotor experience, rather than being specified by a human programmer.

Physical Robots

Finally, we have been experimenting with physical robot platforms to embody the SSH approach to exploration and mapping. We have demonstrated the ability to follow control laws and identify distinctive places in the indoor office environment. The integration of this control level foundation with the SSH procedural, topological, and metrical levels is currently under way. We have also developed the very substantial hardware and software infrastructure required for physical robotics research [Lee, 1993; Kuipers, et al, 1993].

The bottom line is that we have demonstrated that the Spatial Semantic Hierarchy provides a rigorous description of a foundational domain of knowledge, and can be used to support and integrate a diverse body of research which shares a focus on spatial knowledge.