.REQUIRE "c.pub" SOURCE!FILE .SEND TITLEPAGE $( .BEGIN .CENTER .SKIPTOLINE 10 A COMPUTATIONAL LOGIC .SKIPLINES 5 by Robert S. Boyer and J Strother Moore .FLUSH LEFT .SKIPLINES 5 Copyright SRI International .END .)$ .ACKNOWLEDGE $( Mechanical theorem-proving is crucial to the automation of reasoning about computer programs. Today, few computer programs can be mechanically certified to be free of "bugs." The principal reason is the lack of mechanical theorem-proving power. In current research on automating program analysis, a common approach to overcoming the lack of mechanical theorem-proving power has been to require that the user direct a proof-checking program. That is, the user is required to construct a formal proof employing only the simplest rules of inference, such as __modus ponens_, instantiation of variables, or substitution of equals for equals. The proof-checking program guarantees the correctness of the formal proof. We have found proof-checking programs too frustrating to use because they require too much direction. Another approach to overcoming the lack of mechanical theorem-proving power is to use a weak theorem-proving program and to introduce axioms freely. Often these axioms are called "lemmas," but they are usually not proved. While using a proof checker is only frustrating, introducing axioms freely is deplorable. This approach has been abused so far as to be ludicrous:##we have seen researchers "verify" a program by first obtaining formulas that imply the program's correctness, then running the formulas through a simplifier, and finally assuming the resulting slightly simplified formulas as axioms. Some researchers admit that these "lemmas" ought to be proved, but never get around to proving them because they lack the mechanical theorem-proving power. Others, however, believe that it is reasonable to assume lots of "lemmas" and never try to prove them. We are strongly opposed to this latter attitude because it so completely undermines the spirit of proof, and we therefore reply to the arguments we have heard in its defense. .BULLET It is argued that the axioms assumed are obvious facts about the concepts involved. We say that a great number of mistakes in computer programs arise from false "obvious" observations, and we have already seen researchers present proofs based on false lemmas. Furthermore, the concepts involved in the complicated computer systems one hopes eventually to certify are so insufficiently canonized that one man's "obvious" is another man's "difficult" and a third man's "false." It is argued that one must assume some axioms. We agree, but observe that mathematicians do not contrive their axioms to solve the problem at hand. Yet often the "lemmas" assumed in program verification are remarkably close to the main idea or trick in the program being checked. It is argued that mathematicians use lemmas. We agree. In fact, our theorem-proving system relies heavily on lemmas. But no proof is complete until the lemmas have been proved, too. The assumption of lemmas in program proving often amounts to sweeping under the rug the hard and interesting inferences. It is argued that the definition of concepts necessarily involves the addition of axioms. But the axioms that arise from proper definitions, unlike most "lemmas," have a very special form that guarantees two important properties. First, adding a definition never renders one's theory inconsistent. Second, the definition of a concept involved in the proof of a subsidiary result (but not in the statement of one's main conjecture) can be safely forgotten. It does not matter if the definition was of the "wrong" concept. But an ordinary axiom (or "lemma"), once used, always remains a hypothesis of any later inference. If the axiom is "wrong," the whole proof may be worthless and the validity of the main conjecture is in doubt. .ENDCROWN .)$ .SEND ACKNOWLEDGMENTS $( One reason that researchers have had to assume "lemmas" so freely is that they have not implemented the principle of mathematical induction in their theorem-proving systems. Since mathematical induction is a fundamental rule of inference for the objects about which computer programmers think (e.g., integers, sequences, trees), it is surprising that anyone would implement a theorem prover for program verification that could not make inductive arguments. Why has the mechanization of mathematical induction received scant attention? Perhaps it has been neglected because the main research on mechanical theorem-proving, the resolution theorem-proving tradition (see Chang and Lee {REF CHANG} and Loveland {REF LOVELAND}), does not handle axiom schemes, such as mathematical induction. We suspect, however, that the mechanization of mathematical induction has been neglected because many researchers believe that the only need for induction is in program semantics. Program semantics enables one to obtain from a given program and specification some conjectures ("verification conditions") that imply the program is correct. The study of program semantics has produced a plethora of ways to use induction. Because some programs do not terminate, the role of induction in program semantics is fascinating and subtle. Great effort has been invested in mechanizing induction in program semantics. For example, the many "verification condition generation" programs implicitly rely upon induction to provide the semantics of iteration. But program semantics is not the only place induction is necessary. The conjectures that verification condition generators produce often require inductive proofs because they concern inductively defined concepts such as the integers, sequences, trees, grammars, formulas, stacks, queues, and lists. If you cannot make an inductive argument about an inductively defined concept, then you are doomed to assume what you want to prove. This book addresses the use of induction in proving theorems rather than the use of induction in program semantics. We will present a formal theory providing for inductively constructed objects, recursive definitions, and inductive proofs. Readers familiar with programming languages will see a strong stylistic resemblance between the language of our theory and that fragment of the programming language LISP known as "pure LISP" (see McCarthy et al. {REF LISPMANUAL}). We chose pure LISP as a model for our language because pure LISP was designed as a mathematical language whose formulas could be easily represented within computers. Because of its mathematical nature (e.g., one cannot "destructively transform" the ordered pair <7,3> into <8,3>), pure LISP is considered a "toy" programming language. It is an easy jump to the __non sequitur_:##"The language and theory presented in this book are irrelevant to real program analysis problems because they deal with a toy programming language." But that statement misses the point. It is indeed true that our theory may be viewed as a programming language. In fact, many programs are naturally written as functions in our theory. But our theory is a mathematical tool for making precise assertions about the properties of discrete objects. As such, it can be used in conjunction with any of the usual program specification methods to state and prove properties of programs written in any programming language whatsoever. When we began our research into proving theorems about recursive functions {REF JACM}, {REF THESIS}, we thought of ourselves as proving theorems only about pure LISP and viewed our work as an implementation of McCarthy's {REF MCCARTHYBASIS} functional style of program analysis. However, we now also regard recursion as a natural alternative to quantification when making assertions about programs. Using recursive functions to make assertions about computer programs no more limits the programming language to one that implements recursion than using the ordinary quantifiers limits the programming language to one that implements quantification! In this book we use both the functional style and Floyd's inductive assertion style {REF FLOYD} of program specification in examples. (For the benefit of readers not familiar with the program verification literature, we briefly explain both ideas when they are first used.) .)$ .SEND ACKNOWLEDGMENTS $( We have relegated the foregoing remarks to the preface because we are not in general interested in program semantics in this book. We are interested in how one proves theorems about inductively constructed objects. Our work on induction and theorem-proving in general has been deeply influenced by that of Bledsoe {REF BLEDSOEPROVER}, {REF BLEDSOELIMIT}. Some early versions of our work have been previously reported in {REF JACM}, {REF THESIS}, {REF SIGART}, {REF IEEE}, and {REF IJCAI77}. Work closely related to our work on induction has been done by Brotz {REF BROTZ}, Aubin {REF AUBIN}, and Cartwright {REF CARTWRIGHT}. We thank Anne Boyer, Jacqueline Handley, Paul Gloess, John Laski, Greg Nelson, Richard Pattis, and Jay Spitzen for their careful criticisms of this book. We also thank Jay Spitzen for showing us how to prove the prime factorization theorem. Finally, we thank our wives and children for their usually cheerful long-suffering through the years of late hours behind this book. Our work has been supported by the National Science Foundation under Grant MCS-7681425 and by the Office of Naval Research under Contract N00014-75-C-0816. We are very grateful to these agencies for their support. .ASIS Computer Science Laboratory SRI International Menlo Park, California December, 1978 .ENDASIS .)$ .SEC |INTRODUCTION| Unlike most texts on logic and mathematics, this book is about how to prove theorems rather than the proofs of specific results. We give our answers to such questions as: .CROWN(8,0,0) When should induction be used? How does one invent an appropriate induction argument? When should a definition be expanded? .ENDBULLET We assume the reader is familiar with the mathematical notion of equality and with the logical connectives "and," "or," "not," and "implies" of propositional calculus. We present a logical theory in which one can introduce inductively constructed objects (such as the natural numbers and finite sequences) and prove theorems about them. Then we explain how we prove theorems in our theory. We illustrate our proof techniques by using them to discover proofs of many theorems. For example, we formalize a version of the propositional calculus in our theory, and, using our techniques, we formally prove the correctness of a decision procedure for that version of propositional calculus. In another example, we develop elementary number theory from axioms introducing the natural numbers and finite sequences through the prime factorization theorem. Since our theory is undecidable, our proof techniques are not perfect. But we know that they are unambiguous, well integrated, and successful on a large number of theorems because we have programmed a computer to follow our rules and have observed the program prove many interesting theorems. In fact, the proofs we describe are actually those discovered by our program. .SS |Motivation| Suppose it were practical to reason, mechanically and with mathematical certainty, about computer programs. For example, suppose it were practical to prove mechanically that a given program satisfied some specification, or exhibited the same output behavior as another program, or executed in certain time or space bounds.{FNOTE |See Manna and Waldinger {REF MW} for . a description of the many other ways that formal reasoning can be . usefully applied in computer programming.|} Then there would follow a tremendous improvement in the reliability of computer programs and a subsequent reduction of the overall cost of producing and maintaining programs. To reason mechanically about programs, one must have a formal program semantics, a formal logical theory, and a mechanical theorem prover for that theory. The study of formal program semantics has provided a variety of alternative methods for specifying and modeling programs. But all the methods have one thing in common:##they reduce the question "Does this program have the desired property?" to the question "Are these formulas theorems?" Because of the nature of computers, the formulas in question almost exclusively involve inductively constructed mathematical objects: the integers, finite sequences, n-tuples, trees, grammars, expressions, stacks, queues, buffers, etc. Thus, regardless of which program semantics we use to obtain the formulas to be proved, our formal theory and mechanical theorem prover must permit definition and proof by induction. This book is about such a theory and a mechanical theorem prover for it. .SS |Our Formal Theory| We will present a logical theory that we have tailored to the needs of thinking about computer programs. It provides for the introduction of new "types" of objects, a general principle of induction on well-founded relations (Noetherian Induction {REF BOURBAKI}), and a principle permitting the definition of recursive functions. Recursive functions offer such a powerful form of expression when dealing with discrete mathematics (such as underlies computer programs) that we do not use any additional form of quantification.{FNOTE |The program of using recursive . functions and induction to understand computer programs, and the use of . computers to aid the generation of the proofs, were . begun by McCarthy {REF MCCARTHYCHECKING}, {REF MCCARTHYBASIS}. . See also Burstall {REF BURSTALL}. . The idea of using recursive functions and induction but no . other form of quantification in the foundations . of mathematics (or at least of arithmetic) was first presented . by Skolem in 1919 {REF SKOLEM}. See also Goodstein {REF GOODSTEIN}.|} .SS |Proof Techniques| After defining our formal theory, we describe many techniques we have developed for proving theorems in it. We devote eleven {SECTIONORCHAPTER}s to the description of these techniques and how, when, and where they should be applied to prove theorems. The most important of these techniques is the use of induction. The formulation of an induction argument for a conjecture is based on an analysis of the recursive definitions of the concepts involved in the conjecture. Thus the use of recursively defined functions facilitates proving theorems about inductively defined objects. Many of the other proof techniques are designed to support our induction heuristics. .SS |Examples| All the techniques are illustrated with examples. Most of our techniques are first illustrated with simple theorems about functions on lists and trees. These elementary functions are simple to define and are worth knowing if one is interested in mechanical theorem-proving (as we assume many readers will be). In addition, it is more fun to work through the proofs of novel theorems than through the proofs of, say, the familiar theorems of elementary number theory. We have also included four complicated examples, chosen from several different subject domains, to illustrate the general applicability of the theory and our proof methods. In the first such example, we write a tautology checker as a recursive function on trees representing formulas in propositional calculus. We exercise the theory and proof techniques in an interesting way by stating and proving that the tautology checker always returns an answer, recognizes only tautologies, and recognizes all tautologies. This example serves two important purposes:##it illustrates the theory and proof techniques in use, and it gives the reader a precise definition of a simple mechanical theorem prover (i.e., the tautology checker) without requiring a digression into programming languages or computational issues. In the second major example, we prove the correctness of a simple algorithm that "optimizes" and "compiles" arithmetic expressions into sequences of instructions for a hand-held calculator. In order to specify the algorithm and the calculator, we use (and briefly explain) McCarthy's "functional semantics" {REF MCCARTHYBASIS} for programs. This example is the first part of the book that deals explicitly with computational (rather than mathematical) ideas. Because the example is simple (compared to real compilers and real hardware) and because it is ideally suited to our mathematical language, the reader unfamiliar with computing should be able to read this {SECTIONORCHAPTER} comfortably (and even learn the basic ideas behind compiling expressions and one style of program specification). In the third major example, we prove the correctness of a fast string searching algorithm. The algorithm finds the first occurrence of one character sequence in another (if such an occurrence exists), and on the average, is the fastest such algorithm currently known. In this example we explain and use a second program specification style, called the "inductive assertion" method (Floyd {REF FLOYD}). Finally, we demonstrate that the theory and proof techniques can be used to prove theorems that are generally considered difficult (rather than just theorems that have not been generally considered) by proving our statement of the unique prime factorization theorem:##(a) any positive integer can be represented as the product of a finite sequence of primes and (b) any two finite sequences of primes with the same product are in fact permutations of one another. We derive this theorem starting from the axioms of Peano arithmetic and finite sequences. .SS |Our Mechanical Theorem Prover| It is one thing to describe a loosely connected set of heuristics that a human might use to discover proofs and quite a different thing to formulate them so that a machine can use them to discover proofs.{FNOTE |For a survey of non-resolution theorem-proving, . see {REF BLEDSOESURVEY}.|} All of the heuristics described have been implemented and together comprise our automatic theorem-proving program. Our description of the heuristics makes little or no reference to the fact that they can be mechanized. However, we want competent readers to be able to reproduce and build upon our results. Thus, we are more precise than we would be had we desired only to teach a student how we prove theorems. All of the example proofs discussed in this book are actually produced by our program. We present the theorem prover's proofs in an informal style. While describing proof techniques we present small inference steps, but as we move on to the interesting examples we ascend to the level upon which humans usually deal with proofs. By the time we reach the prime factorization theorem, the proofs we describe are very much like those in number theory textbooks:##we make large inference leaps, use lemmas without describing their proofs, and dismiss whole theorems with phrases like "the proof is by induction on X." However, our high-level descriptions of the machine's proofs should not be confused with what the machine does:##before pronouncing a conjecture proved, the machine discovers a complete sequence of applications of our proof techniques establishing the conjecture from axioms and previously proved lemmas. Furthermore, given the detailed presentation of our proof techniques and their orchestration, the reader should also be able to discover the proofs mechanically. It is perhaps easiest to think of our program much as one would think of a reasonably good mathematics student:##given the axioms of Peano, he could hardly be expected to prove (much less discover) the prime factorization theorem. However, he could cope quite well if given the axioms of Peano and a list of theorems to prove (e.g., "prove that addition is commutative," ... "prove that multiplication distributes over addition," ... "prove that the result returned by the GCD function divides both of its arguments," ... "prove that if the products over two sequences of primes are identical, then the two sequences are permutations of one another"). The examples discussed are just part of a standard sequence of approximately 400 definitions and theorems the program is expected to reestablish whenever we incorporate a new technique. The sequence contains theorems some readers will have trouble proving before they read the heuristics. In {APP APPTHMS} we list the entire sequence as evidence that the heuristics are generally useful and well integrated. It should be noted that when the latter theorems of the sequence are proved the theorem prover is aware of the earlier theorems. The fact that previously proved results are remembered permits their use as lemmas in later proofs. The theorem prover would fail to prove many of its most interesting theorems in the absence of such lemmas. However, the more a theorem-proving program knows, the more difficult it becomes for it to prove theorems because the program is often tempted to consider using theorems that have no relevance to the task at hand. That our theorem-proving program does prove the entire list of theorems sequentially is a measure of its capacity to avoid being confused by what it knows. .SS |Artificial Intelligence or Logic?| While drawing heavily upon important facts of mathematical logic, our research is really more artificial intelligence than logic. The principal question we ask (and sometimes answer) is "how do we discover proofs?" It has been argued that mechanical theorem-proving is an impossible task because certain theories are known to be undecidable or super-super-exponential in complexity. Such metamathematical results are, of course, no more of an impediment to mechanical theorem-proving than to human theorem-proving.{FNOTE |Nils Nilsson, private communication.|} They only make the task more interesting. .SS |Organization| This book is structured as follows. We begin informally in {YONSEC SECOVERVIEW} by sketching our logical theory, formulating a simple conjecture, and proving that conjecture by using many of the techniques we will discuss. We present the theory formally in {YONSEC SECTHEFORMALTHEORY}. In {YONSEC SECTAUTOLOGYCHECKER} we state and prove in our theory the correctness of a function for recognizing tautologies. In {YONSEC SECPROVE} through {YONSEC SECINDUCTION} we develop and explain our heuristics for proving theorems. These heuristics are illustrated with simple proof steps taken from many theorems. In {YONSEC SECELEMENTARY}, our theorem prover develops an interesting initial segment of elementary number theory. Finally, in {YONSEC SECCOMPILER} through {YONSEC SECPRIMEFACTORIZATION} we discuss three complex examples:##the proof of correctness of a simple optimizing compiler for arithmetic expressions, the correctness of a fast string searching algorithm, and the proof of the unique prime factorization theorem. Readers interested in ascertaining the power of our automatic theorem-proving program and in seeing how recursive functions can be used to formalize a variety of problems should first read the informal overview ({YONSEC SECOVERVIEW}) and then look at the {SECTIONORCHAPTER}s on the tautology checker ({YONSEC SECTAUTOLOGYCHECKER}), compiler ({YONSEC SECCOMPILER}), string searching algorithm ({YONSEC SECFSTRPOS}), and prime factorization theorem ({YONSEC SECPRIMEFACTORIZATION}). The book has three appendixes. The first lists all the definitions and theorems the theorem prover routinely proves. The second appendix presents implementation details concerning the introduction of new types. The third appendix exhibits, in clausal form, the axioms of our theory. .SEC |A SKETCH OF THE THEORY AND TWO SIMPLE EXAMPLES|, SECOVERVIEW: To prove theorems formally one must have in mind a formal theory in which the proofs are to be constructed. We will present our formal theory in {YONSEC SECTHEFORMALTHEORY}. Following the precise presentation of the theory, we describe, in great detail, how we discover proofs. However, before descending into detail, we here sketch the theory informally, exhibit and explain several simple recursive function definitions, and (without regard for mechanization) work through several simple inductive proofs. .SS |An Informal Sketch of the Theory| .SSS |IF and EQUAL| We employ the prefix notation of Church's lambda calculus {REF CHURCHLAMBDA} and McCarthy's LISP {REF LISPMANUAL} when writing down terms. Thus, we write .ASIS (PLUS (H X) (B)) .ENDASIS when others write .ASIS PLUS(H(X),B()) .ENDASIS or even .ASIS H(X) + B() .ENDASIS to denote the application of the two-argument function PLUS to (1) the application of H to X and (2) the constant (i.e., function of no arguments) B. We wish to make it easy to define new functions and predicates on inductively constructed objects. For example, given axioms for the natural numbers we would like to define functions such as addition and multiplication, and predicates such as whether a given number is prime; given the axioms for sequences we would like to define operations such as sequence concatenation and predicates such as whether one sequence is a permutation of another. We find it most convenient to define new functions with equality axioms of the form: .ASIS (f x[&1] ... x[&n]) = body .ENDASIS where certain constraints are placed on f, the x[&i], and the term body. It is often necessary to make conditional definitions. For example, (PLUS X Y) is defined to be one thing if X=0, and another thing if X=$m-1/0. In order for definitions to be equality axioms, the right-hand side of the definition, body, must be a term. But in the usual treatment of logic it is not permitted to embed a proposition (such as X=0) in a term. Thus, we find it necessary to reproduce the logic of truth functions and equality at the term level. We add to the usual propositional calculus with variables, function symbols, and equality an axiom supposing the existence of two distinct constants, (TRUE) and (FALSE) (henceforth written T and F), and four axioms defining the new function symbols EQUAL and IF. The axioms are: .ASIS Axiom. T /$M-1= F Axiom. X = Y -> (EQUAL X Y) = T Axiom. X /$M-1= Y -> (EQUAL X Y) = F Axiom. X = F -> (IF X Y Z) = Z Axiom. X /$M-1= F -> (IF X Y Z) = Y. .ENDASIS We can paraphrase the above axioms as follows. T is not F. For all X and Y, (EQUAL X Y) is T if X is Y, and is F if X is not Y. (IF X Y Z) is Y if X is non-F and is Z otherwise. Thus (IF (EQUAL X 0) Y Z) is equal to Y if X is 0 and is equal to Z otherwise. Strictly speaking, we never define "predicates," for they can only be used in the construction of formulas and thus cannot be used in terms (such as function bodies). Without loss of generality, we restrict our attention to functions. For example, we will later define the __function_ PRIME, so that (PRIME X) is T if X is prime and F otherwise. To permit terms or functions to test logical combinations of expressions, it is convenient to define the functional versions of "not," "and," "or," and "implies." For example, we want (NOT P) to be T if P is F and to be F if P is not F. Similarly, we want (AND P Q) to be T if both P and Q are non-F, and F otherwise. Thus, we define the functions NOT, AND, OR, and IMPLIES as follows: .ASIS Definition. (NOT P) = (IF P F T) Definition. (AND P Q) = (IF P (IF Q T F) F) Definition. (OR P Q) = (IF P T (IF Q T F)) Definition. (IMPLIES P Q) = (IF P (IF Q T F) T). .ENDASIS (We adopt the notational convention of treating AND and OR as though they took an arbitrary number of arguments. For example, (AND p q r) is an abbreviation for (AND p (AND q r)).) It is easy to show that these definitions capture the semantics of the ordinary logical connectives. For example, it is a theorem that: .ASIS (P/$M-1=F -> Q/$M-1=F) <-> (IMPLIES P Q) /$M-1= F. .ENDASIS Thus, it is also easy to prove that: .ASIS (IMPLIES (AND P Q) (OR P Q)) .ENDASIS is not equal to F. Because of our emphasis on terms rather than formulas we find it convenient to call a term, p, a "theorem" if it can be proved that p/$M-1=F. Of course, calling a term a "theorem" is an abuse of terminology, since theorems are in fact understood to be formulas. However, whenever we use a term, p, as though it were a formula, it is always acceptable to read the formula "p/$M-1=F" in its place. .SSS |Inductively Constructed Objects| Any theory concerned with the mathematics behind computing must provide inductively constructed objects. For example, it is clear that we must be able to talk formally about the natural numbers, and so we will add to our theory axioms for the natural numbers. In formalizing properties of programs we have found that it is convenient to allow the introduction of "new" types of inductively constructed objects, of which the integers are just a single example. To eliminate the possibility that the axioms for a new type will render the theory inconsistent (or not fully specify the properties of the type) we have included in our theory a general principle under which one can introduce new types. We call the principle the "shell" principle. The name "shell" derives from imagining the new objects to be colored structures encapsulating a fixed number of components (possibly of certain colors). It is actually with the shell principle that we add the axioms defining the nonnegative integers. We also use the principle to add the set of literal atoms (i.e., atomic symbols such as "NIL" and "X"), and the set of ordered pairs. For example, to axiomatize the set of ordered pairs we incant: .ASIS Add the shell CONS, with recognizer LISTP, and accessors CAR and CDR that return "NIL" on non-LISTP objects, .ENDASIS which is a shorthand for adding a set of axioms that specifies (CONS X Y) to be the red, say, ordered pair containing X and Y, LISTP to be the function that returns T if its argument is a red pair and F otherwise, and CAR and CDR to be the functions that return the first and second components of red pairs (or "NIL" if given an object other than a red pair). For example, here are some of the axioms added by the above incantation: .ASIS Axiom. LISTP.CONS: (LISTP (CONS X1 X2)) Axiom. CAR.CONS: (EQUAL (CAR (CONS X1 X2)) X1) Axiom. CDR.CONS: (EQUAL (CDR (CONS X1 X2)) X2) Axiom. CAR/CDR.ELIM: (IMPLIES (LISTP X) (EQUAL (CONS (CAR X) (CDR X)) X)) Axiom. CONS.EQUAL: (EQUAL (EQUAL (CONS X1 X2) (CONS Y1 Y2)) (AND (EQUAL X1 Y1) (EQUAL X2 Y2))) Axiom. CAR.LESSP: (IMPLIES (LISTP X) (LESSP (COUNT (CAR X)) (COUNT X))) Axiom. CDR.LESSP: (IMPLIES (LISTP X) (LESSP (COUNT (CDR X)) (COUNT X))). .ENDASIS LESSP is the "less-than" function on the nonnegative integers. The complete set of axioms for the CONS shell is given schematically in {YONSEC SECTHEFORMALTHEORY}. Among the axioms not shown above are, for example, axioms specifying that (CAR X) is "NIL" if X is not a LISTP, and that the set of LISTP objects does not overlap the set of numbers, literal atoms, or other types. We use ordered pairs in a variety of ways. For example, to talk about finite sequences (sometimes called "lists") we thing of "NIL" as the empty sequence, and we think of (CONS X Y) as the sequence whose first element is X and whose remaining elements are those of Y. Thus, we think of .ASIS (CONS 1 (CONS 2 (CONS 3 "NIL"))) .ENDASIS as the sequence containing 1, 2, and 3. .SSS |Recursively Defined Functions| Our theory includes a principle of definition allowing the introduction of recursive functions. The principle is based on the notion of well-founded relations. In particular, (f x[&1] ... x[&n]) may be defined to be some term, body, involving "recursive calls" of the form (f y[&1] ... y[&n]), provided there is a measure and well-founded relation such that in every recursive call the measure of the y[&i] is smaller, according to the well-founded relation, than the measure of the x[&i]. Since "well-founded" means that there is no infinite sequence of objects, each of which is smaller than its predecessor in the sequence, the above restriction, together with a few simple syntactic requirements, ensures that there exists one and only one function satisfying the definition. The existence of a function satisfying the definition implies that adding the definition as an axiom does not render the theory inconsistent. We explicitly assume that LESSP is well-founded. Furthermore, when the CONS shell is added, the axioms CAR.LESSP and CDR.LESSP, above, inform us that the CAR and CDR of a pair both have smaller size (as measured by the function COUNT) than the pair itself. Thus, if in every recursive call of a function some particular argument is replaced by its own CAR or CDR, and we can establish that in each such case that argument is a pair, then the principle of definition would admit the function. To illustrate a definition by recursion, suppose we wished to concatenate two finite sequences X and Y. If X is empty, then the result of concatenating X and Y is just Y. If X is nonempty, then X has a first element, (CAR X), and some remaining elements, (CDR X). The concatenation of X and Y in this case is the sequence whose first element is (CAR X) and whose remaining elements are those in the concatenation of (CDR X) and Y. Formally, we define the function APPEND so that (APPEND X Y) is the concatenation of X and Y: .ASIS Definition. (APPEND X Y) = (IF (LISTP X) (CONS (CAR X) (APPEND (CDR X) Y)) Y). .ENDASIS APPEND is a particularly simple recursive function. It is easy to see why it is accepted under our principle of definition:# the axiom CDR.LESSP, above, establishes that (COUNT X) gets LESSP-smaller in each (i.e., the) recursive call. Later in the book we will introduce more interesting recursive functions -- functions for which a measure as obvious as the size of one argument will not suffice to justify their definition. By the axioms of equality, we can replace any instance of (APPEND X Y) with the corresponding instance of the right-hand side of the definition above. For example, we can show that (APPEND (CONS A D) Y) is equal to (CONS A (APPEND D Y)) as follows. By definition, (APPEND (CONS A D) Y) is equal to the instantiated body: .ASIS (IF (LISTP (CONS A D)) (CONS (CAR (CONS A D)) (APPEND (CDR (CONS A D)) Y)) Y). .ENDASIS By the axiom LISTP.CONS, above, (LISTP (CONS A D)) is non-F. But, by the axioms defining IF, we know that when the first argument in an IF-expression is non-F, the IF-expression is equal to its second argument. Thus, the above IF-expression can be replaced by its second argument: .ASIS (CONS (CAR (CONS A D)) (APPEND (CDR (CONS A D)) Y)). .ENDASIS Finally, applying the axioms CAR.CONS and CDR.CONS, above, we rewrite (CAR (CONS A D)) to A, and (CDR (CONS A D)) to D, obtaining: .ASIS (CONS A (APPEND D Y)) .ENDASIS as desired. Similarly, we can prove that (APPEND "NIL" Y) is Y. We can use a series of such simplifications to "compute" the result of concatenating the sequence containing 1, 2, and 3, to the sequence containing 4 and 5: .ASIS (APPEND (CONS 1 (CONS 2 (CONS 3 "NIL"))) (CONS 4 (CONS 5 "NIL"))) = (CONS 1 (APPEND (CONS 2 (CONS 3 "NIL")) (CONS 4 (CONS 5 "NIL")))) = (CONS 1 (CONS 2 (APPEND (CONS 3 "NIL") (CONS 4 (CONS 5 "NIL"))))) = (CONS 1 (CONS 2 (CONS 3 (APPEND "NIL" (CONS 4 (CONS 5 "NIL")))))) = (CONS 1 (CONS 2 (CONS 3 (CONS 4 (CONS 5 "NIL"))))). .ENDASIS Using recursion, we can introduce, under our principle of definition, almost all the concepts in which we are interested. Indeed, recursion is a very important tool when dealing with inductively constructed objects such as the integers or sequences. For instance, recursion can be regarded as a form of quantification:##we can use recursion to check that all or some elements of a sequence have some property. We do not use any other form of quantification in our formal theory. .SSS |Induction| Because we are interested in proving theorems about inductively constructed objects such as the natural numbers, sequences, pairs, etc., we need a rule of inference that embodies the construction process itself. For example, we know that if X is a pair, then it can be constructed by applying CONS to two "previously" constructed objects, namely (CAR X) and (CDR X). Thus, we can prove that some property holds for all X by considering two cases. The first case, called the "base case," is to prove that all nonpairs have the property in question. The second case, called the "induction step," is to assume that X is a pair and that (CAR X) and (CDR X) have the desired property, and to prove that X has the property. Such a proof is called a proof by "induction." The magic idea behind induction, the idea that made it appear unsound to both authors when they first encountered the idea in high school, is that one gets to assume instances of the conjecture being proved during its own proof. Why is induction sound? For example, why can we conclude, after proving the two cases above, that any X must have the property? Suppose some object does not have the property. Then let X be a minimal object not having the property, where we compare two such objects by comparing their COUNTs using the LESSP function. There __is_ a minimal object not having the property in question since, by the well-foundedness of LESSP, there is no infinite sequence of objects, each of which has smaller COUNT than the previous one. X must be a pair, because otherwise the base case establishes that X has the property. But if X is a pair, (CAR X) and (CDR X) are both smaller than X (as measured by COUNT and LESSP). Therefore, since everything smaller than X has the property in question, (CAR X) and (CDR X) have the property. But in that case the induction step establishes that X must have it also. Thus the assumption that some object does not have the property has lead to a contradiction. In general, the induction principle in our theory permits one to assume arbitrary instances of the conjecture being proved, provided those instantiations decrease some measure in some well-founded ordering. Because our induction principle allows the use of arbitrary measures and well-founded relations, we can make inductive arguments that are much more subtle than the "structural induction" illustrated above.{FNOTE |The use of structural induction . to prove programs correct is beautifully described by Burstall in {REF BURSTALL}.|} We will illustrate more subtle inductions later in the book. Throughout this {SECTIONORCHAPTER} we confine ourselves to structural induction. .SS |A Simple Inductive Proof| One of the hardest problems in discovering an inductive proof is discovering an appropriate application of the principle of induction itself. But the similarity between recursion and induction offers an insight into the problem. For example, suppose we were trying to prove some conjecture involving the expression (APPEND A term). When APPEND was introduced into the theory we were required to exhibit a measure of its arguments that was decreasing. In particular, every time APPEND recurses, the COUNT of its first argument goes down. Thus (APPEND A term) "suggests" an induction:##were we to apply the definition of APPEND to "open up" (APPEND A term) we would find ourselves recursively decomposing A into its constituents and would want information about those constituents. But by the observation above, we know those constituents are smaller than A in some well-founded order. Thus, by the induction principle, we can supply ourselves with inductive instances about those constituents. We illustrate the above reasoning with a simple example. We will prove: .ASIS Theorem. ASSOCIATIVITY.OF.APPEND: (EQUAL (APPEND (APPEND A B) C) (APPEND A (APPEND B C))). .ENDASIS .BEGIN .SINGLE SPACE Name the conjecture *1. The proof is by induction. Three terms -- each APPEND expression with a variable in its first argument -- suggest "plausible" inductions. Two of these inductions are on A and the third is on B. All occurrences of A are in argument positions being recursively decomposed. Thus, by appropriately opening up APPEND expressions we can reexpress *1 in terms of (CDR A), about which we can supply an inductive hypothesis. (In this case we say that the induction on A is "unflawed.") The induction on B is flawed: B occurs sometimes as the first argument to APPEND and sometimes as the second (which is never changed in recursion). No matter how we expand APPEND expressions, the conjecture will still involve B and (CDR B), and we are unable to supply an induction hypothesis about B while inducting on B. Thus, we will induct on A, using the following scheme: .END .ASIS (AND (IMPLIES (NOT (LISTP A)) (p A B C)) (IMPLIES (AND (LISTP A) (p (CDR A) B C)) (p A B C))). .ENDASIS .BEGIN .SINGLE SPACE That is, letting (p A B C) be a schematic representation of *1, we will prove that (p A B C) holds when A is not a LISTP, and we will prove that if A is a LISTP and (p (CDR A) B C) holds, then (p A B C) holds. The induction is sound because the axiom CDR.LESSP establishes that (COUNT A) decreases according to the well-founded relation LESSP in the induction step of the scheme. Instantiating the scheme above with *1 we obtain two new goals. We block indent the proofs of the two goals: .END .ASIS Case 1. (IMPLIES (NOT (LISTP A)) (EQUAL (APPEND (APPEND A B) C) (APPEND A (APPEND B C)))). This is the base case. Since A is non-LISTP, (APPEND A B) is equal to its second argument, B, by the definition of APPEND. Similarly, (APPEND A (APPEND B C)) is equal to its second argument, (APPEND B C). By rewriting these two terms in the conclusion above we obtain: (IMPLIES (NOT (LISTP A)) (EQUAL (APPEND B C) (APPEND B C))), which simplifies, using the axiom X=Y -> (EQUAL X Y)=T, to: (TRUE). Case 2. (IMPLIES (AND (LISTP A) (EQUAL (APPEND (APPEND (CDR A) B) C) (APPEND (CDR A) (APPEND B C)))) (EQUAL (APPEND (APPEND A B) C) (APPEND A (APPEND B C)))). This is the induction step. Since A is a LISTP here, (APPEND A Y), for any Y, is equal to (CONS (CAR A) (APPEND (CDR A) Y)), by the definition of APPEND. Thus, we can "unfold" the two (APPEND A term) expressions in the conclusion above to get: (IMPLIES (AND (LISTP A) (EQUAL (APPEND (APPEND (CDR A) B) C) (APPEND (CDR A) (APPEND B C)))) (EQUAL (APPEND (CONS (CAR A) (APPEND (CDR A) B)) C) (CONS (CAR A) (APPEND (CDR A) (APPEND B C))))). But, by opening up the definition of APPEND, we know that (APPEND (CONS A D) Y) is equal to (CONS A (APPEND D Y)). Thus, we can expand the first APPEND term in the conclusion above, to get: (IMPLIES (AND (LISTP A) (EQUAL (APPEND (APPEND (CDR A) B) C) (APPEND (CDR A) (APPEND B C)))) (EQUAL (CONS (CAR A) (APPEND (APPEND (CDR A) B) C)) (CONS (CAR A) (APPEND (CDR A) (APPEND B C))))). Note that the conclusion above is of the form: (EQUAL (CONS x y) (CONS u v)). But according to CONS.EQUAL, two pairs are equal if and only if the components are pairwise equal. That is, the concluding equality may be rewritten to the conjunction of (EQUAL x u) and (EQUAL y v). But in the above application x and u are identical -- they are both (CAR A). Thus, we replace the concluding equality above with (EQUAL y v): (IMPLIES (AND (LISTP A) (EQUAL (APPEND (APPEND (CDR A) B) C) (APPEND (CDR A) (APPEND B C)))) (EQUAL (APPEND (APPEND (CDR A) B) C) (APPEND (CDR A) (APPEND B C)))). However, this simplifies to: (TRUE), because the conclusion is identical to the second hypothesis. In particular, by opening up the correct APPEND expressions we transformed the induction conclusion into the induction hypothesis. That finishes the proof of *1. Q.E.D. .ENDASIS .SS |A More Difficult Problem| The proof of the associativity of APPEND illustrates two of the proof techniques we describe later:##induction and simplification. Some of the other heuristics we will describe are: .CROWN(8,0,0) It is sometimes useful to trade "bad" terms for "good" ones by re-representing terms in the conjecture. For example, one might transform a conjecture about I and I-J to one about K+J and K, trading difference for addition, by replacing all occurrences of I by K+J. One obvious way to use an equality hypothesis, (EQUAL x y), is to substitute x for y throughout the conjecture. But it is sometimes useful to replace only some of the y's by x's and then to "throw away" the equality hypothesis, so as to produce a more general conjecture to prove by induction. We call such use of an equality hypothesis "cross-fertilization." In proofs by induction, it is easier to prove strong theorems than weak ones, because strong theorems permit one to obtain strong induction hypotheses with which to work. Thus, one should look for ways to generalize a conjecture to be proved by induction. .ENDCROWN We illustrate these proof techniques by working through another simple example. Consider the idea of the "fringe" of a tree of CONSes. Given the tree: .ASIS * / \ * 3 / \ 1 2 .ENDASIS that is, (CONS (CONS 1 2) 3), we wish to return the sequence of tips of the tree, 1, 2, 3, that is, (CONS 1 (CONS 2 (CONS 3 "NIL"))). An obvious way to "flatten" a tree, X, is to ask first whether (LISTP X) is true. If so, X is a fork in the tree. The fringe of a fork is the concatenation (as with APPEND) of the fringes of the left and right subtrees (i.e., the recursively obtained fringes of (CAR X) and (CDR X)). If (LISTP X) is false, X is a tip of the tree. The fringe of a tip is the singleton sequence containing the tip, i.e., (CONS X "NIL").{FNOTE |Computing the fringe of binary trees . brings to mind one of the lesser chestnuts of Artificial Intelligence: . how to compute that two binary trees have the same fringe . without using much storage (in particular, without simply computing . both fringes and comparing them). We believe the original formulation of . the problem was by Carl Hewitt. Burstall and Darlington {REF BURSTALLTRANS} have investigated . the solution of this problem by automatic program transformations. . McCarthy (private communication) has written a recursive function, . exhibited as SAMEFRINGE in {APP APPTHMS}, that is more or less the . recursive realization of the usual "coroutining" solution . to the problem. Our system can prove the correctness of SAMEFRINGE, and we refer . the interested reader to {APP APPTHMS}. The problem of computing . the fringe of a tree relates to . computer science as a whole, since similar tree processing underlies . such fundamental algorithms as parsers and compilers.|} The above description can be immediately transcribed into the definition of a recursive function that we will call FLATTEN. We exhibit the definition of FLATTEN later, when we formalize the problem. Considered as a recipe for computing the fringe of a tree, FLATTEN is somewhat inefficient. It visits every fork and tip of the tree once, but for every fork, the concatenation process revisits every tip of the left-hand branch. In trees heavily nested to the left, FLATTEN computes in time n^^2^, where n is the number of tips. John McCarthy (private communication) suggested the following more efficient algorithm, which computes in linear time and which we call MC.FLATTEN. The basic idea is that to collect all the tips in a tree one can initialize a collection site to "NIL", and then sweep the tree adding tips to the site as they are encountered. If the tips are added to the front of the collection site, the answer will be in exactly the reverse of the order in which the tips were visited. If we sweep the tree from right to left (instead of left to right), the result will be in the order desired. To write the algorithm as a recursive function, we use a second argument, ANS, as the collection site. At a tip, we add the tip to the front of ANS (with CONS) and return the new list as the answer. At a fork, we first collect the tips in the CDR (the right branch), and, using the resulting answer as the collection site, we then collect the tips in the CAR (the left branch). Thus, (MC.FLATTEN X ANS) should append the fringe of X onto ANS. The formal statement of the relationship between MC.FLATTEN and FLATTEN is: .ASIS (EQUAL (MC.FLATTEN X ANS) (APPEND (FLATTEN X) ANS)). .ENDASIS In the next {SUBSECTIONORSECTION} we define FLATTEN and MC.FLATTEN formally, explain why they are admitted under our principle of definition, and then work through the proof of the conjecture above. .SS |A More Difficult Proof| .ASIS Definition. (FLATTEN X) = (IF (LISTP X) (APPEND (FLATTEN (CAR X)) (FLATTEN (CDR X))) (CONS X "NIL")) The lemmas CAR.LESSP and CDR.LESSP establish that (COUNT X) goes down according to the well-founded relation LESSP in each recursive call. Hence, FLATTEN is accepted under the definition principle. Observe that (LISTP (FLATTEN X)) is a theorem. Definition. (MC.FLATTEN X ANS) = (IF (LISTP X) (MC.FLATTEN (CAR X) (MC.FLATTEN (CDR X) ANS)) (CONS X ANS)) The lemmas CDR.LESSP and CAR.LESSP establish that (COUNT X) decreases according to the well-founded relation LESSP in each recursive call. Hence, MC.FLATTEN is accepted under the definition principle. Note that (LISTP (MC.FLATTEN X ANS)) is a theorem. Theorem. FLATTEN.MC.FLATTEN: (EQUAL (MC.FLATTEN X ANS) (APPEND (FLATTEN X) ANS)) Name the conjecture *1. Let us appeal to the induction principle. There are two plausible inductions. However, they merge into one likely candidate induction. We will induct according to the following scheme: (AND (IMPLIES (NOT (LISTP X)) (p X ANS)) (IMPLIES (AND (LISTP X) (p (CAR X) (MC.FLATTEN (CDR X) ANS)) (p (CDR X) ANS)) (p X ANS))). The inequalities CAR.LESSP and CDR.LESSP establish that the measure (COUNT X) decreases according to the well-founded relation LESSP in the induction step of the scheme. Note, however, the inductive instances chosen for ANS. The above induction scheme produces two new formulas: Case 1. (IMPLIES (NOT (LISTP X)) (EQUAL (MC.FLATTEN X ANS) (APPEND (FLATTEN X) ANS))), which simplifies, applying CDR.CONS and CAR.CONS, and expanding the definitions of MC.FLATTEN, FLATTEN and APPEND, to: (TRUE). Case 2. (IMPLIES (AND (LISTP X) (EQUAL (MC.FLATTEN (CAR X) (MC.FLATTEN (CDR X) ANS)) (APPEND (FLATTEN (CAR X)) (MC.FLATTEN (CDR X) ANS))) (EQUAL (MC.FLATTEN (CDR X) ANS) (APPEND (FLATTEN (CDR X)) ANS))) (EQUAL (MC.FLATTEN X ANS) (APPEND (FLATTEN X) ANS))). This simplifies, unfolding the definitions of MC.FLATTEN and FLATTEN, to: (IMPLIES (AND (LISTP X) (EQUAL (MC.FLATTEN (CAR X) (MC.FLATTEN (CDR X) ANS)) (APPEND (FLATTEN (CAR X)) (MC.FLATTEN (CDR X) ANS))) (EQUAL (MC.FLATTEN (CDR X) ANS) (APPEND (FLATTEN (CDR X)) ANS))) (EQUAL (MC.FLATTEN (CAR X) (MC.FLATTEN (CDR X) ANS)) (APPEND (APPEND (FLATTEN (CAR X)) (FLATTEN (CDR X))) ANS))). Appealing to the lemma CAR/CDR.ELIM, we now replace X by (CONS Z V) to eliminate (CAR X) and (CDR X). This generates: (IMPLIES (AND (LISTP (CONS Z V)) (EQUAL (MC.FLATTEN Z (MC.FLATTEN V ANS)) (APPEND (FLATTEN Z) (MC.FLATTEN V ANS))) (EQUAL (MC.FLATTEN V ANS) (APPEND (FLATTEN V) ANS))) (EQUAL (MC.FLATTEN Z (MC.FLATTEN V ANS)) (APPEND (APPEND (FLATTEN Z) (FLATTEN V)) ANS))), which further simplifies, clearly, to: (IMPLIES (AND (EQUAL (MC.FLATTEN Z (MC.FLATTEN V ANS)) (APPEND (FLATTEN Z) (MC.FLATTEN V ANS))) (EQUAL (MC.FLATTEN V ANS) (APPEND (FLATTEN V) ANS))) (EQUAL (MC.FLATTEN Z (MC.FLATTEN V ANS)) (APPEND (APPEND (FLATTEN Z) (FLATTEN V)) ANS))). We use the first equality hypothesis by cross-fertilizing: (APPEND (FLATTEN Z) (MC.FLATTEN V ANS)) for (MC.FLATTEN Z (MC.FLATTEN V ANS)) and throwing away the equality. This generates: (IMPLIES (EQUAL (MC.FLATTEN V ANS) (APPEND (FLATTEN V) ANS)) (EQUAL (APPEND (FLATTEN Z) (MC.FLATTEN V ANS)) (APPEND (APPEND (FLATTEN Z) (FLATTEN V)) ANS))). We now use the above equality hypothesis by cross-fertilizing (APPEND (FLATTEN V) ANS) for (MC.FLATTEN V ANS) and throwing away the equality. We thus obtain: (EQUAL (APPEND (FLATTEN Z) (APPEND (FLATTEN V) ANS)) (APPEND (APPEND (FLATTEN Z) (FLATTEN V)) ANS)), which we generalize by replacing (FLATTEN V) by Y and (FLATTEN Z) by A. We restrict the new variables by appealing to the type restriction lemma noted when FLATTEN was introduced. The result is: (IMPLIES (AND (LISTP Y) (LISTP A)) (EQUAL (APPEND A (APPEND Y ANS)) (APPEND (APPEND A Y) ANS))), which we will finally name *1.1. Let us appeal to the induction principle. The recursive terms in the conjecture suggest three inductions. They merge into two likely candidate inductions. However, only one is unflawed. We will induct according to the following scheme: (AND (IMPLIES (NOT (LISTP A)) (p A Y ANS)) (IMPLIES (AND (LISTP A) (p (CDR A) Y ANS)) (p A Y ANS))). The inequality CDR.LESSP establishes that the measure (COUNT A) decreases according to the well-founded relation LESSP in the induction step of the scheme. The above induction scheme generates two new conjectures: Case 1. (IMPLIES (AND (NOT (LISTP (CDR A))) (LISTP Y) (LISTP A)) (EQUAL (APPEND A (APPEND Y ANS)) (APPEND (APPEND A Y) ANS))). This simplifies, appealing to the lemmas CDR.CONS, CAR.CONS and CONS.EQUAL, and expanding APPEND, to: (IMPLIES (AND (NOT (LISTP (CDR A))) (LISTP Y) (LISTP A)) (EQUAL (APPEND (CDR A) (APPEND Y ANS)) (APPEND (APPEND (CDR A) Y) ANS))). However this again simplifies, unfolding APPEND, to: (TRUE). Case 2. (IMPLIES (AND (EQUAL (APPEND (CDR A) (APPEND Y ANS)) (APPEND (APPEND (CDR A) Y) ANS)) (LISTP Y) (LISTP A)) (EQUAL (APPEND A (APPEND Y ANS)) (APPEND (APPEND A Y) ANS))). This simplifies, applying the lemmas CDR.CONS, CAR.CONS and CONS.EQUAL, and expanding the definition of APPEND, to: (TRUE). That finishes the proof of *1.1, which, consequently, finishes the proof of *1. Q.E.D. .ENDASIS .SS |Summary| The purpose of this {SECTIONORCHAPTER} was to provide an introduction to our function-based theory and to indicate how we prove theorems in the theory. As noted, all our proof techniques have been implemented in an automatic theorem-proving program. In fact, the last {SUBSECTIONORSECTION} was written, in its entirety, by our automatic theorem prover in response to three user commands supplying the definitions of FLATTEN and MC.FLATTEN and the statement of the theorem to be proved. This book is about such questions as how function definitions are analyzed by our theorem-proving system to establish their admissibility, how the system discovers that (LISTP (FLATTEN X)) is a theorem when presented with the definition of FLATTEN, why the system chooses the inductions it does, and why some functions are expanded and others are not. We describe our proof techniques in detail after presenting a precise statement of our formal theory. .SS |Notes| To illustrate several proof techniques, we instructed the theorem prover to conduct the FLATTEN.MC.FLATTEN proof in an environment in which it was aware only of the axioms of our basic theory and the definitions of APPEND, FLATTEN, and MC.FLATTEN. In the proof, the program derived a version of the associativity of APPEND (formula *1.1) and proved it with a second induction. Had the theorem prover previously proved the associativity of APPEND (and been instructed to remember it), the proof above would have been much shorter, as the lemma would have been involved in early simplifications. Later in the book, when we deal with more complicated examples such as the system's proof of the Fundamental Theorem of Arithmetic, we will show the system working primarily from previously proved theorems rather than axioms. The total amount of CPU time required to analyze the two definitions above and produce the proof is five-and-a-half seconds running compiled INTERLISP {REF TEITELMAN}, {REF VM} on a Digital Equipment Corporation KL-10. For an introduction to LISP see Allen's {REF JOHNALLEN}. The English commentary produced during the definition-time analysis of functions and during proofs is typical of the system's output. Examples will be found throughout the book. The steps in a proof are described in real-time, as the proof is developed, so that the user can follow the theorem prover's progress. To avoid boring the reader with repetitious phrasing, the system varies its sentence structure. .SEC |A PRECISE DEFINITION OF THE THEORY|, SECTHEFORMALTHEORY: In this {SECTIONORCHAPTER}, we precisely define the mathematical theory underlying our proof techniques, the theory in which our theorem prover proves theorems. We will present the axioms and principles of definition and induction to which the system appeals. We will not present the usual elementary rules of logic and equality, but instead we assume the reader is familiar with these rules. The parts of this chapter that are important are set off in boxes. The remainder of the text is motivation for the boxed material. In this motivational material we shall speak in a language of "naive" set theory. It is possible to embed our theory within a theory of sets and to derive our "principles" therein. However, we do not regard set theory or even quantification as being a part of our theory. The proofs our system produces depend only upon (a) the propositional calculus with variables and function symbols, (b) equality reasoning, (c) the rule of instantiation which permits us to infer that any instance of a theorem is a theorem, and (d) the boxed material that follows. .SS |Syntax| .STARRS We will use uppercase words, sometimes with embedded periods, hyphens, slashes, or digits, as variable symbols and function symbols. Examples are X, X1, PLUS, and PRIME.FACTORS. Associated with each function symbol is a nonnegative integer, the number of arguments the function symbol expects. The number of arguments associated with certain function symbols will become apparent as we proceed. A __term_ is a variable or a sequence consisting of a function symbol of n arguments followed by n terms. If the term t is not a variable and begins with the function symbol f, we say that t is a __call of_ f. We depart from the usual notation of F(X,Y) for function application and will instead write (F X Y). Examples of terms are thus:##X, (TRUE), and (P (ADD1 X) Y). The first term is just a variable, the second is the application of the 0-ary function symbol TRUE to no arguments (and hence denotes a constant), and the third is the application of the dyadic function symbol P to the term (ADD1 X) and the variable Y. To talk about terms, it is convenient to use so-called "metavariables" that are understood by the reader to stand for certain variables, function symbols, or terms. We will use only lower case words as metavariables, and we will make clear what type of syntactic object the symbol is to denote. For example, if f denotes the function symbol G, and t denotes the term (ADD1 Y), then (f t X) denotes the term (G (ADD1 Y) X). When we are speaking in naive set theory we use both upper and lower case words as variables ranging over numbers, sets, functions, etc. Context will make clear the range of these variables. We imagine that axioms, such as function definitions, are added as "time goes by." Whenever we add a new shell or function definition, we insist that certain function symbols not have been mentioned in any previous axiom. We call a function symbol __new_ until an axiom mentioning the function symbol has been added. If i is an integer, then by an abuse of notation we let Xi denote the variable whose first character is X and whose other characters are the decimal representation of i. Thus, if i is 4, Xi is the variable X4. A finite set s of ordered pairs is said to be a __substitution_ provided that for each ordered pair in s, v is a variable, t is a term and no other member of s has v as its first component. The __result of substituting_ a substitution s __into_ a term p (denoted p/s) is the term obtained by simultaneously replacing, for each in s, each occurrence of v as a variable in p with t. .STARRS .SS |The Theory of IF and EQUAL| .STARRS We find it necessary to reproduce the logic of truth functional propositions and equality at the term level. We assume the existence of two distinguished constants, (TRUE) and (FALSE). We use T and F as abbreviations for (TRUE) and (FALSE), respectively. We never use T or F as a variable. We axiomatize below the function EQUAL, of two arguments, to return T or F, depending on whether its two arguments are equal. We also axiomatize the function IF, of three arguments, to return its third argument if the first is F and otherwise return its second argument. .ASIS Axiom. T /$M-1= F Axiom. X = Y -> (EQUAL X Y) = T Axiom. X /$M-1= Y -> (EQUAL X Y) = F Axiom. X = F -> (IF X Y Z) = Z Axiom. X /$M-1= F -> (IF X Y Z) = Y. .ENDASIS The logical functions are defined with the following equations: .ASIS Definition. (NOT P) = (IF P F T) Definition. (AND P Q) = (IF P (IF Q T F) F) Definition. (OR P Q) = (IF P T (IF Q T F)) Definition. (IMPLIES P Q) = (IF P (IF Q T F) T). .ENDASIS We adopt the notational convention of writing (AND a b c) for (AND a (AND b c)), (AND a b c d) for (AND a (AND b (AND c d))), and so on. We make the same convention for OR. We also adopt the notational convention of sometimes writing a term where a formula is expected (e.g., we may refer to the "theorem" p, where p is a term). When we write a term p where a formula is expected, it is an abbreviation for the formula p/$M-1=F. If a term p is a theorem, then by the rule of instantiation, the result of substituting any substitution into p is a theorem. .STARRS .SS |Well-founded Relations| In the following {SUBSECTIONORSECTION}s we state a principle of induction, introduce inductively constructed objects such as the natural numbers and ordered pairs, and state a principle of definition for recursive functions. All of these extensions hinge on the idea of a "well-founded relation." A function r of two arguments is said to be a __well-founded relation_ if there is no infinite sequence x[&1], x[&2], x[&3], ... with the property that (r x[&i+1] x[&i])/$m-1=F for all integers i greater than 0. For example, suppose that (L x y) is T if x and y are nonnegative integers and x is less than y, and that (L x y) is F otherwise. Then L is a well-founded relation because there is no infinite sequence of nonnegative integers with the property that each successive integer is less than the previous one. That is, there is no infinite sequence x[&1], x[&2], x[&3], x[&4], ... such that .ASIS ... x[&4] < x[&3] < x[&2] < x[&1]. .ENDASIS On the other hand, suppose that (LE x y) is T if x and y are nonnegative integers and x <$M-1_ y. LE is not a well-founded relation because the infinite sequence 1, 1, 1, ... has the property that: .ASIS ... 1 <$M-1_ 1 <$M-1_ 1 <$M-1_ 1. .ENDASIS If r is a well-founded relation and (r x y) holds, we say that x is r__-smaller than_ y. For the purposes of our theory, functions are known to be well-founded only by assumption in one of the following three ways: .LIST(8,0,0) .ITEM Whenever we axiomatize an inductively generated type of object, e.g., the integers, we explicitly assume a certain new function to be a well-founded relation. Such an assumption is inherent in any axiomatization of an inductively generated type. See {YONSS SSNEWDATATYPES}. .ITEM We assume explicitly that the function LESSP is a well-founded relation in {YONSS SSLESSP}. We present there an informal proof that LESSP is well-founded. .ITEM Whenever we have two previously assumed well-founded relations, we assume that the lexicographic relation induced by them is well-founded. In {YONSS SSLEX} we define "induced" and present an informal proof of the well-foundedness of induced lexicographic relations. .ENDLIST The fact that a function has been assumed to be a well-founded relation is used only in our principles of induction and definition and in the formation of induced lexicographic relations. It is possible to define formally in a theory of sets (for example, see Morse {REF MORSE} or Kelley {REF KELLEY}) the concept of well-founded relation, to prove that certain relations are well-founded, and to derive as metatheorems our principles of induction and definition. However, such a development is not within the scope of this work. We say that x is an r-__minimal element of_ S provided x is a member of S and no member of S is r-smaller than x. Later in this {SECTIONORCHAPTER} we use the fact that if r is well-founded, then for each nonempty set S, there exists an r-minimal element of S. Proof. Suppose that r is well-founded and S is a nonempty set with no r-minimal element. Let f be a choice function on the power set of S. That is, suppose that for each nonempty subset s of S that (f s) is a member of s. Define the sequence x[&1], x[&2], x[&3], ... by letting x[&1] be (f S) and by letting x[&i+1] be (f s[&i]), where s[&i] is the set of all z in S r-smaller than x[&i]. For each i, s[&i+1] is nonempty (otherwise, x[&i] would be r-minimal). And for each i, x[&i+1] is r-smaller than x[&i]. Q.E.D. .SS |Induction| A rough sketch of our principle of induction is: .CROWN(8,0,0) Suppose that r denotes a well-founded relation, x is a variable, d is a function symbol, q is a term and (IMPLIES q (r (d x) x)) is a theorem. Then, to prove p it is sufficient to prove the following two things: .CROWN(8,0,0) (base case): (IMPLIES (NOT q) p), and (induction step): (IMPLIES (AND q p') p), where p' is the result of substituting (d x) for x in p. .ENDCROWN .ENDCROWN This is a version of the Generalized Principle of Induction or Noetherian Induction (see Bourbaki {REF BOURBAKI} and Burstall {REF BURSTALL}). The induction principle we actually use generalizes the principle sketched above in three ways:## .CROWN(8,0,0) Instead of limiting the principle to one variable that is getting r-smaller, we use an n-tuple x[&1], ..., x[&n] of variables and some function m such that (m x[&1] ... x[&n]) is getting r-smaller. The function m is called a "measure function." Instead of case splitting on q, we consider k+1 cases, of which one is a base case and the remaining k are induction steps. We permit each of the k induction steps to have several induction hypotheses. .ENDCROWN .STARRS .BEGIN .CENTER The Induction Principle .END .ASIS Suppose: (a) p is a term; (b) r is a function symbol that denotes a well-founded relation; (c) m is a function symbol of n arguments; (d) x[&1], ..., x[&n] are distinct variables; (e) q[&1], ..., q[&k] are terms; (f) h[&1], ..., h[&k] are positive integers; and (g) for 1<$M-1_i<$M-1_k and 1<$M-1_j<$M-1_h[&i], s[&i,j] is a substitution and it is a theorem that: (IMPLIES q[&i] (r (m x[&1] ... x[&n])/s[&i,j] (m x[&1] ... x[&n]))). Then p is a theorem if (IMPLIES (AND (NOT q[&1]) ... (NOT q[&k])) p) is a theorem and for each 1<$M-1_i<$M-1_k, (IMPLIES (AND q[&i] p/s[&i,1] ... p/s[&i,h{SUB "i"}]) p) is a theorem. .ENDASIS .STARRS Note in particular that we have to prove k+1 things (the "base case" and k "induction steps"). Each induction step distinguishes a given case with one of the q[&i]'s and provides h[&i] inductive instances of the conjecture being proved. We now illustrate an application of the induction principle. Imagine that LESSP is well-founded, that the axioms CAR.LESSP and CDR.LESSP have been added, and that FLATTEN and MC.FLATTEN have been introduced as in {YONSEC SECOVERVIEW}. The first induction performed in the proof of the FLATTEN.MC.FLATTEN theorem of {YONSEC SECOVERVIEW} is obtained by the following instance of our induction principle. p is the term (EQUAL (MC.FLATTEN X ANS) (APPEND (FLATTEN X) ANS)); r is LESSP; m is COUNT; n is 1; x[&1] is X; k is 1; q[&1] is the term (LISTP X); h[&1] is 2; s[&1,1] is $a{,}; s[&1,2] is $a{,}; the axioms CAR.LESSP and CDR.LESSP establish the two theorems required by condition (g). The base case and the induction steps produced by this application of the induction principle are those exhibited in {YONSEC SECOVERVIEW}. We now prove that our induction principle is sound. Suppose we have in mind particular p, r, m, x[&i], q[&i], h[&i], and s[&i,j] satisfying conditions (a) through (g) above, and suppose the base case and induction steps are theorems. Below is a set theoretic proof of p. Proof. Without loss of generality we assume that the x[&i] are X1, X2, ..., Xn; that r is R; that m is M; that Xn+1, Xn+2, ..., Xz are all of the variables other than X1, X2, ..., Xn in p, the q[&i], and either component of any pair in any s[&i,j]; that p is (P X1 ... Xz); that q[&i] is (Qi X1 ... Xz); and that s[&i,j] replaces X[&v], 1<$m-1_v<$m-1_z, with some term d[&i,j,v]. Let RM be the dyadic function on z-tuples defined by (RM ) = (R (M U1 ... Uz) (M V1 ... Vz)). Note that RM is well-founded. If p is not a theorem there must exist a z-tuple such that (P X1 ... Xz)=F. Let be an RM-minimal such z-tuple. We now consider the cases on which, if any, of the q[&i] are true on the chosen z-tuple. Case 1: Suppose no q[&i] is true, i.e., suppose (Q1 X1 ... Xz)=F, (Q2 X1 ... Xz)=F, ..., and (Qk X1 ... Xz)=F. Then by the base case (P X1 ... Xz)/$m-1=F, contradicting the assumption that (P X1 ... Xz)=F. Case 2: Suppose that at least one of the q[&i] is true. Without loss of generality we can assume that (Q1 X1 ... Xz)/$m-1=F. By condition (g) above we have: .ASIS (R (M d[&1,1,1] ... d[&1,1,n]) (M X1 ... Xn)), (R (M d[&1,2,1] ... d[&1,2,n]) (M X1 ... Xn)), ..., and (R (M d[&1,h{SUB "1"},1] ... d[&1,h{SUB "1"},n]) (M X1 ... Xn)). .ENDASIS Thus, by the definition of RM, we have: .ASIS (RM ), (RM ), ..., and (RM ). .ENDASIS Since is an RM-minimal z-tuple such that (P X1 ... Xz)=F, we have: .ASIS (P d[&1,1,1] ... d[&1,1,z])/$m-1=F, (P d[&1,2,1] ... d[&1,2,z])/$m-1=F, ..., and (P d[&1,h{SUB "1"},1] ... d[&1,h{SUB "1"},z])/$m-1=F. .ENDASIS Hence, by the first induction step, we derive (P X1 ... Xz)/$m-1=F, contradicting the assumption that (P X1 ... Xz)=F. Q.E.D. .SS |Shells|, SSNEWDATATYPES: Thus far the theory is somewhat impoverished in that it does not have any "interesting" objects. It would be convenient, for example, if we could refer to the natural numbers 0, 1, 2, ... and ordered pairs from within our theory (as we have several times in discussions of our theory). We could invent appropriate axioms for each individual "type" of object. However, we want to ensure that no natural number is T, F, or an ordered pair. In addition, we want to specify how the primitive functions behave on "unexpected" arguments (e.g., what does the successor function return when given T as an argument?).{FNOTE |One way to make sure that T is not a number or to . escape from asking what is the successor of T is . to employ a typed syntax. Indeed, Aubin {REF AUBIN} and Cartwright {REF CARTWRIGHT} . have implemented theorem provers for recursive functions that use typed syntax. . However, we have grown so accustomed to the untyped . syntax of predicate calculus, set theory, LISP, MACRO-10, and POP-2 that . we simply do not like typed syntax.|} Because of considerations such as these, we address the general problem of extending the theory by adding a new "type" of object. Among the most common objects in the applications of our theory are what we will call "shells."{FNOTE |Our shells are inspired by the "structures" used by Burstall {REF BURSTALL}. . Recently, Oppen {REF OPPEN} has established the decidability of a . theory similar to our shell theory.|} A shell can be thought of as a colored n-tuple with restrictions on the colors of objects that can occupy its components. For example, the natural numbers can be thought of as shells:##a number is either a blue 0 or a blue 1-tuple containing another blue object (namely, the predecessor of the tuple). Ordered pairs can be red 2-tuples containing arbitrary objects. The type consisting of lists of numbers can be either the green, empty list of numbers or else green 2-tuples, , such that x is a number (blue) and y is a list of numbers (green). The fact that ordered pairs and lists of numbers have different colors prevents an ordered pair consisting of a number and a list of numbers from being confused with a list of numbers. Because it is useful to be able to extend the theory by adding the axioms defining a new shell class and because the required set of axioms can be represented schematically, we will adopt a notational shorthand for adding new shells. We now specify informally the properties of a shell. The basic function for a shell class is one that "constructs" n-tuples of the appropriate color (e.g., the successor function or the pairing function). It is convenient if each shell class can (optionally) have one object that is in the class but is not such an n-tuple (e.g., 0 and the empty list of numbers). Because we will have many kinds of shells, we will need a function, called the "recognizer" of the class that returns T on objects of the class and F otherwise. We also require n "accessor" (or "destructor") functions associated with the class, that, when given an n-tuple of the right color, return the corresponding components of the n-tuple (e.g., the predecessor function is the "accessor" corresponding to the shell for numbers). Finally, we posit that any object in the class can be generated with a finite number of "constructions" starting with the bottom object and objects not in the class. This is arranged by assuming a certain function to be a well-founded relation (e.g., one under which the predecessor of a nonzero number is smaller than the number itself). Because we wish to have restrictions on the types of objects that can be components of a shell n-tuple, we must adopt some convention for specifying the restriction. We also adopt conventions for specifying what a constructor returns when one of its arguments fails to meet the required restriction and what an accessor returns when given an object of the wrong type as an argument. We require that there be associated with each argument position of each shell constructor a "type restriction" that limits the colors of objects that may occupy that component. The restriction is expressed in one of two ways:##(a) that an object must have a color that is a member of an explicitly given set of previously (or currently being) axiomatized colors, or (b) that an object may __not_ have a color in such a set. The type restriction is written as a propositional term (see condition (b) below). We also require that each argument have a "default value" that is permitted by the type restriction to occupy the corresponding component of the shell tuple. When one of the arguments to a constructor does not satisfy the corresponding type restriction, the default value for that argument position is used in its place. Finally, we arrange that the accessor for a position return the corresponding default value when given either the bottom object or an object of the wrong color. .STARRS .BEGIN .CENTER The Shell Principle .END .ASIS To __add the shell_ const __of_ n __arguments_ __with_ (optionally, __bottom object_ (btm),) __recognizer_ r, __accessors_ ac[&1], ..., ac[&n], __type restrictions_ tr[&1], ..., tr[&n], __default values_ dv[&1], ..., dv[&n], __and_ __well-founded relation_ wfn, where: (a) const is a new function symbol of n arguments, (btm is a new function symbol of no arguments, if a bottom object is supplied), r, ac[&1], ..., ac[&n] are new function symbols of one argument, wfn is a new function symbol of two arguments, and all the above function symbols are distinct; (b) each tr[&i] is a term that mentions no symbol as variable besides Xi and mentions no symbol as a function symbol besides IF, TRUE, FALSE, previously introduced shell recognizers, and r; and (c) if no bottom object is supplied, the dv[&i] are bottom objects of previously introduced shells, and for each i, (IMPLIES (EQUAL Xi dv[&i]) tr[&i]) is a theorem; if a bottom object is supplied, each dv[&i] is either (btm) or a bottom object of some previously introduced shell, and for each i, (IMPLIES (AND (EQUAL Xi dv[&i]) (r (btm))) tr[&i]) is a theorem, means to extend the theory by doing the following (using T for (r (btm)) and F for all terms of the form (EQUAL x (btm)) if no bottom object is supplied): (1) assume the following axioms: (OR (EQUAL (r X) T) (EQUAL (r X) F)), (r (const X1 ... Xn)), (r (btm)), (NOT (EQUAL (const X1 ... Xn) (btm))), and (IMPLIES (AND (r X) (NOT (EQUAL X (btm)))) (EQUAL (const (ac[&1] X) ... (ac[&n] X)) X)); (2) for each i from 1 to n, assume the following axiom: (IMPLIES tr[&i] (EQUAL (ac[&i] (const X1 ... Xn)) Xi)); (3) for each i from 1 to n, assume the following axiom: (IMPLIES (OR (NOT (r X)) (EQUAL X (btm)) (AND (NOT tr[&i]) (EQUAL X (const X1 ... Xn)))) (EQUAL (ac[&i] X) dv[&i])); (4) assume the axioms: (NOT (r T)) and (NOT (r F)); (5) for each recognizer, r', of a shell class previously added to the theory, assume the following axiom: (IMPLIES (r X) (NOT (r' X))); (6) assume the axiom: (wfn X Y) = (OR t (AND (r Y) (NOT (EQUAL Y (btm))) (OR (EQUAL X (ac[&1] Y)) ... (EQUAL X (ac[&n] Y))))), where t is the term (FALSE) if no shell has been added previously, and otherwise is (wfn' X Y) where wfn' is the well-founded relation name for the last shell previously added; and (7) assume wfn denotes a well-founded relation. .ENDASIS If the tr[&i] are not specified, they should each be assumed to be T. .STARRS The n axioms in (3) specify what the values of the accessors are when given "unexpected" arguments. It is possible to prove the consistency of the theory resulting from the addition of a finite number of shells by exhibiting a model. A suitable model may be constructed by representing a (nonbottom) member of a shell class having n accessors as an n+1-tuple whose first component is an integer encoding the "color." Note that merely because we add a finite number of shells we are not assured that every object in the world is in one of our shell classes. That is, we do not have an axiom that says: for any x, x is either T, or x is F, or x satisfies one of the shell recognizers. Indeed, this is an intended feature of the shell principle; we desire that any extension produced by adding shells can be further extended by additional shells without giving rise to inconsistency. From the point of view of modeling programming language constructs, shells are valuable. They can play a role in the semantics of the usual kinds of "records" since records are often n-tuples with type restrictions on their components. Shells can also be used to model other kinds of common programming objects, such as the integers, atomic objects (words, capabilities, characters, file names), push down stacks, and character strings. In the next three {SUBSECTIONORSECTION}s, we will use shells to add the natural numbers, literal atoms, and ordered pairs to our theory. .SS |Natural Numbers| We now axiomatize the natural numbers, 0, 1, 2, etc. using the shell principle: .STARRS .ASIS Shell Definition. Add the shell ADD1 of one argument with bottom object (ZERO), recognizer NUMBERP, accessor SUB1, type restriction (NUMBERP X1), default value (ZERO), and well-founded relation SUB1P. .ENDASIS .STARRS This axiomatizes a new type of object we will call the "numbers." The numbers consist of the new object (ZERO) and all of the objects returned by the new function ADD1. We now informally repeat the axioms added by the shell principle. The numbers in parentheses indicate the corresponding clause of the definition of the shell addition principle. (1) The function NUMBERP (which recognizes numbers), always returns either T or F, and returns T on (ADD1 X) and (ZERO). (ZERO) is never returned by ADD1. If X is a non-(ZERO) number, then (ADD1 (SUB1 X)) is X. (2) If X is a number, then (SUB1 (ADD1 X)) is X. (3) SUB1 returns (ZERO) if applied to a nonnumber, (ZERO), or (ADD1 X1) when X1 is a nonnumber. (4) T and F are nonnumeric. (5) Because no other shells have been added, clause (5) does not contribute any axioms. (6) We define the function SUB1P so that (SUB1P X Y) is T if Y is a number other than (ZERO) and X is (SUB1 Y), and (SUB1P X Y) is F otherwise. (7) Finally, we assume that SUB1P is a well-founded relation. Note the fundamental nature of the assumption that SUB1P is a well-founded relation:# by virtue of this assumption (and our principle of induction), (P X) can be proved, for all X, by proving (P X) when X is not a number, proving (P (ZERO)), and proving that if X is a number other than (ZERO), then (P (SUB1 X)) implies (P X). Among the theorems that can be derived from the above axioms is the theorem that if X and Y are numeric, then (ADD1 X) is equal to (ADD1 Y) if and only if X is equal to Y. See {APP APPSHELL}. We will abbreviate (ZERO) as 0, and any well-formed nest of ADD1's around a 0 as the decimal numeral expressing the number of ADD1 terms in the nest. Thus, 1 is an abbreviation for (ADD1 0), 2 is an abbreviation for (ADD1 (ADD1 0)), etc. .SS |Literal Atoms| We want to be able to prove theorems about functions that manipulate symbols. For example, in {YONSEC SECCOMPILER} we prove the correctness of a function that translates from symbolic arithmetic expressions to sequences of instructions for a hand-held calculator. We write symbols as sequences of characters delimited by quotation marks (e.g., "X" and "ABC"). We could adopt the convention that "X", for example, was an abbreviation for 24. Such a convention is part of the logician's standard method for representing terms, known as Goedelization. However, we want to arrange for the symbols to be different from the integers. To this end we add a new shell class called the "literal atoms," and we adopt a syntactic convention that translates from quoted character sequences to literal atoms. The shell class is recognized by the new Boolean function LITATOM. The new function PACK, of one argument, takes an arbitrary object and returns a literal atom. (The name PACK derives from the INTERLISP operation for constructing literal atoms by concatenating sequences of characters.) (PACK x) is the same literal atom as (PACK y) if and only if x is the same object as y. The new function UNPACK, given a literal atom, returns the object used to construct it. .STARRS .ASIS Shell Definition. Add the shell PACK of one argument with bottom object (NIL), recognizer LITATOM, accessor UNPACK, default value 0, and well-founded relation UNPACKP. .ENDASIS .STARRS Note that since ADD1 was the last shell added and the well-founded relation for it was SUB1P, the new well-founded relation UNPACKP holds between X and Y either if SUB1P holds between X and Y or if X is the result of UNPACKing (the non-(NIL) literal atom) Y. We now adopt a convention for abbreviating literal atoms as symbols. We suppose an enumeration s[&1], s[&2], s[&3], ... of all symbols (character sequences) except "NIL". When we write "NIL" in a term position, it is an abbreviation for (NIL). When we write s[&i] delimited by quotation marks in a term position, it is an abbreviation for (PACK i). .SS |Ordered Pairs| We axiomatize ordered pairs as follows: .STARRS .ASIS Shell Definition. Add the shell CONS of two arguments with recognizer LISTP, accessors CAR and CDR, default values "NIL" and "NIL", and well-founded relation CAR.CDRP. .ENDASIS .STARRS We sometimes think of ordered pairs as sequences, binary trees, or terms. For example .ASIS (CONS 1 (CONS 2 (CONS 3 "NIL"))) .ENDASIS may be thought of as the sequence 1, 2, 3. (CONS (CONS 1 2) 3) may be thought of as the binary tree: .ASIS * / \ * 3 / \ 1 2 .ENDASIS Finally, .ASIS (CONS "PLUS" (CONS "X" (CONS 3 "NIL"))). .ENDASIS may be thought of as the term (PLUS X 3). .STARRS Because nests of CARs and CDRs are frequently used, we provide a definition for each function symbol beginning with the letter C, ending with the letter R, and containing only A's and D's in between. The body of the definition is just the appropriate nest of C__A_Rs and C__D_Rs. For example, .ASIS Definition. (CADDR X) = (CAR (CDR (CDR X))). .ENDASIS .STARRS .SS |Definitions| We have already defined certain simple functions, such as AND, OR, NOT, and IMPLIES. For example, .ASIS Definition. (AND P Q) = (IF P (IF Q T F) F). .ENDASIS Another simple function we define is ZEROP; it returns T if its argument is virtually 0 (in the sense that ADD1 and SUB1 treat it as 0) and F otherwise: .ASIS .STARRS Definition. (ZEROP X) = (OR (EQUAL X 0) (NOT (NUMBERP X))) .STARRS .ENDASIS In general, if the function symbol f is new, if x[&1], ..., x[&n] are distinct variables, if the term body mentions no symbol as a variable other than these x[&i], and if body does not mention f as a function symbol, then adding the axiom: .ASIS (f x[&1] ... x[&n]) = body .ENDASIS is a proper way to define a new function. Indeed, any use of the symbol f as a function symbol in a term, such as (f t[&1] ... t[&n]), can be completely eliminated by replacing the term with the result of substituting $a{, ..., } into body. However, one apparently mild generalization of the above scheme results in our being able to define functions that are considerably more interesting. This generalization allows the use of f as a function symbol in the body of its own definition. For example, to define a function that returns the integer sum of its two arguments we could write: .ASIS Definition. (SUM X Y) = (IF (ZEROP X) Y (ADD1 (SUM (SUB1 X) Y))). .ENDASIS Unlike our previous definitions, the body of SUM mentions the function symbol being defined. That is, SUM is defined recursively. Nevertheless, SUM is well-defined. For example, consider (SUM 3 4). .ASIS (SUM 3 4) = (ADD1 (SUM 2 4)) = (ADD1 (ADD1 (SUM 1 4))) = (ADD1 (ADD1 (ADD1 (SUM 0 4)))) = (ADD1 (ADD1 (ADD1 4))) = 7. .ENDASIS If we were to allow the new function symbol to occur arbitrarily in the right-hand side, we could define all "general recursive" functions. However, we could also fall into inconsistency. For example, were we to add the axiom: .ASIS (RUSSELL X) = (IF (RUSSELL X) F T), .ENDASIS our theory would be inconsistent. To sketch a proof:##(RUSSELL X) must be equal to F or not equal to F. If the former, then by the definition of RUSSELL it follows that (RUSSELL X) = T, a contradiction. If the latter, it follows that (RUSSELL X) = F, another contradiction. Q.E.D. We now present a principle of definition that allows the introduction of recursive functions. The principle will not allow us to introduce all "general recursive" functions or even all "recursive" functions.{FNOTE |See Peter {REF PETER} for a thorough treatment . of general recursive functions.|} However, it will permit the definition of almost all the functions in which we are currently interested. And we shall prove that every application of the principle is sound (unlike the axiomatization of RUSSELL above). .STARRS .BEGIN .CENTER The Definition Principle .END In stating our principle of definition below, we say that a term is f-__free_ if the symbol f does not occur in the term as a function symbol. We say that a term t __governs_ an occurrence of a term s in a term b either if b contains a subterm of the form (IF t p q) and the occurrence of s is in p, or if b contains a subterm of the form (IF t' p q), where t is (NOT t') and the occurrence of s is in q. Thus, P and (NOT Q) govern the first occurrence of S in: .ASIS (IF P (IF (IF Q F S) S R) T). .ENDASIS Note that P and (IF Q F S) govern the second occurrence of S. Our principle of definition is: .ASIS __To define_ f __of_ x[&1], ..., x[&n] __to be_ body (usually written "Definition. (f x[&1] ... x[&n]) = body"), where: (a) f is a new function symbol of n arguments; (b) x[&1], ..., x[&n] are distinct variables; (c) body is a term and mentions no symbol as a variable other than x[&1], ..., x[&n]; and (d) there is a well-founded relation denoted by a function symbol r and a function symbol m of n arguments, such that for each occurrence of a subterm of the form (f y[&1] ... y[&n]) in body and the f-free terms t[&1], ..., t[&k] governing it, it is a theorem that: (IMPLIES (AND t[&1] ... t[&k]) (r (m y[&1] ... y[&n]) (m x[&1] ... x[&n]))), means to add as an axiom the defining equation: (f x[&1] ... x[&n]) = body. .ENDASIS .STARRS We now illustrate an application of the principle of definition. Imagine that LESSP is well-founded and that the axioms CAR.LESSP and CDR.LESSP have been added as in {YONSEC SECOVERVIEW}. The defining equation for MC.FLATTEN is added to our theory by the following instantiation of our principle of definition. f is the function symbol MC.FLATTEN; n is 2; x[&1] is X and x[&2] is ANS; body is the term: .ASIS (IF (LISTP X) (MC.FLATTEN (CAR X) (MC.FLATTEN (CDR X) ANS)) (CONS X ANS)); .ENDASIS r is LESSP; m is the function symbol COUNT1, where (COUNT1 X Y) is defined to be (COUNT X). The two theorems required by (d) are: .ASIS (IMPLIES (LISTP X) (LESSP (COUNT1 (CDR X) ANS) (COUNT1 X ANS))), and (IMPLIES (LISTP X) (LESSP (COUNT1 (CAR X) (MC.FLATTEN (CDR X) ANS)) (COUNT1 X ANS))). .ENDASIS Both theorems are easily proved from CAR.LESSP, CDR.LESSP and the definition of COUNT1. Note that the second theorem is proved before any axiom about MC.FLATTEN has been posited, even though MC.FLATTEN is used as a function symbol in the theorem. If we have defined (f x[&1] ... x[&n]) to be body, then we say that body is the __body_ of f and that x[&i] is the i^^th^ __formal parameter_ of f. If body mentions f as a function symbol, we say that f is __recursive_ and otherwise we say that f is __nonrecursive_.{FNOTE |This is potentially confusing, since in both cases the . function is (general) recursive in the usual mathematical sense. No . confusion should arise from our convention -- which is derived from everyday . usage in computer programming -- since we will nowhere discuss in this . book functions that are not general recursive in the mathematical . sense. Our definition principle does not permit mutually recursive . definitions. If f were defined in terms of a new function g, . then after the definition, g would no longer be new, and . hence g could not be defined.|} If f has not been defined but has been mentioned as a function symbol of n arguments in an axiom, we say that Xi is the i^^th^ __formal parameter_ of f, for i from 1 to n. We now offer a justification for admitting recursive definitions. This justification will relieve the fears raised by RUSSELL. Roughly speaking, we shall prove that for each correct application of the definition principle, we can prove that there exists a unique function f that satisfies the defining equation (f x[&1] ... x[&n]) = body. We shall construct the desired function using a standard set-theoretic method of partial functions that "approximate" the desired function. Unfortunately, different set theories supply different answers to the question: what is the value of applying a function to an object not in the function's domain. Instead of adopting a particular theory of sets, we shall instead make sure in the following proof not to apply any function to an object not in the function's domain. Furthermore, we shall assume the existence of some universal set D to be the domain of discourse for all the functions that we axiomatize in our theory. To be precise, we assume that: .CROWN(8,8,0) There exists a set D such that each function symbol f mentioned as a function symbol in any axiom denotes a function whose domain is D^n and whose range is a subset of D, where n is the number of arguments of f. .ENDCROWN If G is a function whose domain is a subset of D^n, for some n, and whose range is a subset of D, then the __extension_ of G is the function on D^n to D that is defined to be (G X1 ... Xn) if is a member of the domain of g and is defined to be (TRUE) otherwise. Suppose that we have in mind some specific f, x[&1], ..., x[&n], body, r, and m and suppose they satisfy the conditions (a) through (d) of the definition principle. Before we add the defining axiom (f x[&1] ... x[&n]) = body, we wish to prove that there exists a unique function f defined on D^n to D satisfying the equation. Without loss of generality, suppose that f, x[&1], ..., x[&n], r, and m are the symbols FN, X1, ..., Xn, R, and M respectively. Let us adopt the notational convention that b[s] is an abbreviation for the term obtained by replacing every occurrence of FN as a function symbol in the term b with the symbol s. (For example, if term is (ADD1 (FN X1)), then term[G] is an abbreviation for (ADD1 (G X1)).) Let RM be the well-founded relation defined on n-tuples by (RM ) = (R (M U1 ... Un) (M V1 ... Vn)). Let us say that a subset S of D^n is RM-__closed_ if and only if every member of D^n RM-smaller than a member of S is itself a member of S. Let us say that a function H is __partially correct_ if (a) its domain is an RM-closed subset of D^n, (b) its range is a subset of D, and (c) if H' is the extension of H, then for each in the domain of H, (H X1 ... Xn) = body[H']. We now prove a lemma that is used frequently below. Its proof is quite tedious. .ASIS Lemma. Suppose: F1 is a function whose domain is a subset of of D^n and whose range is a subset of D, F1' is the extension of F1, G1 is partially correct, G1' is the extension of G1, is in D^n, and F1 and G1 are defined and agree upon every member of D^n that is RM-smaller than . Then body[F1'] = body[G1']. .ENDASIS Proof. Let subterm[&1], ..., subterm[&k] be an enumeration of the occurrences of the subterms of body, and suppose that if the term subterm[&i] is a proper subterm of the term subterm[&j] then i is RM-smaller than and so is . But F1 and G1 are defined and agree on members of D^n RM-smaller than . Thus, *1 follows from the hypothesis that tests[&i] /$m-1= F and the *a. Q.E.D. We now turn to the construction of the unique function satisfying the defining equation (FN X1 ... Xn) = body. Let F0 be the union of all partially correct functions. We will prove that F0 is the desired function by demonstrating that (a) F0 is a function, (b) F0 is partially correct, and (c) the domain of F0 is D^n. The uniqueness of F0 follows from (a), (b), (c), and the fact that any other function defined on D^n to D satisfying the defining equation is partially correct and hence a subset of F0. Proof that F0 is a function. If F0 is not a function it is multiply defined somewhere. Let be an RM-minimal member of D^n such that for some two distinct values, Z1 and Z2 say, both <,Z1> and <,Z2> are members of F0. Let F1 and G1 be partially correct functions that contributed <,Z1> and <,Z2> to F0. Let F1' and G1' be the extensions of F1 and G1. Both F1 and G1 are defined upon all members of D^n RM-smaller than because both are partially correct. F1 and G1 have the same values on all members of D^n RM-smaller than because is RM-minimal. Therefore, by the lemma, body[F1'] = body[G1']. But Z1 = (F1 X1 ... Xn) = body[F1'] = body[G1'] = (G1 X1 ... Xn) = Z2 because both F1 and G1 are partially correct. Q.E.D. Proof that F0 is partially correct. The domain of F0 is an RM-closed subset of D^n because it is the union of RM-closed subsets of D^n. The range of F0 is a subset of D. Let F0' be the extension of F0. Let be any member of the domain of F0 such that (F0 X1 ... Xn) /$m-1= body[F0']. Let G be a partially correct function with in its domain such that (G X1 ... Xn) = (F0 X1 ... Xn). Let G' be the extension of G. By applying the lemma to F0, F0', G, G', and we infer that body[F0'] = body[G']. But (F0 X1 ... Xn) = (G X1 ... Xn) = body[G'] = body[F0']. Q.E.D. Before proving that the domain of F0 is D^n we adopt the notational convention that body' is an abbreviation for the result of substituting Yi for Xi in body[F0']. For example, if body is the term (IF (LISTP X1) (FN X1 X2) F), then body' is (IF (LISTP Y1) (F0' Y1 Y2) F). Proof that the domain of F0 is D^n. Suppose that F0 is not defined on every element of D^n. Let be an RM-minimal element of D^n not in the domain of F0. Let F0' be the extension of F0. Let G be the function that results from adding to F0 the ordered pair <,body'>. If we can show that G is partially correct a contradiction will arise because then G would be a subset of F0 by the definition of F0. The domain of G is an RM-closed subset of D^n because it was formed by adding to an RM-closed subset of D^n an RM-minimal element of D^n not in that subset. Let G' be the extension of G. We need to show that for every n-tuple in the domain of G that (G X1 ... Xn) = body[G']. For every in the domain of G, we may apply the lemma for G, G', F0, F0', and to infer that body[G'] = body[F0']. If = , then (G X1 ... Xn) = body[F0'] = body[G']. If /$m-1= then (G X1 ... Xn) = (F0 X1 ... Xn) = body[F0'] = body[G']. Q.E.D. That concludes the proof that the definition principle is sound. No constructivist would be pleased by the foregoing justification of recursive definition because of its freewheeling, set-theoretic character. The truth is that induction and inductive definition are more basic than the truths of high-powered set theory, and it is slightly odd to justify a fundamental concept such as inductive definition with set theory. We have presented this proof only to provide the careful reader with some clear talk about our definition principle. The only other kind of discussion we might have presented would have consisted of examples and the truly hand-waving phrase "and so on." One of our teachers, Paul Lorenzen, once proclaimed that the correct way to introduce induction to a student in an ideal society was simply to draw strokes: |, ||, |||, ||||, and so on until the student "caught on." .SS |Lexicographic Relations|, SSLEX: Our theory requires one more concept:##the idea of lexicographic relations. .STARRS To define l to be the __lexicographic relation induced by_ r and s, where: .ASIS (a) l is a new function symbol of 2 arguments, (b) r and s are function symbols of 2 arguments, and (c) neither r nor s is l, means to add as an axiom the following defining equation: (l P1 P2) = (OR (r (CAR P1) (CAR P2)) (AND (EQUAL (CAR P1) (CAR P2)) (s (CDR P1) (CDR P2)))). .ENDASIS .STARRS That is, l orders pairs of objects by first comparing their CAR components using r, but using s on their CDR components if the test with r fails and the CARs are equal. (The name "lexicographic" is inspired by the alphabetic order used by lexicographers.) .STARRS If r and s denote well-founded relations and l is defined to be the lexicographic relation induced by r and s, then l denotes a well-founded relation. .STARRS Proof. Suppose that x[&1], x[&2], ... were an infinite sequence and that for all positive i, (l x[&i+1] x[&i]) /$m-1= F. By the definition of l, (r (CAR x[&i+1]) (CAR x[&i])) /$m-1= F or (CAR x[&i+1]) = (CAR x[&i]). But since r is well-founded, the sequence (CAR x[&1]), (CAR x[&2]), ... cannot be infinitely descending in r. Hence, for some j, for all positive p, (CAR x[&j]) = (CAR x[&j+p]). But the sequence (CDR x[&j]), (CDR x[&j+1]), ... must then be infinitely descending in s, a contradiction. Q.E.D. .SS |LESSP and COUNT|, SSLESSP: We have now finished defining the formal theory that we use as the logical basis of our theorem-proving system. We now use the theory to define two functions, LESSP and COUNT, that play central roles in our proofs (but have no role in the formal definition of the theory). LESSP and COUNT are the well-founded relation and measure function we use most often in applying our principles of induction and definition. (LESSP X Y) returns T if X is less than Y and F otherwise. LESSP treats nonnumeric arguments as 0. LESSP determines whether X is less than Y by counting them both down to 0, seeing which gets there first. .STARRS .ASIS Definition. (LESSP X Y) = (IF (ZEROP Y) F (IF (ZEROP X) T (LESSP (SUB1 X) (SUB1 Y)))). .ENDASIS .STARRS Since (IMPLIES (NOT (ZEROP X)) (SUB1P (SUB1 X) X)) is a theorem, the first argument of LESSP gets SUB1P smaller in the recursive call. (In fact, so does the second argument.) Thus, LESSP is admitted under the principle of definition. We claim that LESSP is a well-founded relation. That is, we claim there is no infinite sequence x[&1], x[&2], ... such that x[&i+1] is LESSP-smaller than x[&i]. It is easy to see how to prove that LESSP is well-founded in a suitable theory of sets, since SUB1P is well-founded, and x is LESSP-smaller than y if and only if a finite number of SUB1s will reduce y to x (when both are numbers). We cannot state or prove the well-foundedness of LESSP within our theory. .STARRS We assume LESSP to be a well-founded relation. .STARRS By virtue of this assumption, it is permitted to make induction arguments in which some measure gets LESSP-smaller in the induction hypotheses. Similarly, it is permitted to define recursive functions in which some measure gets LESSP-smaller in the recursive calls. A particularly useful measure is the "size" of a shell object obtained by adding one to the sum of the sizes of its components. We first define the addition function for the nonnegative integers. We could use the function SUM defined above. However, SUM suffers from the disadvantage of sometimes returning a nonnumber (it returns Y, whatever that is, when X is 0). SUM is thus not commutative (e.g., (SUM 0 T) = T, while (SUM T 0) = 0). We thus make the following definitions: .STARRS .ASIS Definition. (FIX X) = (IF (NUMBERP X) X 0) Definition. (PLUS X Y) = (IF (ZEROP X) (FIX Y) (ADD1 (PLUS (SUB1 X) Y))) .ENDASIS We adopt the notational convention of writing (PLUS a b c) for (PLUS a (PLUS b c)), etc. Now assume we have added all the shells we will use. We define the function COUNT to return 0 on bottom objects, to return 1 plus the sum of the COUNTs of the components of a nonbottom shell object, and to return 0 on any nonshell object. For example, if the only shells we were ever to add were ADD1, PACK, and CONS, we would define COUNT as: .ASIS Definition. (COUNT X) = (IF (NUMBERP X) (IF (EQUAL X 0) 0 (ADD1 (COUNT (SUB1 X)))) (IF (LITATOM X) (IF (EQUAL X "NIL") 0 (ADD1 (COUNT (UNPACK X)))) (IF (LISTP X) (ADD1 (PLUS (COUNT (CAR X)) (COUNT (CDR X)))) 0))). .ENDASIS .STARRS The immediately preceding definition of COUNT would be admitted under the principle of definition since at each stage the argument is CAR.CDRP-smaller. In general, the definition of COUNT is admitted under the principle of definition, because at each stage the argument is smaller according to the well-founded relation of the last shell. To permit the illusion that shells may be added at any time, our theorem-proving program does not actually employ the full definition of COUNT, but instead records (a) for each shell const the theorems: .ASIS (EQUAL (COUNT (const X1 ... Xn)) (ADD1 (PLUS (IF tr[&1] (COUNT X1) 0) ... (IF tr[&n] (COUNT Xn) 0)))) and (EQUAL (COUNT (btm)) 0) .ENDASIS (omitting the latter if const has no bottom object), and (b) for each i from 1 to n, the theorem: .ASIS (IMPLIES (AND (r X) (NOT (EQUAL X (btm)))) (LESSP (COUNT (ac[&i] X)) (COUNT X))) .ENDASIS (using T for (NOT (EQUAL X (btm))) if const has no bottom object). These theorems may be proved from the shell axioms and the definition of COUNT. .SS |Conclusion| This concludes the discussion of our formal theory. We recap the topics presented: .CROWN(8,0,0) We defined with axioms certain functions including IF and EQUAL. We introduced the idea of well-founded relations. We introduced a principle of induction. We introduced a general mechanism for adding "new" types of "colored" n-tuples called "shells." We used the shell principle to add axioms for the natural numbers, literal atoms, and ordered pairs. We introduced a principle of definition which allows the introduction of recursive functions. We introduced the concept of a lexicographic relation. We used the principle of definition to introduce the usual "less than" function, assumed it was well-founded, and defined the measure function COUNT that computes the size of an object. .ENDCROWN .SEC |THE CORRECTNESS OF A TAUTOLOGY CHECKER|, SECTAUTOLOGYCHECKER: Before we describe how we prove theorems in the theory just presented, it is important that the reader be familiar with the theory itself. In addition, it is useful to go through the proofs of some difficult theorems, so that the reader gets a feel for what is coming in subsequent {SECTIONORCHAPTER}s. Finally, readers unfamiliar with mechanical theorem-proving may be curious about how one proves theorems mechanically. Since all three of these objectives should be addressed before we begin to present our proof techniques, we have chosen to illustrate them all in a rather novel example: the mechanical proof of the correctness of a simple mechanical theorem prover. In particular, we prove the correctness of a decision procedure for the propositional calculus. In the standard development of logic, the propositional calculus is presented first. As in our theory, it often forms part of the logical underpinnings of richer theories. In addition, it offers a simple way of introducing certain important ideas such as soundness, completeness, and decision procedures. Because of its foundational role, discussions of the propositional calculus are usually carried on in the informal "metalanguage." For example, a common definition of the value of the formula "p & q" is that it is "true if both p and q have the value true, and false otherwise." In this {SECTIONORCHAPTER} we exercise the expressive power of our theory, and clarify certain aspects of it, by formalizing the semantics of a version of propositional calculus in our theory. We then introduce certain very simple theorem-proving ideas (such as how to apply a theorem as a rewrite rule, and how to keep track of what assumptions govern a subformula of a formula) by writing, as a recursive function in our theory, a decision procedure for the propositional calculus. Finally, we illustrate some of our proof techniques by proving that the decision procedure is well-defined, recognizes only tautologies, and recognizes all tautologies. The proofs described are actually carried out by our own mechanical theorem prover, and the discussion of the proofs illustrates the role of the human user in our automatic theorem-proving system. .SS |Informal Development| Throughout this {SECTIONORCHAPTER} we will be concerned with the set of terms constructed entirely from variables, T, F, and the function symbol IF. We call such terms "propositional IF-expressions." Examples of propositional IF-expressions are: .ASIS (IF A B C), (IF T F (IF A B C)), and (IF (IF P Q F) (IF P T Q) T). .ENDASIS Note that the first of these expressions sometimes has the value F (when A is T and B is F, for example) but sometimes does not have the value F (when A is T and B is T, for example). On the other hand, the second expression always has the value F, and the third expression never has the value F. We call a propositional IF-expression that never has the value F a "tautology." Note that any formula of the ordinary propositional calculus can be converted to an equivalent propositional IF-expression by using the definitions of AND, OR, NOT, and IMPLIES presented in {YONSEC SECTHEFORMALTHEORY}. (Throughout the remainder of this {SECTIONORCHAPTER} we shall use "expression" as a shorthand for "propositional IF-expression.") It is our aim in this {SECTIONORCHAPTER} to construct a procedure for deciding whether an expression is a tautology. Our first step is to indicate more precisely what we mean by "the value of an expression." Informally, let us say that v is an __assignment_ provided that v is a function, its domain includes T, F, and the variables, and v maps T to T and F to F. If v is an assignment, then by "the assignment of x under v" we mean v(x). Then the __value of the expression_ x __under the assignment_ v is defined recursively as: .ASIS if x has the form (IF p q r), then if the value of p under v is F then return the value of r under v, else return the value of q under v; else return the assignment of x under v. .ENDASIS We want to define a mechanical procedure that when given an expression x returns non-F if for every assignment v, the value of x under v is non-F, and returns F if for some assignment v, the value of x under v is F. We will call our procedure TAUTOLOGY.CHECKER. There are many ways to write TAUTOLOGY.CHECKER. The question: "What is the most efficient way to write TAUTOLOGY.CHECKER?" is actually one of the most important unsolved problems in computer science. One method, called the "truth table" method, considers all possible assignments of T and F to the variables in the given expression. The truth table method requires execution time exponential in the number of variables in the expression. No one knows a method that does not require exponential time in the worst case. Furthermore, no one has yet proved that all algorithms require exponential time in the worst case. The version of TAUTOLOGY.CHECKER that we present is more efficient than the truth-table method on one important class of expressions, namely those in "IF-normal form."{FNOTE |"This . nomenclature is an excellent example of the time-honored . custom of referring to a problem we cannot handle as abnormal, . irregular, improper, degenerate, inadmissible, and otherwise . undesirable." From Kelley {REF KELLEY}, on "normal" spaces.|} An expression x is said to be in __IF-normal form_ provided that no subterm of x beginning with an IF has as its first argument another term beginning with an IF. Of the three example formulas above, the first two are in IF-normal form and the last is not. When a formula is in IF-normal form, we can decide whether it is a tautology very easily: we consider each "branch" through the expression and require either that the tests through which we pass are contradictory or that the tests through which we pass force the output to be something other than F. Consider, for example, the expression: .ASIS (IF P (IF T T F) (IF Q (IF P F Q) T)). .ENDASIS There are five branches through this expression (one output per line). The first branch returns T. The second branch returns F but can never be taken because the second test, on T, never fails. The third branch can never be taken because the first test on P must have returned F so the second must also. The fourth branch returns Q, which is not F because the earlier test on Q determined that Q was not F. And the last branch returns T. So the expression is a tautology. Informally, then, we have a method for deciding which expressions in IF-normal form are tautologies. To use the method on expressions in general (rather that just those in IF-normal form), we convert the given expression into an equivalent one (that is, one that always has the same value) in IF-normal form. We achieve this normalization by applying the theorem .ASIS (EQUAL (IF (IF P Q R) LEFT RIGHT) (IF P (IF Q LEFT RIGHT) (IF R LEFT RIGHT))) .ENDASIS repeatedly, as a rewrite rule from left to right. That is, whenever we find an expression of the form (IF (IF p q r) left right), we replace it with the equivalent (IF p (IF q left right) (IF r left right)). Normalizing an expression may produce a formula that is exponentially larger than the one with which we started. So in the worst case, our procedure is at least exponential. The foregoing sketch of a decision procedure for the tautology problem is very informal. Below we reconsider both the problem and its solution very formally -- in fact we formalize both the problem and its solution using the theory presented in {YONSEC SECTHEFORMALTHEORY}. One way to view the formal presentation is as an interaction between four participants:##a "buyer" who wants to purchase a recursive function satisfying his specification of a tautology checker; an "implementor" who encodes his knowledge of theorem-proving in a recursive function claimed to meet the specifications; a "theorem prover" that proves that the implementor did his job; and a "mathematician user" who aids the theorem prover by suggesting that it prove certain lemmas. .SS |Formal Specification of the Problem| In this {SUBSECTIONORSECTION} we play the role of the buyer:##we specify our requirements by formally defining what a propositional IF-expression is, what the value of such an expression is, and what it means for a function to be a decision procedure. .SSS |Representing Expressions| Since we want to define functions on IF-expressions, we must represent IF-expressions as objects in our theory. From the point of view of general-purpose theorem-proving programs, the most natural and convenient way to represent terms is to represent variables as literal atoms and to represent the term (f x[&1] ... x[&n]) as the sequence whose CAR is the literal atom f and whose CDR is the sequence of objects representing the terms x[&1] through x[&n]. This is the representation we use in our theorem prover. However, this representation makes it awkward to refer to subterms. For example, if x represented (IF test left right), then in order to refer to the third argument one would write (CAR (CDR (CDR (CDR x)))). With ease of reading in mind, we represent IF-expressions in this {SECTIONORCHAPTER} by employing a new shell class (the green triples, say), called the IF.EXPRPs, which we introduced with .ASIS Shell Definition. Add the shell CONS.IF of three arguments with recognizer IF.EXPRP, accessors TEST, LEFT.BRANCH, and RIGHT.BRANCH, default values "NIL", "NIL", and "NIL", and well-founded relation TEST.LEFT.BRANCH.RIGHT.BRANCHP. .ENDASIS Thus, we represent the term: .ASIS (IF x y z) as (CONS.IF x' y' z') .ENDASIS where x', y', and z' are the representations of x, y, and z respectively. We use T and F to represent T and F respectively. For the purposes of this example we agree that any object other than T, F, or an IF.EXPRP represents a variable. .SSS |Formal Definitions of ASSIGNMENT and VALUE| To represent assignments we use the standard "association list" technique from LISP programming. An association list (or simply "alist") is a sequence of pairs interpreted as associating with the CAR of each pair the item of information in the CDR. The recursive function ASSIGNMENT interprets alists as assignments of values to terms. ASSIGNMENT returns the value assigned to a given term in a given alist (or F if it is not explicitly assigned). ASSIGNMENT assigns T and F to themselves. Here is the definition of ASSIGNMENT: .ASIS Definition. (ASSIGNMENT VAR ALIST) = (IF (EQUAL VAR T) T (IF (EQUAL VAR F) F (IF (NLISTP ALIST) F (IF (EQUAL VAR (CAAR ALIST)) (CDAR ALIST) (ASSIGNMENT VAR (CDR ALIST)))))). .ENDASIS (NLISTP x) is defined to be (NOT (LISTP x)). The formal definition of the value of the expression X under the assignment ALIST is: .ASIS Definition. (VALUE X ALIST) = (IF (IF.EXPRP X) (IF (VALUE (TEST X) ALIST) (VALUE (LEFT.BRANCH X) ALIST) (VALUE (RIGHT.BRANCH X) ALIST)) (ASSIGNMENT X ALIST)). .ENDASIS .SSS |The Formal Correctness Specifications| As the buyer we want a decision procedure for the propositional calculus. We now specify what we require of a decision procedure. First of all, we require that the decision procedure be introduced as a function. Let us call it TAUTOLOGY.CHECKER. Second, we require that if TAUTOLOGY.CHECKER recognizes an expression (in the sense that TAUTOLOGY.CHECKER returns something other than F when given the expression), then the expression must have a value other than F under all assignments{FNOTE |The purist may note that in a freewheeling set theoretic approach . to validity, one would consider all assignments rather . than merely the finite assignments to which we limit ourselves . when we represent assignments as finite alists. Of course, . no real damage is done, because (in a suitable theory of sets) . one can prove that it is sufficient to restrict one's attention to . assignments that assign a meaning to the finite number of variables . in the term in which one is interested.|}. Stated formally, this requirement is: .ASIS Theorem. TAUTOLOGY.CHECKER.IS.SOUND: (IMPLIES (TAUTOLOGY.CHECKER X) (VALUE X A)). .ENDASIS When specifying requirements, one must be very careful to say enough. The careless buyer might think that we have fully specified TAUTOLOGY.CHECKER. However, a function that always returned F (i.e., recognized nothing) would satisfy the above specification. We require more than that of TAUTOLOGY.CHECKER. In fact, we require that it recognize all tautologies; when TAUTOLOGY.CHECKER fails to recognize an expression there must exist an assignment for which the VALUE of the expression is F. Since we do not use existential quantification, how can we express the proposition that when TAUTOLOGY.CHECKER fails to recognize an expression there exists a falsifying assignment? The answer is that we require that somebody define a recursive function that explicitly constructs a falsifying assignment for a nontautological expression. We call the function FALSIFY. Then the statement that the tautology checker recognizes all tautologies is: .ASIS Theorem. TAUTOLOGY.CHECKER.IS.COMPLETE: (IMPLIES (NOT (TAUTOLOGY.CHECKER X)) (EQUAL (VALUE X (FALSIFY X)) F)). .ENDASIS That is, if the tautology checker fails to recognize an expression, then there is an assignment (namely (FALSIFY X)) for which the value of the expression is F. From our perspective as the buyer, the definition of FALSIFY is irrelevant. That is, if somebody were to supply us with a legal definition of TAUTOLOGY.CHECKER (and also one of FALSIFY) such that the above two conjectures were theorems, then we would believe that TAUTOLOGY.CHECKER was a decision procedure. .SS |The Formal Definition of TAUTOLOGY.CHECKER| We now take on the role of the implementor. The buyer's definitions and conjectures in the last {SUBSECTIONORSECTION} specify what is required of TAUTOLOGY.CHECKER. As the implementor, we now define a function we claim has the desired properties. Since the specified task is to write a simple mechanical theorem prover, it happens (in this example) that the implementor must appeal to some of the basic ideas in mechanical theorem-proving. .SSS |TAUTOLOGY.CHECKER| The definition of TAUTOLOGY.CHECKER requires two subsidiary concepts: .CROWN(8,0,0) NORMALIZE, which given an expression returns an equivalent expression in IF-normal form and TAUTOLOGYP, which given an expression in IF-normal form and a list of assumptions determines if the expression is never F under those assumptions. .ENDCROWN Given NORMALIZE and TAUTOLOGYP we define TAUTOLOGY.CHECKER as: .ASIS Definition. (TAUTOLOGY.CHECKER X) = (TAUTOLOGYP (NORMALIZE X) "NIL"). .ENDASIS .SSSS |NORMALIZE| Recall that the basic idea behind NORMALIZE is to apply the theorem .ASIS (EQUAL (IF (IF P Q R) LEFT RIGHT) (IF P (IF Q LEFT RIGHT) (IF R LEFT RIGHT))) .ENDASIS as a rewrite rule until we have removed all IFs from the tests of other IFs. Thus, to normalize an expression x we proceed as follows: .ASIS If x is not an IF.EXPRP, we return x. If x is of the form (IF test left right), then we ask whether test is of the form (IF p q r). If so, we return the result of normalizing the expression (IF p (IF q left right) (IF r left right)). If not, we return the expression (IF test left' right'), where left' and right' are the results of normalizing left and right. .ENDASIS The formal definition of this process is: .ASIS Definition. (NORMALIZE X) = (IF (IF.EXPRP X) (IF (IF.EXPRP (TEST X)) (NORMALIZE (CONS.IF (TEST (TEST X)) (CONS.IF (LEFT.BRANCH (TEST X)) (LEFT.BRANCH X) (RIGHT.BRANCH X)) (CONS.IF (RIGHT.BRANCH (TEST X)) (LEFT.BRANCH X) (RIGHT.BRANCH X)))) (CONS.IF (TEST X) (NORMALIZE (LEFT.BRANCH X)) (NORMALIZE (RIGHT.BRANCH X)))) X). .ENDASIS .SSSS |TAUTOLOGYP| Now that we can put an expression into normal form, we consider TAUTOLOGYP, which determines whether an expression, x, in IF-normal form, is never F. Recall that the basic idea is to explore every branch through the IF-expression and check that either the tests along the branch are contradictory or else that the output of the branch is forced by the tests to be non-F. Our definition requires that TAUTOLOGYP have a second argument, alist, used to remember the tests that have been seen and whether they are being assumed true or false on the current branch. Since all the tests in a normalized IF-expression are variables (or else the constants T or F) the alist of assumptions can also be thought of as an assignment to some of the variables in x. Here is how TAUTOLOGYP determines that x is never F under the assumptions in alist: .CROWN(8,0,0) If x is not an IF.EXPRP, then it is either T, F, or a variable. .CROWN(8,0,0) If x is T, then it is never F. If x is F, then clearly it is "sometimes" F. If x is neither T nor F, then it is a variable. If x is currently assumed non-F, then x is never F under alist; otherwise x is sometimes F. .ENDCROWN If x is an IF.EXPRP, say representing (IF test left right), then there are three possibilities: .CROWN(8,0,0) If test (which must be a variable, T, or F) is T or assumed non-F in alist, then x is never F under alist if and only if left never is. If test is F or assumed F in alist, then x is never F under alist if and only if right never is. Otherwise, x is never F under alist if and only if both: .CROWN(8,0,0) left is never F under the assumptions in alist plus the additional assumption that test is non-F, and right is never F under the assumptions in alist plus the additional assumption that test is F. .ENDCROWN .ENDCROWN .ENDCROWN To define TAUTOLOGYP formally we use four auxiliary functions: .CROWN(8,0,0) ASSIGNEDP, which determines whether a given variable is explicitly assumed F or non-F in a given alist, ASSIGNMENT (defined above), which returns the assumed value of a variable in an alist, ASSUME.TRUE, which adds to an alist the pair that indicates that a given variable is being assumed non-F, and ASSUME.FALSE, which adds to an alist the pair that indicates that a given variable is being assumed F. .ENDCROWN The definitions of these functions are in {APP APPTHMS}. ASSIGNEDP is very similar to ASSIGNMENT. ASSUME.TRUE and ASSUME.FALSE simply CONS the appropriate pair onto the alist. The formal definition of TAUTOLOGYP is: .ASIS Definition. (TAUTOLOGYP X ALIST) = (IF (IF.EXPRP X) (IF (ASSIGNEDP (TEST X) ALIST) (IF (ASSIGNMENT (TEST X) ALIST) (TAUTOLOGYP (LEFT.BRANCH X) ALIST) (TAUTOLOGYP (RIGHT.BRANCH X) ALIST)) (AND (TAUTOLOGYP (LEFT.BRANCH X) (ASSUME.TRUE (TEST X) ALIST)) (TAUTOLOGYP (RIGHT.BRANCH X) (ASSUME.FALSE (TEST X) ALIST)))) (ASSIGNMENT X ALIST)). .ENDASIS .SSS |Summary of the Simple Theorem-proving Ideas| As implementor of the buyer's specifications, we claim that our job is done:##TAUTOLOGY.CHECKER is a decision procedure for the propositional calculus. Before discussing the proof of this assertion we review the simple theorem-proving techniques illustrated: .CROWN(8,0,0) Terms can (and must) be represented as objects. In our case, we used the shell facility to define a new class of objects, called the IF-expressions. Both NORMALIZE and TAUTOLOGYP illustrate how terms can be explored mechanically. The function NORMALIZE shows how a theorem can be exhaustively applied as a rewrite rule in a mechanical way.