NOTE: This is a draft of notes for the upcoming 10/18 ACL2 seminar talk. Feedback is welcome. Details may change between now and the talk (and just after the talk). Last revised Tue Oct 17 10:40:35 2006. ----- Conservativity of the Defchoose Event Matt Kaufmann ACL2 seminar, 10/18/06 =============================================================================== ABSTRACT: This talk will discuss some of the logical foundations of ACL2, in particular for the defchoose event. The goal is to give a sense of the foundational issues for correctness of ACL2 and, in particular, of a so-called "forcing" argument for the admissibility of defchoose events. In order to fit into about an hour, the talk will avoid some technical details, instead pointing to additional reading for those interested in such details. The intention is to provide enough background and motivation so that those interested in such details will find that reading to be accessible. IMPORTANT NOTE ON PREREQUISITES: I will start with a very brief, very fast review of first-order logic so that we are all on the same page. Those who have not seen first-order logic may wish to contact me for a primer before the talk, or at least read the notes on "Review of first-order logic", as linked to from the seminar page. The rest of the talk is intended to be self-contained. But if you take a look at :doc defchoose, that might help in absorbing the talk in real time. =============================================================================== OUTLINE OF TALK: 0. Preliminaries and notation. 1. Review of first-order logic (fast!) a. Syntax b. Semantics c. Theories, completeness, and soundness d. Conservative extension e. Induction 2. Correctness story for ACL2: Brief synopsis of the "Structured Theory" paper. 3. Review of defchoose. 4. Conservativity of defchoose: introduction. 5. Conservativity of defchoose, following :doc conservativity-of-defchoose. =============================================================================== 0. Preliminaries and notation. During the talk I will only briefly skim the "Review of first-order logic" below. We write "a \in x" to abbreviate "a is an element of the set x", and "x \subset y" to abbreviate "x is a subset of y". We write phi(y) to indicate that y may be among the free variables of a formula phi, and phi(y:=x) to denote the result of substituting x for y in phi. Proofs are concluded with "-|". Theorems are concluded with "-|" when their proofs are omitted. =============================================================================== 1. Review of first-order logic (fast!) Reference: Any introductory mathematical logic book. ---------------------------------------- a. Syntax Terms: much like ACL2, a variable is a term, as is f(t1,...,tn) where f is a function symbol of arity n and t1 through tn are terms. A constant can be viewed as a zero-ary function; or we can allow constant symbols. (Either way works out.) Atomic formulas: P(t1,...,tn) where P is a relation symbol of arity n and t1 through tn are terms. Special case: = is a binary relation symbol and we write t1=t2. (First-order) Formulas: - Atomic formulas - ~P, where P is a formula [negation] - P1 & P2, where P1 and P2 are formulas [conjunction] - (exists x)P, where x is a variable and P is a formula [existential quantification] Note: (exists x)P "binds" all occurrences of x in P. All unbound variable occurrences are called "free". Abbreviations: - P1 \/ P2 abbreviates ~(~P1 & ~P2) - (forall x)P abbreviates ~(exists x)~P A "sentence" is a formula with no free variables. We sometimes refer to a formula in a setting where a sentence is expected, in particular with definitional formulas. We then view a formula as its "universal closure", obtained by prefixing the formula with a quantifier (forall x) for each variable x occurring free in the formula. ---------------------------------------- b. Semantics The basic idea is that a "structure" M (sometimes called an "interpretation") assigns a function f^M to each function symbol of its language and a relation R^M to each relation symbol of its language. Motivating example: The set of real numbers forms a structure where the + and * symbols are interpreted as real addition and multiplication, respectively, which we can denote +^R and *^R; the < symbol is interpreted as real-number "less than", which we can denote as <^R; and the field constants i_+ and i_* for additive and multiplcative identity are interpreted as the numbers 0 and 1, respectively. The real numbers thus form a structure: (R; +^R, *^R, 0, 1; <^R). A (first-order) structure is M = (U^M; f1^M, f2^M, ...; P1^M, P2^M, ...) where: - U^M is a set; - each fi^M is a function on U^M corresponding to fi, called the "interpretation of" fi, such that the arities match; and - similarly each Pi^M is a relation of appropriate arity on U^M interpreting the relation symbol Pi, of the same arity; and in particular, - implicitly =^M is equality. The "language of" M, M as above, is the set {f1,f2,...,P1,P2,...} of function and relation symbols interpreted in M. Assignment s into structure M: A map of variables into the set U^M. We write s(v) for the value of variable v under s (where v is in the domain of s). For a given structure M, The assignment s extends naturally to terms in the language of M: s^M(f(t1,...,tn)) = f^M(s^M(t1),...,s^M(tn)). We write s{x:=m} to denote the assignment s' that agrees with s except that s'(x) = m. We recursively define the satisfaction (truth) relation, "M |= A [s]" where M is a structure, A is a formula, and s is an assignment whose domain includes all the free variables of A: M |= P(t1,...,tn) [s] iff \in P^M. M |= P1 & P2 [s] iff both M |= P1 [s] and M |= P2 [s]. M |= (exists x)P [s] iff for some m \in U^M, M |= P [s{x:=m}]. Suppose that M1 and M2 are structures, U^M1 = U^M2, L(M1) \subset L(M2), and for every function and relation symbol S of L(M1), S^M1 = S^M2. In this case we say that M2 is an "expansion of" M1 (to L(M2)) and that M1 is a "reduct of" M2 (to L(M1)). If M is a structure and A is a sentence of L(M), we sometimes write "M |= A" to abbreviate "M |= A [s]" where s is the empty assignment. Equivalent, we may say that M is a "model of" A. Exercise: show that under these conditions, whether or not M |= A [s] is independent of s. Let M be a structure. We say that the set S \subset U^M is "first-order definable with parameters from" M (or "definable" for short) if for some formula phi, variable x, and assignment s into M that binds the free variables of phi except perhaps for x, S = {a \in U^M: M |= phi[s{x:=a}]. ---------------------------------------- c. Theories, completeness, and soundness First-order logic comes with a notion of "proof" that we won't define here. Different texts have different treatments, but the basic idea is that one can derive logical consequences. The "language of" a set T of (first-order) formulas, which we write as "L(T)", is the set of all function and relation symbols occurring in any element of T. If L is a set of function and relation symbols, then a "sentence of L" is a sentence whose function and relation symbols all belong to L. In some treatments, a "theory" is any set of first-order sentences. In most treatments, a "theory" is a set of first-order sentences that is "deductively closed": any sentence of L(T) that can be proved from T is a member of T. We sometimes say "A is a theorem of T" to mean that there is a proof of A from T. So, T is deductively closed iff every theorem of T is a member of T. If T is a theory and M is a structure such that L(T) \subset L(M), then we write "M |= T", and say "M is a model of T" to mean that M is a model of A for all A \in T. Suppose that T is a theory and A is a sentence. We write "T |- A" to mean that A is first-order derivable from the set T of axioms, and we write "T |= A" to mean that every model of T is a model of A, i.e.: for every structure M such that L(T) \subset L(M), if M |= T then M |= A. Now suppose A is a sentence of L(T), where T is a theory. "Soundness" and "Completeness" are sometimes described as follows: Soundness: "Syntactic entailment implies semantic entailment". Completeness: "Semantic entailment implies syntactic entailment". Indeed, the following properties hold of first-order logic. Soundness Theorem: If T |- A then T |= A. Completeness Theorem: If T |= A then T |- A. ---------------------------------------- d. Conservative extension Let T1 and T2 be theories. We say that T2 is "a conservative extension of" (or "conservatively extends") T1 if T1 \subset T2 and for every sentence A of L(T1) such that T2 |- A, then T1 |- A. Proposition (Model Conservativity). Suppose T1 \subset T2 are theories and that every structure M1 with language L(T1) that satisfies T1 has an expansion to a structure M2 with language L(T2) that satisfies T2. Then T2 is a conservative extension of T1. -| Actually, we need a slightly stronger version of this proposition that restricts to "countable" structures: a structure M is countable if U^M is the range of some function whose domain is the set of natural numbers. Then the proposition below follows from the one above together with these two classical results. - Compactness: If every finite subset of a theory T has a model, then T has a model. Proof sketch: The contrapositive is a straightforward corollary of completeness and soundness. -| - Lowenheim-Skolem: If L(T) is countable and T has a model, then T has a countable model. -| Proposition (Countable Model Conservativity). Suppose T1 \subset T2 are theories and that every countable structure M1 with language L(T1) that satisfies T1 has an expansion to a structure M2 with language L(T2) that satisfies T2. Then T2 is a conservative extension of T1. -| We observe that explicit definitions provide conservative extensions: Corollary. Let T1 be a theory and let T2 be the result of extending T1 by the universal closure of a sentence f(x1,...,xn) = u where u is a term of L(T1) with free variables among the distinct free variables x1,...,xn and f \notin L(T1). Then T2 is a conservative extension of T1. Proof: By the first proposition above, since if M |= T then we can interpret f as the function mapping m1,...,mn to s^M(u), where s is the function mapping xi to mi (1<=i<=n). -| In fact a sequence of explicit definitions forms a conservative extension, by the following trivial lemma. Proposition (Transitivity of Conservative Extension). If T2 is a conservative extension of T1, which is a conservative extension of T0, then T2 is a conservative extension of T0. -| The following is also trivial, and used below. Proposition (Conservativity for Weakening). Suppose T0 \subset T1 \subset T2 and T2 conservatively extends T0. Then T1 conservatively extends T0. -| ---------------------------------------- e. Induction The ACL2 system supports a form of mathematical induction that allows for a finite number of inductive hypotheses for each inductive step. However, in our treatment we will consider only "strong" mathematical induction. That principle says, roughly speaking, that for every set S of natural numbers, if membership of n in S is implied by membership of all smaller n in S, then S contains all natural numbers. Rather than quantify over all such sets S (which would constitute a second-order statement), "first-order induction" for a given language L is the set of universal closures of the following sentences, as phi ranges over formulas of L and x and y range over variables. (Think of phi(y) as representing S.) (forall naturals y) [((forall naturals x < y) phi(y:=x)) -> phi(y)] -> (forall naturals y) phi(y) Note that equivalent to induction is the least number principle: Every definable set of elements all satisfying natp has a least element. This concept generalizes from natural numbers to ordinals less than epsilon-0, as recognized in ACL2 by predicate o-p and ordering o<. Then an analogous equivalent is the least ordinal principle: Every set of elements all satisfying o-p has a least element in the sense of o<. But for purposes of understanding this talk, one may imagine that we are restricting induction to the natural numbers. First-order logic does not assume any induction axioms. However, our formalization of ACL2, below, restricts to theories T that do contain all induction axioms that are expressible in L(T). =============================================================================== 2. Correctness story for ACL2: Brief synopsis of the "Structured Theory" paper. That paper is: Matt Kaufmann and J Strother Moore, Structured Theory Development for a Mechanized Logic. Journal of Automated Reasoning 26, no. 2 (2001), pp. 161-203. http://www.cs.utexas.edu/users/moore/publications/acl2-papers.html#Foundations The problem: How do we specify correctness for the ACL2 system? Traditional logic does not address the idea evolving a theory with definitions, let alone encapsulate, include-book, or defchoose events. The short answer: - View the ACL2 logic as being first-order logic, where the only relation symbol is equality. If u is a term of ACL2, then we sometimes view u as the formula ~(u = nil). - Let GZ be the "ground-zero" theory present when starting up ACL2. - An ACL2 session S defines a theory T(S) extending GZ by the axiomatic events in the session (defun, encapsulate, and defchoose, but not defthm). [The paper calls this the "history of" S, "h(S)", and calls sessions "chronologies".] - Then here is a desired property: Every defthm of an ACL2 session S should be a theorem of T(S). What is involved in proving this result? - Basic idea: well, this is obvious if ACL2's reasoning engine is correct: then each theorem is, in fact, proved from the extension of GZ by the axiomatic events that precede it. - But local events complicate matters. Suppose for example that we certify the following book, where Main only mentions f, not g. (in-package "ACL2") (defun f ...) (local (defun g ...)) (local (defthm Lemma1 (... f ... g ...))) (local (defthm Lemma2 (... f ... g ...))) (defthm Main (... f ...) Let S be the session at the end of the proof pass of certify-book, where all of the above events are present. If later we include this book, then local events are skipped so the resulting session, S0, only contains this: (in-package "ACL2") (defun f ...) (defthm Main (... f ...) How do we know that Main is a theorem of this session? Simplistic Answer: T(S) is a conservative extension of T(S0). Why? Because it is an extension by an explicit definition (of g). But there are complications! - How about recursive definitions? In first-order logic, which has quantification, in a theory extending GZ, which has sequences and induction, we can eliminate the recursion with a standard trick (from Kleene?): In brief (very brief!), f(x) = y iff there is a legal computation tree from x that yields y. See the paper for issues regarding termination. - How about encapsulate? There, we just throw some stuff away, so an encapsulate is conservative by the Conservativity for Weakening proposition. - How about defchoose? - And how does induction fit into this story? =============================================================================== 3. Review of defchoose. Note: Defchoose is a primitive ACL2 event, which is used to implement quantification (via the defun-sk macro). See :doc defchoose for more about this event. The presentation here is intended to be self-contained. Example. (defstub p (x) t) (defchoose choose-larger-x-for-p (x) (y) (and (< y x) (p x))) This adds the following axiom. It says that if any X greater than Y satisfies P, then (CHOOSE-LARGER-X-FOR-P Y) is such an X. ACL2 !>:pf choose-larger-x-for-p (IMPLIES (AND (< Y X) (P X)) (LET ((X (CHOOSE-LARGER-X-FOR-P Y))) (AND (< Y X) (P X)))) ACL2 !> In general, defchoose adds what logicians call a "Skolem function". Consider the following form, where all variables of body are among the indicated bound and free variables, body is a term using only function symbols defined in the current session, and fn is not defined in the current session. (defchoose fn bound-var (free-var1 ... free-vark) body) The effect is to introduce a new function symbol, fn, with formal parameters list (free-var1 ... free-vark), along with the following axiom: (implies body (let ((bound-var (fn free-var1 ... free-vark))) body)) Defchoose actually allows more than one bound-var, in which case the axiom uses mv-let in place of let. See :doc defchoose for details. =============================================================================== 4. Conservativity of defchoose: introduction. Theorem. Let S be an ACL2 session. Then the extension of T(S) by a defchoose axiom is conservative. First, here is an incomplete proof that gives some insight. Fix an ACL2 session S and a defchoose event as above: (defchoose fn bound-var (free-var1 ... free-vark) body) By the Countable Model Conservativity proposition, it suffices to take a countable model M of T(S) and expand it to a model M' of the defchoose axiom A for fn, which is the universal closure of: (implies body (let ((bound-var (fn free-var1 ... free-vark))) body)) By countability of M we may choose an enumeration e of U^M, i.e., a function from the set of natural numbers onto U^M. We now define an expansion M' of M that interprets fn and satisfies the above defchoose axiom, A. Fix m1,...,mk \in U^M and let s0 be the assignment defined by mapping free-vari to mi (1<=i<=k). Now suppose there is some m \in M such that M |= body [s{bound-var := m}]. Pick j least such that this holds for m = e(j); then we define fn^M'(m1,...,mk) = m. If however no such m exists, let m = e(0). It is straightforward to check that M' |= A. So what is wrong with the above proof? ACL2 does proofs by induction. Why is it OK to do a proof by induction when the theorem involves fn? We need to modify our correctness story. Our spec for an ACL2 session S needs to require that T(S) is "closed under induction": Every induction axiom in the language of T(S) is a member of T(S). Briefly, an explicit definition preserves this property because the new function symbol can be eliminated from any formula in favor of the body of the definition. And, our argument for encapsulate based on the Conservativity for Weakening proposition goes through unchanged. But the argument above for defchoose is not sufficient for preserving induction. The end of :doc conservativity-of-defchoose gives an example making this issue concrete. So now we turn to a more careful way of expanding M so that the expansion not only satisfies the defchoose axiom, but also satisfies induction axioms that involve the new function symbol, fn. Technically, by "induction" we mean "epsilon-0 induction" (or in an ACL2 context, induction over the o< relation), though that distinction isn't important for understanding the argument that follows. =============================================================================== 5. Conservativity of defchoose, following :doc conservativity-of-defchoose. A concept called "forcing", originally introduced by Paul Cohen for set theory in 1963, has long since been adapted by logicians to a simpler form that is suitable for model theory. In this section we outline a model-theoretic forcing argument for building a suitable expansion M' of a structure M, as in the section above but this time satisfying induction, so as to prove the following: Theorem. Let T be an extension of GZ that is closed under induction, and let A be a defchoose axiom introducing a new function symbol fn. Then the extension of T by A together with induction axioms over the extension of L(T) by fn is a conservative extension of T. A proof may be found in :doc topic conservativity-of-defchoose, which replaces (the buggy) proof in Appendix B of the Structured Theory paper. That :doc also discusses an alternate but very similar argument to introduce an enumeration of the universe, from which the theorem above follows easily. Exercise: Flesh out that alternate argument. Here we sketch the main argument provided in that :doc topic, albeit omitting most proof details due to time considerations (not because they're difficult). Please see :doc conservativity-of-defchoose if you are interested in such details. We derive this theorem from the Countable Model Conservativity proposition, as outlined in the preceding section. So let M be a countable model of T that satisfies first-order induction. As before, we consider the following event. (We could allow more than one bound variable but we don't bother here.) (defchoose fn bound-var (free-var1 ... free-vark) body) We wish to show conservativity of the extension T1 of T by the universal closure of the defchoose formula (implies body (let ((bound-var (fn free-var1 ... free-vark))) body)) -- but now we also include all induction axioms involving fn into T1. As before, it suffices to fix a countable model M of T and find an expansion of M that satisfies T1. We define a partial order P consisting of all finite functions interpreting fn that witness the axiom above -- that is, the defchoose formula is only required to hold when the value of is in the domain of the function. P is ordered by subset, and we say that p2 "extends" p1 if p1 is a subset (not necessarily proper) of p2. CRITICAL OBSERVATION: P is definable in M. (The idea: use association lists to represent finite functions.) Forcing concepts (I'll draw pictures during the talk): A "dense" (or "cofinal") subset of P is a subset D of P such that for every p \in P, there is d \in D such that d extends p. A subset G of P is "generic" with respect to a collection Ds of subsets of P, also written "G is Ds-generic," if G is closed under subset (if p2 \in G and p2 extends p1 then p1 \in G), G is pairwise compatible (the union of any two elements of G is in G), and every set in Ds intersects G. Proposition 1. For any partial order P and countable collection Ds of dense subsets of P, there is a Ds-generic subset of P. -| Let Df be the set of dense subsets of P that are first-order definable (with parameters) in M. Df is countable (because M is countable, and then Goedel numbering all terms and then all formulas). Proposition (#5 in the :doc). The following is dense for any finite set S of k-tuples: {p \in P: S \subset domain(p)}. -| Choose G to be Df-generic, by Proposition 1. Let L be the language of T and let L[g] be the union of L with a set containing a single new function symbol, g. Let M[G] be the expansion of M to L[g] obtained by interpreting g to be the following function g0: g0(x1,...,xk) = y iff for some h \in G, (x1,...,xk) = y. Pairwise compatibility of G guarantees that g0 is indeed a function; and it is total by the Proposition just above and the fact that the indicated set is definable in M. It is easy to see that M[G] satisfies the above defchoose axiom. Where are we? M: A given countable model satisfying induction. P: Partial order of finite partial functions witnessing the defchoose axiom, definable in M. Df: The collection of all definable sets of M. G: A Df-generic subset of P. g0: The union of all functions in G. M[G]: The expansion of M that interprets g as g0, which as noted above satisfies the above defchoose axiom. So it remains to show that M[G] satisfies all of its induction axioms. Definition of forcing. Let phi(x1,...,xk) be a first-order formula in L[g] and let p \in P. We define a formula of L, denoted "p ||- phi" ("p forces phi"), by recursion on phi (in the metatheory) as follows. For phi atomic and not mentioning g: p ||- phi is just phi. [I'll draw a picture in the talk.] For phi atomic and mentioning g: p ||- phi is the formula asserting that for every g1 in P extending g there is some g2 in P extending g1 such that phi holds for g2, where we define g2 to return nil on arguments not in its domain. p ||- ~phi is: (forall p' \in P extending p) ~(p' ||- phi). p ||- phi_1 & phi_2 is: (p ||- phi_1) & (p ||- phi_2). p ||- (exists x) phi is: (exists x) (p ||- phi). We will need the following definition later. Definition. p ||-w phi (p "weakly forces" phi) is an abbreviation for p ||- ~~phi. See :doc conservativity-of-defchoose for some exercises suggested by John Cowles that will help in gaining familiarity with the definition of forcing, and for proofs of the following propositions. Proposition 2. For any p \in G and D \in Df, there is some q \in G extending p such that q \in D. -| Proposition 3. For any formula phi of L[g], the following is a dense subset of P, which (since it is first-order definable with parameters in M) intersects G. {p0: M |= ((p ||- phi) OR (p ||- ~phi))[s{p:=p0}]} -| Proposition 4. Let phi be a formula of L[g]. Suppose p0 \in P, p1 \in P, M |= (p ||- phi)[s{p:=p0}], and p1 extends p0. Then M |= (p ||- phi)[s{p:=p1}]. -| The next lemma tells us that the sentences true in M[G] are those that are forced by an element of G. It can be proved by induction on the structure of phi. Truth Lemma. Let phi be a formula in L[g], let s be an assignment to the free variables of phi, and let p be a variable not in the domain of s. Then M[G] |= phi[s] iff for some p0 \in G, M |= (p ||- phi)[s{p:=p0}]. -| Truth Lemma Corollary. The Truth Lemma holds with ||-w replacing ||-. -| Here is our main theorem. Recall that all first-order theories in our ACL2 context satisfy the e0-induction scheme. Theorem. M[G] satisfies e0-induction. Proof. We consider an arbitrary instance of e0-induction in L[g]. theta(y): (forall y) [(o-p(y) & ((forall x) (op(x) & x o< y) phi(y:=x))) -> phi(y)] -> (forall y) (o-p(y) -> phi(y)) Our goal is to prove that theta holds in M[G]. Suppose we can show that the following definable set D \subset P is dense in P. D = {p \in P: either p ||-w phi(y0) for every y0 satisfying o-p, or else for some y0 such that o-p(y0), p ||- ~phi(y0) and for every o-p y1 such that y1 o< y0, p ||-w phi(y1)}. Then since G is Df-generic, we may pick p0 \in G such that p0 \in D. By definition of D, there are two cases. Case 1: p0 ||-w phi(y0) for every y0 satisfying o-p. Then by the Truth Lemma Corollary, M[G] |= phi(y0) for every such y0, and thus M[G] |= theta. Case 2: Otherwise choose an o-p y0 such that p0 ||- ~phi(y0) and for every o-p y1 o< y0, p0 ||-w phi(y1). By the Truth Lemma and its corollary, since p0 \in G we have: (1) M[G] |= ~phi(y0). (2) For every o-p y1 o< y0, M[G] |= phi(y1). It follows that the antecedent of theta is false in M[G], as witnessed by y = y0; thus M[G] |= theta. So we have shown that M[G] |= theta provided D is dense in P. To prove the latter, suppose p0 \in P; we find p1 \in D extending p0. Case 1: p0 ||-w phi(y0) for every o-p y0. Then we may take p1 to be p0. Case 2: Otherwise choose an o-p y0 such that (by definition of ||-w) it fails that p0 ||- ~~phi(y0). By definition of ||- there is an extension p1 of p0 such that p1 ||- ~phi(y0). Without loss of generality, by applying e0-induction (actually the least ordinal principle) we may choose y0 to be the minimum o-p (in the sense of o<) for which there is such p1 ||- ~phi(y0). It suffices (by definition of dense) to show that p1 \in D. We already have p1 ||- ~phi(y0). Let o-p(y1) where y1 o< y0; it suffices to show p1 ||-w phi(y1), i.e. (by definition of ||-w) p1 ||- ~~phi(y1). So suppose q extends p1; it suffices to show q ||- ~phi(y1). But this is indeed the case since otherwise q and y1 would contradict the minimality of y0! So we have completed the remaining goal, which was to show that D is dense in P. -|