Conservativity of the Defchoose Event
Matt Kaufmann
ACL2 seminar, 10/18/06

OUTLINE OF TALK:

0. Preliminaries and notation.

[[5 minutes for this part; 5 minutes cumulative]]

1. Review of first-order logic (fast!)
   a. Syntax
   b. Semantics
   c. Theories, completeness, and soundness
   d. Conservative extension
   e. Induction

[[15 minutes for this part; 20 minutes cumulative]]

2. Correctness story for ACL2: Brief synopsis of the "Structured Theory"
   paper.

[[10; 30]]

3. Review of defchoose.

[[10; 40]]

4. Conservativity of defchoose: introduction.

[[10; 50]]

5. Conservativity of defchoose, following :doc conservativity-of-defchoose.

[[20; 70]]

===============================================================================

0. Preliminaries and notation.

During the talk I will only briefly skim the "Review of first-order logic"
below.

Goal: Discuss conservativity of the defchoose event.

But that's only of interest if necessary background and context are provided.

This file is an abbreviated version of the full notes (11 or 12 pages),
available from the seminar page.  The talk will follow this file, which is a
bit sloppy but more suitable for a talk; a more careful treatment is in the
full notes.

I hope to serve three potential audiences today, each with its own goals:

(1) Get a sense of the logical issues underlying ACL2, but stop there.
(2) As above, but learn enough to read the full notes.
(3) Grasp it all; no need to read the full notes, perhaps interested in
    fleshing out details onesself.

[[5 min. elapsed]]

===============================================================================

1. Review of first-order logic (fast!)

Reference: Any introductory mathematical logic book.

----------------------------------------

   a. Syntax

- Terms
- Atomic formulas
- (First-order) Formulas: contain atomic and closed under ~, &, exists.
- Disjunction and forall are abbreviations.

Sentence: no free variables.  Obtained via universal closure.

[[8 min. elapsed]]

----------------------------------------

   b. Semantics

Structure (interpretation) M consists of:
- Universe U^M
- Interpretation f^M for each function symbol f

Example: M = (Reals, plus, mult, zero, one)
U^M = Reals
+^M = plus
*^M = times

Today: No relation symbols except =.

L(M) = {f1,f2,...,P1,P2,...}, the set of function and relation symbols
interpreted in M.

Assignment s into structure M: A map of variables into the set U^M.

  s(v)

Extends to s^M, "eval" for terms:

  s^M(term): s(f(t1,...,tn)) = f^M(s^M(t1),...,s^M(tn))

Let s{x:=m} be s except that x is mapped to m.

M |= A [s], A a formula, s an assignment:

  M |= P(t1,...,tn) [s] iff <s^M(t1),...,s^M(tn)> \in P^M.

  M |= P1 & P2 [s] iff both M |= P1 [s] and M |= P2 [s].

  M |= (exists x)P [s] iff for some m \in U^M, M |= P [s{x:=m}].

Now consider the case of two structures that agree except that one interprets
more symbols.  Example:

M1 = (Reals, plus)
M2 = (Reals, plus, times)

More formally:

M2 is an "expansion of" M1 if they have the same universe, and M2 may interpret
extra symbols (i.e., L(M1) is a subset of L(M2)), but M1 and M2 agree on
interpretations of the symbols of L(M1).  Then M1 is a "reduct of" M2 (to
L(M1)).

M |= A means M |= A[{}] for A a sentence.  "M is a model of A."

S \subset U^M is "definable" ("first-order definable with parameters from M")
when for some formula phi and assignment s:

  S = {a \in U^M: M |= phi[s{x:=a}].

Example: Definability of the set of primes in the usual structure for Peano
Arithmetic (Nat, plus, times).

Let s(w) = 1.
Primes = {a \in U^M: M |= (forall y) (forall z) (y*z = x -> (y=w | z=w)) [s{x:=a}]}

Finally, we say that a structure M is "countable" if its universe U^M is
countable, meaning that there is some enumeration {e(0), e(1), e(2), ...} of
U^M.

[[11 min. elapsed]]

----------------------------------------

   c. Theories, completeness, and soundness

T: Set of (first-order) formulas

L(T): function and relation symbols occurring in T

Theory: Set of (first-order) sentences.

Proof: Syntactic notion (derive from axioms using rules like modus ponens:
from P and P->Q infer Q).  Not defined here.

T |- A: There is a proof of A using axioms from T.

"M is a model of T" or "M |= T" means: M |= A for all A in T.

T |= A [overloading "|="]: For every M, if M |= T then M |= A.

Soundness Theorem:     If T |- A then T |= A.

Completeness Theorem:  If T |= A then T |- A.

[[14 min. elapsed]]

----------------------------------------

   d. Conservative extension

For T1 \subset T2, T2 is "a conservative extension of" (or "conservatively
extends") T1 if for every sentence A of L(T1):

  [T2 |- A] implies [T1 |- A].

Proposition (Countable Model Conservativity).  Suppose T1 \subset T2 are
theories and that every countable structure M1 with language L(T1) that
satisfies T1 has an expansion to a structure M2 with language L(T2) that
satisfies T2.  Then T2 is a conservative extension of T1.  -|

Picture:

M1 = (U, f, g, ....) |= T1
M2 = (U, f, g, ...., h,...) |= T2

We observe that explicit definitions provide conservative extensions:

Corollary.  Let T1 be a theory and let T2 be the result of extending T1 by the
universal closure of a sentence

  f(x1,...,xn) = u

where u is a term of L(T1) with free variables among the distinct variables
x1,...,xn and f \notin L(T1).  Then T2 is a conservative extension of T1.

Proof: By the first proposition above, since if M |= T then we can interpret f
as the function mapping m1,...,mn to s^M(u), where s is the function mapping xi
to mi (1<=i<=n).  -|

In fact a sequence of explicit definitions forms a conservative extension, by
the following trivial lemma.

Proposition (Transitivity of Conservative Extension).  If T2 is a conservative
extension of T1, which is a conservative extension of T0, then T2 is a
conservative extension of T0.  -|

The following is also trivial, and used below.

Proposition (Conservativity for Weakening).  Suppose T0 \subset T1 \subset T2
and T2 conservatively extends T0.  Then T1 conservatively extends T0.  -|

[[17 min. elapsed]]

----------------------------------------

   e. Induction

We consider traditional strong induction.  Here, phi(y:=x) denotes substitution
of x for y in phi.

        (forall naturals y) [((forall naturals x < y) phi(y:=x)) -> phi]
     -> (forall naturals y) phi

Equivalent (i.e., the above all hold in a structure if and only if:): Every
definable set of elements all satisfying natp has a least element.

A detail not important for understanding the ideas:
Actually we consider epsilon-0 ordinals in place of naturals.

        (forall o-p y) [((forall o-p x o< y) phi(y:=x)) -> phi]
     -> (forall o-p y) phi

Induction axioms are not built into first-order logic.

But we restrict to theories T that contain all induction axioms that are
expressible in L(T).

[[20 min. elapsed]]

===============================================================================

2. Correctness story for ACL2: Brief synopsis of the "Structured Theory"
   paper.

Structured Theory Development for a Mechanized Logic.  JAR 2001.
[ACL2 home] -> "Books and Papers" link -> "Foundations".

Problem: Specify correctness for the ACL2 (with extension by definitions,
encapsulate, include-book, and defchoose).

Short answer:

- First-order logic.

- GZ: the "ground-zero" theory present when starting up ACL2.

- An ACL2 session S defines a theory T(S) extending GZ by the axiomatic events
  in the session (defun, encapsulate, and defchoose, but not defthm).

- Then: Every defthm A of an ACL2 session S should be a theorem of T(S):
  T(S) |- A.

Seems clear if prover only proves theorems.  But how about LOCAL?

Example: Certify-book to get session S0, where Main only mentions f, not g.

  (in-package "ACL2")
  (defun f ...)
  (local (defun g ...))
  (local (defthm Lemma1 (... f ... g ...)))
  (local (defthm Lemma2 (... f ... g ...)))
  (defthm Main (... f ...))

Now undo and then include the above book, obtaining session S:

  (in-package "ACL2")
  (defun f ...)
  (defthm Main (... f ...))

How do we know that Main is a theorem of T(S)?

Simplistic Answer: T(S) is a conservative extension of T(S0).  Why?  Because it
is an extension by an explicit definition (of g).

But there are complications!

- How about recursive definitions?  Idea: f(x) = y if there is a legal
  computation of f on x that yields y, where "legal computation" is first-order
  definable, roughly, by:  (forall n) step[n+1] = single-step(step[n]).

- How about encapsulate?  There, we just throw some stuff away, so apply
  Conservativity for Weakening proposition.

- How about defchoose?

- And how does induction fit into this story?

[[30 min. elapsed]]

===============================================================================

3. Review of defchoose.

Also see :doc defchoose and, for an application, :doc defun-sk.

Example.

  (defstub p (x) t)

  (defchoose choose-larger-x-for-p (x) (y)
    (and (< y x) (p x)))

The result (I'll talk us through this):

  ACL2 !>:pf choose-larger-x-for-p
  (IMPLIES (AND (< Y X) (P X)) ; or, think of (exists x (AND (< Y X) (P X)))
           (LET ((X (CHOOSE-LARGER-X-FOR-P Y)))
                (AND (< Y X) (P X))))
  ACL2 !>

In general,

     (defchoose g bound-var (free-var1 ... free-vark)
       body)

introduces a new function symbol (g free-var1 ... free-vark) and axiom:

     (implies body ; or, think of (exists bound-var body)
              (let ((bound-var (g free-var1 ... free-vark)))
                body))

[Use mv-let instead of let if there is more than one bound-var.]

[[40 min. elapsed]]

===============================================================================

4. Conservativity of defchoose: introduction.

Theorem.  Let S be an ACL2 session.  Then the extension of T(S) by a defchoose
axiom is conservative.

Proof sketch.  Start with ACL2 session S and a defchoose event:

  (defchoose g bound-var (free-var1 ... free-vark)
    body)

Goal (by Countable Model Conservativity): Take a given countable M |= T(S) and
expand to M' that satisfies:

     (implies body
              (let ((bound-var (g free-var1 ... free-vark)))
                body))

So we need to interpret g.  "Countable" gives us a map e from Naturals onto
U^M: U^M = {e(0), e(1), e(2), ...}.  Suppose body holds for free-vars and some
bound-var.  Pick the "least" bound-var, i.e. bound-var = e(m) for the least m
that satisfies body.  That will be g^M'(free-var1,...,free-vark).

Problem: ACL2 does proofs by induction.  Why is it OK to do a proof by
induction when the theorem involves g?

Correctness story should require of session S that T(S) is "closed under
induction": Every induction axiom in the language of T(S) is a member of T(S).

- Explicit definitions are OK: eliminate f(args) by the body of f.
- Encapsulate is OK; again, we're just weakening.
- Defchoose: not so clear how to get induction.  See last part of :doc
  conservativity-of-defchoose for an illuminating example.

[[50 min. elapsed]]

===============================================================================

5. Conservativity of defchoose, following :doc conservativity-of-defchoose.

Forcing: Proof technique; Paul Cohen, 1963; used for independence results in
set theory

Model-theoretic forcing: simpler, also well-known to logicians

Theorem.  Let T be an extension of GZ that is closed under induction, and let A
be a defchoose axiom introducing a new function symbol g.  Then the extension
of T by A together with induction axioms over the extension of L(T) by g is a
conservative extension of T.

See :doc conservativity-of-defchoose for details (avoid buggy proof in Appendix
B of the Structured Theory paper).

Exercise: Prove that such T can be extended to T', also closed under induction,
with an axiom saying that e is an enumeration of natp onto the universe.  Hint:
It's very similar to the proof in :doc, which is sketched below.

Now, to sketch the proof....

As above, start with countable M |= T.  This time we arrange that our expansion
of M satisfies not only the defchoose axiom

     (implies body
              (let ((bound-var (g free-var1 ... free-vark)))
                body))

but also induction, even for formulas mentioning g.

Define a partial order, P, as follows (abusing notation a bit):

- Contains all finite functions p "interpreting g": if
  p(x1,...,xk) = y and body(x1,...,xk,z) holds for some z, then
  body(x1,...,xk,y) holds.
- P is ordered by subset: when p1 \subset p2 we say "p2 extends p1".

CRITICAL OBSERVATION: P is definable in M.  (The idea: use association lists to
represent finite functions.)

Forcing concepts (I'll draw pictures during the talk):

  A "dense" (or "cofinal") subset of P is a subset D of P such that for every p
  \in P, there is d \in D such that d extends p.

  A subset G of P is "generic" with respect to a collection Ds of subsets of P,
  also written "G is Ds-generic," if G is closed under subset (if p2 \in G and
  p2 extends p1 then p1 \in G), G is pairwise compatible (the union of any two
  elements of G is in G), and every set in Ds intersects G.

Proposition 1.  For any non-empty partial order P and countable collection Ds
of dense subsets of P, there is a Ds-generic subset of P.  -|

Let Df be the set of dense subsets of P that are first-order definable (with
parameters) in M.  Df is countable (because M is countable, and then Goedel
numbering all terms and then all formulas).

Choose G to be Df-generic, by Proposition 1.

Proposition 5. The following is dense for any finite set S of k-tuples: {p \in
P: S \subset domain(p)}.  -|

Let L be the language of T and let L[g] be the union of L with a set containing
a single new function symbol, g.  Let M[G] be the expansion of M to L[g]
obtained by interpreting g to be the following function g0:

  g0(x1,...,xk) = y iff for some h \in G, h(x1,...,xk) = y.

Pairwise compatibility of G guarantees that g0 is indeed a function; and it is
total by the Proposition just above and the fact that the indicated set is
definable in M.

It is easy to see that M[G] satisfies the above defchoose axiom.

Where are we?

M:    A given countable model satisfying induction.

P:    Partial order of finite partial functions witnessing the defchoose axiom,
      definable in M.

Df:   The collection of all definable subsets of M that are dense in P.

G:    A Df-generic subset of P.

g0:   The union of all functions in G.

M[G]: The expansion of M that interprets g as g0, which as noted above
      satisfies the above defchoose axiom.

So it remains to show that M[G] satisfies all of its induction axioms.

Definition of forcing.  Let phi(x1,...,xk) be a first-order formula in L[g] and
let p \in P.  We define a formula of L, denoted "p ||- phi" ("p forces phi"),
by recursion on phi (in the metatheory) as follows.

  For phi atomic and not mentioning g: p ||- phi is just phi.

  [I'll draw a picture in the talk.]  For phi atomic and mentioning g: p ||-
  phi is the formula asserting that for every g1 in P extending g there is some
  g2 in P extending g1 such that phi holds for g2, where we define g2 to return
  nil on arguments not in its domain.

  p ||- ~phi is: (forall p' \in P extending p) ~(p' ||- phi).

  p ||- phi_1 & phi_2 is: (p ||- phi_1) & (p ||- phi_2).

  p ||- (exists x) phi is: (exists x) (p ||- phi).

We will need the following definition later.

Definition.  p ||-w phi (p "weakly forces" phi) is an abbreviation for
p ||- ~~phi.

The next lemma tells us that the sentences true in M[G] are those that are
forced by an element of G.  It can be proved by induction on the structure of
phi.

Truth Lemma.  A given sentence phi is true in M[G] iff for some p in G,
p ||- phi.  More formally:

Let phi be a formula in L[g], let s be an assignment to the free variables of
phi, and let p be a variable not in the domain of s.  Then the following are
equivalent.

  M[G] |= phi[s]

  For some p0 \in G, M |= (p ||-  phi)[s{p:=p0}].

  For some p0 \in G, M |= (p ||-w phi)[s{p:=p0}].  -|

Here is our main theorem.  Recall that all first-order theories in our
ACL2 context satisfy the e0-induction scheme.

Theorem.  M[G] satisfies e0-induction.

Proof.  We consider an arbitrary instance of e0-induction in L[g].

       theta(y):   (forall y) [(o-p(y) &
                                ((forall x) (op(x) & x o< y) phi(y:=x)))
                            -> phi(y)]
                -> (forall y)  (o-p(y) -> phi(y))

Our goal is to prove that theta holds in M[G].  Suppose we can show that the
following definable set D \subset P is dense in P.

       D = {p \in P: either p ||-w phi(y0) for every y0 satisfying o-p,
                     or else
                     for some y0 such that o-p(y0),
                         p ||- ~phi(y0) and
                         for every o-p y1 such that y1 o< y0,
                             p ||-w phi(y1)}.

Then since G is Df-generic, we may pick p0 \in G such that p0 \in D.  By
definition of D, there are two cases.

Case 1: p0 ||-w phi(y0) for every y0 satisfying o-p.  Then by the Truth Lemma,
M[G] |= phi(y0) for every such y0, and thus M[G] |= theta.

Case 2: Otherwise choose an o-p y0 such that p0 ||- ~phi(y0) and for every o-p
y1 o< y0, p0 ||-w phi(y1).  By the Truth Lemma, since p0 \in G we have:

     (1)   M[G] |= ~phi(y0).
     (2)   For every o-p y1 o< y0, M[G] |= phi(y1).

It follows that the antecedent of theta is false in M[G], as witnessed by y =
y0; thus M[G] |= theta.

So we have shown that M[G] |= theta provided D is dense in P.  To prove the
latter, suppose p0 \in P; we find p1 \in D extending p0.

Case 1: p0 ||-w phi(y0) for every o-p y0.  Then we may take p1 to be p0.

Case 2: Otherwise choose an o-p y0 such that (by definition of ||-w) it fails
that p0 ||- ~~phi(y0).

By definition of ||- there is an extension p1 of p0 such that p1 ||- ~phi(y0).
Without loss of generality, by applying e0-induction (actually the least
ordinal principle) we may choose y0 to be the minimum o-p (in the sense of o<)
for which there is such p1 ||- ~phi(y0).

It suffices (by definition of dense) to show that p1 \in D.  We already have
p1 ||- ~phi(y0).  Let o-p(y1) where y1 o< y0; it suffices to show
p1 ||-w phi(y1), i.e. (by definition of ||-w) p1 ||- ~~phi(y1).

So suppose q extends p1; it suffices to show q ||- ~phi(y1).

But this is indeed the case since otherwise q and y1 would contradict the
minimality of y0!  So we have completed the remaining goal, which was to show
that D is dense in P.  -|