Search-engine friendly clone of the ACL2 documentation.

Guarantees-of-the-top-level-loop

Guarantees provided by top-level evaluation

We often refer to the ``top-level loop,'' or just ``the loop'' when the context is understood. The loop is the interactive read-eval-print loop with which the user interacts to issue commands, query the ACL2 logical world, test functions, conjectures, and systems, etc. This documentation topic mainly addresses the logical guarantees of the loop. For practical advice about interacting with and configuring the loop see ld. Ld is the ACL2 implementation of the loop (as a macro that calls a state-using function with many parameters which may be set by the user). Ld is called initially when ACL2 is fired up.

But here we are concerned with what it means, logically, when a term tm evaluates without error to a value v in the loop.

We do not formalize macroexpansion here, but rather, we deal with so-called “translated” terms, where macros and constants have been expanded away. We also do not discuss here how ACL2 evaluates expressions; see evaluation for such discussion. (Those interested in implementation issues may also read the long comment in ACL2 source file interface-raw.lisp that is labeled “Essay on Evaluation in ACL2”.)

See soundness for a more general discussion of soundness-related issues (including, for example, interrupts and trust tags).

A Strawman Proposal on the Meaning of Top-Level Evaluation

One might like to think that for any term, tm, a non-erroneous interaction like

ACL2 !>tm 
v

means that the formula

(equal tm 'v)

is a theorem and can be proved by the prover.

For example, let sq be a :logic mode function that squares its argument. (For the sake of a later example, let us also make the guard of sq be t and verify the guard by fixing (with rfix) the argument to be a rational, but those aspects of sq are irrelevant in our immediate use of it below.)

(defun sq (x) 
  (declare (xargs :guard t :verify-guards t)) 
  (let ((x (rfix x))) 
    (* x x)))

then the evaluation

ACL2 !>(sq 3) 
9

suggests that (equal (sq 3) 9) is a theorem and that is indeed provable by the ACL2 theorem prover.

But the situation is quite subtle and deserves further discussion. That is what this topic is about.

Spoiler Alert: We will not give logical semantics to every non-erroneous evaluation in this discussion. We only hint at how it can be done for many terms. Furthermore, even for those terms whose evaluations do correspond to theorems, the logical theory in which those theorems can be proved is not the theory supported by the ACL2 prover but an extension of it. We ultimately describe below the so-called evaluation theory which is an extension of ACL2's proof theory. So this documentation topic discusses two logical theories. Because any theory in which proofs are done might be called the ``proof theory'' and in this section we need to clearly distinguish between the two theories being discussed, we'll refer to the theory supported by the ACL2 theorem prover as the prover's theory.

Problems Raised by :Program Mode Functions

Suppose prog-sq is defined as a :program mode function that squares its argument. Then the top-level evaluation

ACL2 !>(prog-sq 3) 
9

clearly should not suggest

(equal (prog-sq 3) 9)

is a theorem. :Program mode functions are not defined in the prover's theory. (:Program mode functions may not terminate and any naive attempt to add their definitions to the prover's theory as axioms could result in unsoundness.) The top-level loop has the ability to evaluate :program mode expressions but it is ``magic,'' unsupported by proof. (Non-termination by the evaluator can't manifest unsoundness: such computations just never produce an answer.) The evaluation of :program mode expressions is supported in the loop primarily as a way to execute commands like defun and defthm. But it is also a convenience that allows ACL2 to be used as a programming language (e.g., to prototype a system one might eventually admit to the logic and prove theorems about).

Therefore, when attempting to define the logical meaning of top-level evaluations we might restrict our attention to :logic mode terms only. But this is not sufficient for several reasons.

:Program mode functions raise another problem discussed below in connection with apply$.

Problems Raised by Single-Threaded Objects

Evaluation of terms involving single-threaded objects or ``stobjs'' raise problems, even if the terms involved are in :logic mode. Here is a top-level evaluation sequence

ACL2 !>(defstobj st fld) 
... 
ST 
ACL2 !>(update-fld 3 st) 
<st> 
ACL2 !>(update-fld (sq (fld st)) st) 
<st> 
ACL2 !>(fld st) 
9

but the strawman conjecture from the last evaluation, (equal (fld st) 9), is not a theorem. However, a related formula is a theorem and can be proved by the prover.

(implies (stp st) 
         (let* ((st (update-fld 3 st)) 
                (st (update-fld (sq (fld st)) st))) 
           (equal (fld st) 3)))

This reminds us that the top-level loop tracks changes to each ``live'' stobj and records them in the ACL2 state. Those changes must be reflected in any suggested theorems.

Problems Raised by Apply$

Apply$ calls apply$-userfn to handle the application of user-defined function symbols. Apply$-userfn is undefined but might be constrained by warrant hypotheses (see defwarrant). Recall that warrants link the functions approved by defwarrant to their names and constrain (apply$ 'fn (list a1 ... an)) to be (fn a1 ... an) under certain tameness conditions. As noted at the bottom of the discussion of defwarrant, there is a model of the prover's theory in which all warrants issued by defwarrant are true. The top-level loop assumes that all those warrants are true. But the prover's theory does not, because explicit warrant hypotheses are essential to avoiding the so-called ``local problem'' (see Lesson 12 of introduction-to-apply$).

One might wonder why we have warrants at all, aside from the local problem. The key reason is that inconsistency can result from non-terminating recursion if we had the following axiom schema

Unsound Axiom Schema: 
(equal (apply$ 'f (list a1 ... an)) (f a1 ... an))

for every defined function f. See the russell example in introduction-to-apply$. So warrants and the restrictions enforced by defwarrant keep the prover's theory consistent. One might then ask ``how do we get away with letting the top-level loop apply$ all badged :program mode functions?'' The answer is simple: :program mode functions are not axiomatized so we don't have a theory to worry about!

To illustrate the difference between the behavior of the top-level loop and the prover, suppose the function sq, defined above, has been warranted.

(defwarrant sq)

and then consider the top-level evaluation

ACL2 !>(apply$ 'sq '(3)) 
9

This is possible because the warrant for sq is assumed true in the top-level loop and that warrant tells the loop that (apply$ 'sq '(3)) = (sq 3).

However, (equal (apply$ 'sq '(3)) '9) is not a theorem in the prover's theory. Instead, this is a theorem

(implies (warrant sq) 
         (equal (apply$ 'sq '(3)) '9))

where (warrant sq) is just a convenient abbreviation for (force (apply$-warrant-sq)).

Thus, to extend the strawman conjecture to functions in which apply$ is ancestral would require tracking the warrants relevant to the execution path (or punting and collecting all warrants) and amending the conjecture to add those warrants as hypotheses to the equality.

But apply$ introduces another source of complexity: mixed-mode functions (see mixed-mode-functions). Recall the :program mode function prog-sq which squares its argument. We can assign a badge to prog-sq and then define a :logic mode function that apply$s it!

(defbadge prog-sq) 
 
(defun logic-sq (x) 
  (declare (xargs :mode :logic)) 
  (apply$ 'prog-sq (list x)))

Since the top-level loop can evaluate applications of :program mode symbols we see the following evaluation.

ACL2 !>(logic-sq 3) 
9

Note that (logic-sq 3) is a :logic mode term and logic-sq can even be warranted with defwarrant. But neither

(equal (logic-sq 3) '9)

nor

(implies (warrant logic-sq) (equal (logic-sq 3) '9))

is a theorem in the prover's theory because prog-sq is undefined in that theory. For what it is worth, the prover can prove

(implies (warrant logic-sq) (equal (logic-sq 3) (apply$ 'prog-sq '(3))))

This example contradicts the suggestion, just above, that if we properly considered warrants we could give logical meaning to evaluations of :logic mode terms involving apply$.

Problems Raised by Constrained Functions

Constrained functions raise problems if they have been given attachments (see defattach). The top-level loop uses attachments to compute values consistent with the constraints, but the prover's theory does not. Attachments are useful for building and testing instances of models consistent with the constraints.

For example, let (nonneg-rat x) be constrained to return some nonnegative rational.

(encapsulate (((nonneg-rat *) => *)) 
  (local (defun nonneg-rat (x) 
           (declare (ignore x)) 
           0)) 
  (defthm nonneg-rat-constraint 
    (and (rationalp (nonneg-rat x)) 
         (<= 0 (nonneg-rat x)))))

It is impossible to evaluate (nonneg-rat 3) in the top-level loop. But we can attach another function to it provided we can prove that function always returns a nonnegative rational. Since sq, as defined above to rfix its argument as a rational, satisfies that constraint, we can attach sq to nonneg-rat.

(defattach nonneg-rat sq)

and then we can ``evaluate'' calls of nonneg-rat in the loop, getting results that are consistent with its constraint (but overly specific!).

ACL2 !>(nonneg-rat 3) 
9

But of course the strawman conjecture (equal (nonneg-rat 3) '9) is not a theorem of the prover's theory.

The Prover's Theory

The ACL2 logic formalizes an applicative (functional and side-effect free) extension of a subset of Common Lisp (see common-lisp). It is described in chapter 6 of Computer-Aided Reasoning: An Approach, by Kaufmann, Manolios, and Moore, as an extension of quantifier-free first-order logic with equality (section 6.1). The syntax is that of translated terms though the user is allowed to use a more flexible syntax which can be extended by defining constant symbols and macros (see defconst and defmacro). To see the formal translation of such a term use the command :trans. In section 6.2 of the book, axioms are added to characterize the basic data objects of numbers (integers, rationals, and complex rationals), characters, strings, symbols, and ordered pairs. In section 6.3, the ordinals up to epsilon-0 are constructed from ordered pairs and integers; a well-founded ``less than'' relation is also defined. But the book's construction of the ordinals is now obsolete. A new representation was implemented after the book was published. See ordinals. In Section 6.4, a conservative Principle of Definition (see defun and defuns or mutual-recursion) is introduced, allowing the addition of axioms defining new function symbols. In Section 6.5, a Principle of Induction up to epsilon-0 is described. Induction and recursion are duals: every admissible recursive function suggests an induction and vice versa and this duality is used by the ACL2 prover to select an often appropriate induction scheme for a conjecture. See induction and hints for ways to influence this selection. Finally, in section 6.6, the book introduces encapsulate to introduce constrained functions and functional instantiation (a derived rule of inference akin to second order instantiation), and a variety of other ways of conservatively adding new function symbols.

The ACl2 prover proves theorems in the extension of the above theory obtained by adding the axioms introduced by events successfully carried out in the user's session. The most common such axiom-adding events are defun, defchoose, and encapsulate, which all conservatively extend the theory by the addition of axioms about new function symbols. But defun is just a special case of defuns which adds a mutually recursive clique of new names (see also mutual-recursion). Defstobj and defabsstobj add recognizers, constructors, and accessors for single-threaded objects (aka ``stobjs'') but logically just use defun to add new functions and then syntactically restrict their use in code. Finally, we provide a means of adding an arbitrary formula as an axiom, defaxiom, but strongly discourage its use. There are many other events that add axioms, e.g., defun-nx and defun-sk but these are defined as macros that expand into the primitives just listed. In addition, there is a facility for including files of previously admitted events, include-book.

Note that an event of the form (skip-proofs EV) does not introduce any axioms other that those added by the event, EV. However, this skip-proofs event does assume that all proof obligations introduced by EV are indeed provable in the prover's theory.

Given a user's session, we call the above described theory the prover's theory.

Now recall the strawman proposal for what evaluation in the top-level loop means, i.e., that the evaluation

ACL2 !>tm 
v

might mean that the formula

(equal tm 'v)

is a theorem. We have illustrated with :program mode functions, uses of apply$, and of constrained functions that this strawman conjecture is not necessarily a theorem of the prover's theory!

However, there is a theory in which many of these conjectures are theorems.

The Evaluation Theory

The evaluation theory is obtained from the prover's theory as follows.

Instead of treating stobj names as variables, the evaluation theory treats them as abbreviations for the ``current value'' of the stobj, specifically, the constant obtained by composing the sequence of all updates to the stobj's fields carried out so far in the top-level loop. Thus, for example, in the evaluation theories created by this sequence of top-level evaluations
```
ACL2 !>(defstobj st fld)                           ; [1] 
ST 
ACL2 !>(update-fld 3 st)                           ; [2] 
<st> 
ACL2 !>(update-fld (* (fld st) (fld st)) st)       ; [3] 
<st> 
```
st is an abbreviation for (NIL) after evaluation [1], an abbreviation for (3) after evaluation [2], and an abbreviation for (9) after evaluation [3].
Every warrant created by defwarrant is assumed true as an axiom.
For every (defattach f g) event in the prover's theory the axiom (i.e., constraints) on f is replaced by the axiom (equal (f ...) (g ...)) in the evaluation theory. (Of course this replacement also takes place when using the more general form for defattach, (defattach ... (f g ...) ...).)

Given the restrictions enforced by defun, defstobj, defwarrant, apply$, encapsulate, and defattach, the evaluation theory is consistent if the prover's theory is consistent and free of defaxiom events. (Those interested in reading a proof of this claim are welcome to read the “Essay on Defattach” in ACL2 source file other-events.lisp.)

Furthermore, the top-level evaluation

ACL2 !>tm 
v

means that

(equal tm 'v)

is a theorem in the evaluation theory provided

tm is a :logic mode term, and
every symbol that reaches the first argument of any apply$-userfn in the execution of tm is in :logic mode.

The ACL2 prover cannot generally prove these theorems, since it operates in the prover's theory. However, as noted, there are often ways to encode the conjectures into ACL2 formulas that are provable — in particular, by adding suitable warrant hypotheses — though the ACL2 system does not provide tools for doing so.

Some Examples of Top-Level Evaluations

In this section we illustrate some consequences of the guarantee that for any term, tm, satisfying certain restrictions, if tm evaluates to a value v, then the term

(equal tm 'v)

is a theorem of the evaluation theory.

Each command below is enumerated and after the entire sequence we make observations about the results and implications. It might also help to read the section Top-Level Evaluation of Apply$ in the documentation for apply$, which describes how apply$ is implemented in the top-level loop.

ACL2 !>(include-book "projects/apply/top" :dir :system)
...
ACL2 !>(defun my-cons (x y)                                  ; [1]
          (declare (xargs :mode :program))
          (cons x y))
...
ACL2 !>(defbadge my-cons)                                    ; [2]

MY-CONS now has the badge (APPLY$-BADGE 2 1 . T) but has
no warrant.

 T
ACL2 !>(apply$ 'my-cons '(3 4))                              ; [3]
(3 . 4)
ACL2 !>(thm (equal (my-cons 3 4) '(3 . 4)))                  ; [4]
...
******** FAILED ********
ACL2 !>(verify-termination my-cons)                          ; [5]
...
ACL2 !>(apply$ 'my-cons '(3 4))                              ; [6]


ACL2 Error in TOP-LEVEL:  The value of APPLY$-USERFN is not
 specified on MY-CONS because MY-CONS has not been warranted.

ACL2 !>(defwarrant my-cons)                                  ; [7]

MY-CONS is now warranted by APPLY$-WARRANT-MY-CONS, with
badge (APPLY$-BADGE 2 1 . T).

ACL2 !>(apply$ 'my-cons '(3 4))                              ; [8]
(3 . 4)

ACL2 !>ACL2 !>(thm (equal (my-cons 3 4) '(3 . 4)))           ; [9]
...
******** FAILED ********
ACL2 !>(thm                                                  ;[10]
        (implies (warrant my-cons)
                 (equal (my-cons 3 4) '(3 . 4))))
...
Q.E.D.
...
Proof succeeded.

In events [1] and [2] my-cons is introduced as a badged :program mode function. Command [3] shows that we can apply$ 'my-cons in the top-level loop and get the ``expected'' value. This is ``magic'' on several levels! There is no axiom in the evaluation theory defining my-cons and there is no warrant connecting the symbol 'my-cons to the function my-cons. The failure of the first thm command, [4], shows that the prover's theory knows nothing of the behavior of apply$ on 'my-cons. After converting my-cons to :logic mode with verify-termination in [5] we can no longer evaluate the application of my-cons at the top-level, as shown by [6]! This is surprising because one might expect ACL2 to be able to do more with a :logic mode function than its :program mode counterpart. But in the evolving evaluation theory as of [6], my-cons is a :logic mode function with no warrant and so logically speaking (apply$ 'my-cons ...) is undefined. Computing a value for it at [6] would violate the guarantee that values of :logic mode terms can be proved correct in the evaluation theory — i.e., that the computed value can be derived logically. Such a derivation is impossible without a warrant. However, after defwarrant, in [7], confirms that it is sound to issue a warrant (linking the symbol 'my-cons to the function my-cons via apply$), the evaluation of the 'my-cons term in [8] succeeds. The defwarrant soundly extends the proof theory by adding a definition for the warrant function for my-cons and extends the evaluation theory by assuming the warrant for my-cons is true. This latter extension is also sound. However, [9] shows that we still cannot prove (in the prover's theory) that the evaluation of the apply$ term is correct, because we did not provide the warrant as a hypothesis. However, if we amend the conjecture to include the warrant, as in [10], the proof (in the prover's theory) succeeds. While all warrants are assumed true in the evaluation theory, they must be made explicit as hypotheses for proofs in the prover's theory. This is the mechanism whereby ACL2 can avoid the ``local problem'' illustrated in Lesson 12 of introduction-to-apply$ and discussed more thoroughly in ``Limited Second-Order Functionality in a First-Order Setting'' by Matt Kaufmann and J Strother Moore.