J Strother Moore

March, 2018

`"PLTP"`

). But if you merely want to see how PLTP(A) is coded in
ACL2 and what it can prove, browse here.
- ACL2
source code for PLTP(A). PLTP(A) is implemented as a collection
of
`:PROGRAM`

mode function and macro definitions in ACL2. A brief ``users manual'' is included in the comments at the top of the file. - Standard
Proveall obtained by combining the definitions in the
`[/ DEFS]`

file of 14 September, 1973, with the`[THEOREMS]`

file of that same date, both of which can be found in Listing I. The definitions and theorems here have been transcribed from POP-2's square-bracket list notation to standard Lisp notation. Furthermore, in the case of theorems using PLTP's numeric variables, additional`NUMBERP`

hypotheses have been added as explained below. The output of PLTP(A) on this proveall (in ACL2 Version 8.0) is given in`standard-proveall.output.txt`

. - JACM
Proveall obtained by combining the definitions and theorems from
Appendices A and B of
Proving Theorems about LISP Functions,
R.S. Boyer and J S. Moore.
*Journal of the Association for Computing Machinery*,**22**(1), 1975, pp.129-144. Many of the theorems in the JACM paper have the same formulas as theorems in the standard proveall, but some of the definitions in the JACM paper differ from their counterparts in the standard proveall. The output of PLTP(A) on this proveall (in ACL2 Version 8.0) is given in`jacm-proveall.output.txt`

. - A discussion of the failures in the standard proveall. Two of the theorems in the standard proveall cannot be proved by PLTP(A) and have been commented-out of the scripts. In this discussion we argue that those theorems could not possibly have been proved by the 1973 PLTP either and were in the proveall as challenges to work on. Part of this argument involves proving those theorems in ACL2 itself (not PLTP(A)) and those ACL2 proof scripts are included in this discussion.

The sources for The Edinburgh Pure Lisp Theorem Prover, aka PLTP, as of 14 September, 1973, are in Listing-F and Listing-H, and have been transcribed into text form in listings-f-and-h.txt.

PLTP was written in 1972 and 1973. It was programmed in POP-2, a now dead imperative programming language, and ran on an ICL 4130 processor with 64K bytes of 2 microsecond memory shared between 8 interactive jobs controlled by teletype input from the users. File input and output was by paper tape and lineprinter.

PLTP(A) is an applicative Common Lisp implementation of the PLTP proof
techniques and heuristics as of 14 September, 1973. In fact, PLTP(A) is
coded in ACL2, a descendent of PLTP. PLTP(A) is *not* a faithful
translation of the 1973 code because the known
bugs and oddities of PLTP's original code
have been repaired and some of the algorithms have been implemented
differently while preserving a high-level equivalency to the original code
as discussed further below.

PLTP(A) supports the same basic Pure Lisp as PLTP did, with
primitives `NIL`

, `CAR`

,
`CDR`

, `CONS`

, `EQUAL`

and `IF`

.
(Some versions of PLTP used the symbol `COND`

instead
of `IF`

. PLTP(A) uses `IF`

internally but defines
PLTP's 3-place `COND`

to be `IF`

and can optionally
print `IF`

-expressions as
`COND`

-expressions.)

PLTP(A) supports PLTP's abbreviation conventions of `T`

(denoting `(CONS NIL NIL)`

) and natural numbers (denoting lists
of `NIL`

). PLTP's term representation left abbreviations in
place. E.g., while exploring a term the 1973 theorem prover could have encountered both the number
`0`

and its formal meaning, `NIL`

. This invited
programming errors, as when the 1973 `EVAL`

transformed `(CDR `

*n*`)`

to
*n*-1, for all natural numbers *n*, because we (Boyer and Moore)
forgot to check that *n* is not `0`

, resulting in the
transformation of
`(CDR 0)`

to the non-term `-1`

.

In contrast, PLTP(A) translates abbreviated terms into their formal
correspondents at input so that the core theorem prover only sees formal
terms. This not only makes its term exploration paradigms more regular but
eliminates a variety of bugs and oddities such as PLTP's failure to always
treat `T`

, `1`

, and `(CONS NIL NIL)`

the
same even though all three denote the same term.

PLTP(A) ``untranslates'' before printing so printed terms contain the
familiar abbreviations. This can be controlled by various ``feature switches'' set by
the user. See the definition of `set-feature`

in the PLTP(A) source code.

POP-2's 8-character limitation on names produced many obscure names, e.g.,
`GENRLT1`

. I have changed the names of many implementation functions, e.g.,
`GENRLT1`

is here called `generalizable-subterms`

and
`GENRLIZE`

is now called `generalize`

. When the new
name lacks obvious correspondence to the old name (as
with `GENRLT1`

), a comment in the code gives the old name.

PLTP(A) does not attempt to reproduce exactly PLTP's English explanations of the proof steps. (The main reason for this is that it requires cluttering the code with output routines and for PLTP(A) I chose to make the generation of output as inconspicuous as possible to focus on the clarity of the term manipulation code.) But PLTP(A) takes the same steps (subject to the caveats below), describes each step in English, and prints the same intermediate formulas (again, subject to caveats below).

No longer concerned with speed and memory footprint, I've freely recoded the algorithms to return the ``same'' results, often by more elegant means. But by ``same'' I mean the proof steps taken by PLTP(A) are the same as those by the 1973 PLTP. Every intermediate formula produced by PLTP(A) logically entails the corresponding formula produced by PLTP; in fact, if one allows for variable renaming and abbreviations, corresponding formulas are usually identical. For example, where PLTP might have generated and printed

```
[COND [NUMBERP GENRL1] [EQUAL [LENGTH GENRL1] GENRL1] [CONS NIL NIL]]
```

PLTP(A) might generate and print the variant (due to variable renaming):
```
(COND (NUMBERP GENRL2) (EQUAL (LENGTH GENRL2) GENRL2) (CONS NIL NIL))
```

or one of several equivalent abbreviations for that formula depending on the feature switches:
```
(COND (NUMBERP GENRL2) (EQUAL (LENGTH GENRL2) GENRL2) T)
```

or
```
(IF (NUMBERP GENRL2) (EQUAL (LENGTH GENRL2) GENRL2) T)
```

or
```
(IMPLIES (NUMBERP GENRL2) (EQUAL (LENGTH GENRL2) GENRL2)).
```

See the definition
of `set-feature`

for more explanations.
But I wrote ``usually identical'' above! There are two exceptions to this formula-by-formula identity-modulo-abbreviations: the output of PLTP(A)'s fertilization routine is a generalization of PLTP's, and the output of PLTP(A)'s induction routine is always a list-based inductive formula and never one tailored explicitly to Peano naturals. We discuss these two discrepancies, and some minor ones, below.

The important deviations from the implementation used in PLTP are as follows:

*Term Representation*: As previously noted, the internal form of terms in PLTP(A) is different: abbreviations have been translated away. This changes the case analysis of virtually every routine that explores terms — and eliminates several bugs.*Type information*: PLTP contained code to determine the ``type'' of an expression. In the most general case, it actually created a new function, called the ``type function,'' to describe the output of a given function,*fn*. Its analysis was context free (insensitive both to the hypotheses governing the occurrence of the call of*fn*in question and insensitive to the actuals passed to*fn*) and was based on the type functions of the subfunctions used in the body of*fn*. It generated type functions on the fly during proofs, whenever it needed type information. It then stored the generated type function on the property list of*fn*for future use. Thus, PLTP routines that required type information side-effected the logical world.I chose not to do this in PLTP(A) because it would have meant that every term-processing routine in PLTP(A) that accessed type information would have to return two results, the new term and the (possibly) modified world containing updated type information.

Instead, every time the user introduces a function in PLTP(A), its type function is generated then and there and stored. This makes the prover routines simpler and more valuable as pedagogical illustrations of the term processing techniques.

Because PLTP's type function generator was context free, the type functions generated by PLTP(A) at definition-time are the same as those PLTP generated at proof-time. (The above claim of equivalence is not entirely accurate because of the next deviation!)

*Numeric Variable Symbols*: PLTP, as of 14 September, 1973, provided the notion of numerically typed variable symbols. Recall that in PLTP's logic, the naturals were constructed, Peano-like, from`NIL`

by successively`CONS`

ing`NIL`

onto them. Such lists of`NIL`

were recognized by the predicate`NUMBERP`

.Variable symbols (which at the time we called

*Skolem constants*, from our days of resolution theorem proving) whose names began with the letters between`I`

and`N`

were understood to be numerically valued. Such symbols were recognized by`ISNUMSKO`

in the POP-2 source code for PLTP. Theorems involving such numeric variables were thought of as implicitly carrying additional hypotheses constraining those variables to satisfy PLTP's`NUMBERP`

predicate.PLTP's type inference mechanism assumed any numeric variable was numerically valued. Furthermore, it would infer and store the output type of a defined function from its body.

But this meant that

*if*the PLTP user defined, say, the identity function with a numeric formal, PLTP would have stored the type information that its output was numeric. But PLTP did not provide any enforcement of the implied type system!For example the PLTP user could have written

`[DEFINE [FOO [LAMBDA [N] N]]]`

and it would be typed as numeric. Then, using the stored type function, PLTP could have proved the invalid formula`(NUMBERP (FOO A))`

. But nothing would prevent the PLTP user from then calling`FOO`

on a non-number and the result would be that non-number (since`FOO`

here is the identity function), contrary to the theorem just mentioned.The PLTP user &mdash and there were only two! &mdash could avoid this unsoundness simply by never using numeric variables as formals in definitions. Indeed, no definition in the standard regression suite or the JACM paper used numeric formals, so perhaps we recognized this problem at the time. (It should be understood that PLTP was an experimental research tool for investigations into induction and we tolerated many such implicit preconditions so we could get on with the research.)

Numeric variables were also supported in induction: instead of proving

`[`

with its normal simple induction scheme for lists:*p*N][AND [

PLTP used a scheme tailored to numerically typed variables:*p*NIL] [IMPLIES [*p*N] [*p*[CONS N1 N]]]][AND [

*p*0] [IMPLIES [*p*N] [*p*[ADD1 N]]]].*aka*[AND [

*p*NIL] [IMPLIES [*p*N] [*p*[CONS NIL N]]]].In creating PLTP(A), I therefore had the choice of either (a) enforcing a type system or (b) removing the whole notion of numeric variables. I chose the latter. PLTP(A) does

*not*treat variables starting with the letters`I`

through`N`

any differently than other variables and it never does a numeric induction.Theorems in the two PLTP(A) provealls that contained numeric variables have been modified here to carry explicit

`NUMBERP`

hypotheses.*Evaluation and Induction*: PLTP's`EVAL`

used side effects and about a dozen global variables to accumulate information to be used by the induction heuristic, including a so-called ``bomb list'' for each unexpanded recursive function call recording ``why'' that call could not be expanded. When PLTP needed to symbolically evaluate a term, it called`EVAL`

and ignored the side-effected globals; when PLTP needed to induct, it called`EVAL`

and ignored the result but inspected the final value of the global`ANALYSIS`

which contained the bomb lists.As with the code for type inference, mimicking this behavior in ACL2's functional setting would have produced hard-to-comprehend code because PLTP(A)'s functional

`eval`

would have to return both the simplified term and a modified list of global variables.Instead, I split PLTP's

`EVAL`

into two ACL2 functions, one for computing the result and another for computing the side-effects. The PLTP(A)`eval`

does not accumulate anything on the side, making it easier to see how PLTP did symbolic evaluation.Then, to implement induction in PLTP(A), I re-construct the value of PLTP's

`ANALYSIS`

, with a separate function, called`induction-analysis`

.`Induction-analysis`

makes it clear what information induction uses.While the 1973 PLTP implementation makes it obvious that induction is driven by the ``failures'' of the symbolic evaluator, PLTP(A)'s implementation makes it easier to understand how symbolic evaluation works and what information induction uses.

*Fertilization*: ``Fertilization'' was introduced in PLTP and describes the idea of using an equality induction hypothesis by selective substitution*and then throwing the hypothesis away*. However, in an effort to have our cake and eat it too, PLTP did not actually throw the hypothesis away: it merely hid it. It hid it by introducing a new function symbol of no arguments that was ``defined'' to be the negation of the equality and then disjoining a call of that new function to the formula resulting from throwing away the equality. This meant the output of the fertilization routine was equivalent to the input, which in turn meant that fertilization could freely recur root-and-branch through a formula fertilizing at will.The problem with this approach is that it is that these new functions are not proper definitions: their bodies involve variables not among their formals. And if one extended the idea by adding those variables as formals to the new symbols, they would block the subsequent inductions that PLTP did.

PLTP(A) avoids the on-the-fly introduction of dubious definitions by deleting the equality after using it, which generalizes the formula. This is what NQTHM and ACL2 do. But this means that PLTP(A) cannot recur freely through the formula looking for equalities to use. It only fertilizes conjuncts of the top-level goal.

No published proof by PLTP required any other form of fertilization.

*New Variable Names*: As noted in bugs and oddities PLTP made certain implicit assumptions on user input that made it easier to generate ``new'' variable names by just taking on successively higher suffixes. While these implicit assumptions are never violated in the known regression runs, I chose to eliminate that implicit assumption in PLTP(A) and to generate variable names that are guaranteed to be new. This means PLTP(A) sometimes generates different variable names than PLTP did. For example, in the proof of`[T 1 4]`

(page 10 of Listing J) we see PLTP generating the name`GENRL1`

the first time it generalizes and then generating`GENRL2`

and`GENRL3`

the second time. But when PLTP(A) is given the same conjecture to prove, it generates`GENRL1`

the first time it generalizes and`GENRL1`

and`GENRL2`

the second time*because the previously generated*. While this discrepancy could be avoided, doing so needlessly complicates the name generation algorithm because it would require passing around the history of the proof attempt, not just the current goal.`GENRL1`

no longer appears in the conjecture and so can be reused*Maximal Common Subterms*: PLTP used fairly messy code to find the largest common subterms on opposites sides of equalities and implications; PLTP(A), on the other hand, just computes all the subterms of the two sides, intersects them, and then throws out any subterm contained in another. Whether this algorithm returns the list in the same order — or even whether the original algorithm really returned the alleged list of ``largest common subterms'' — is irrelevant here: PLTP(A)'s algorithm returns a list of the largest common subterms, its correctness is obvious, speed is irrelevant given the small problem size and today's resources, and the order of the subterms shouldn't matter because all of them are then generalized to new variables. If the subterms are listed in a different order then that will affect the names of the new variables assigned to each, giving rise to the ``variable renaming'' caveat above. But that shouldn't affect the proofs.*The ``Waterfall''*: I refactored PLTP's main proof routine,`PROVE1`

to better expose what became*the waterfall*in our subsequent provers.PLTP's

`PROVE1`

explicitly called`EVALUATE`

(which is just`EVAL`

after initializing the globals used to accumulate induction analysis),`REWRITE`

, and`REDUCE`

successively and looped until the result stablized. While this process was called ``normalation'' in the 14 September, 1973, PLTP, in subsequent versions of PLTP and our other provers the comparable process was called*simplification*.Once the formula was simplified, PLTP called a routine that Boyer and I facetiously named

`ARTIFINTEL`

(for ``artificial intelligence'' because according to a comment in the POP-2 code it was`THE SMARTEST PROGRAM IN THE THEOREM PROVER`

because it only works on the first conjunct).`ARTIFINTEL`

fertilized, generalized, and inducted on the first conjunct in the formula it was given. PLTP did not support elimination of destructors or of irrelevance, which were introduced later.In PLTP(A), I wrap the

`EVALUATE`

/`REWRITE`

/`REDUCE`

loop up into a single function named`simplify`

, and I just open-code`ARTIFINTEL`

so`fertilize`

,`generalize`

, and`induct`

are called explicitly by`prove1`

.PLTP(A) preserves PLTP's use of a single formula to represent the problem. For example, if simplification splits the problem into two subgoals, the result is an

`IF`

-expression representing their conjunction and fertilize, etc., just work on one conjunct at a time. It wasn't until NQTHM that Boyer and I started splitting out goals as clauses collected and implicitly conjoined in the*pool*at the bottom of the waterfall.

The small sizes of the definitions and conjectures presented to PLTP was wonderfully freeing given today's resources!

The changes in implementation also reflect lessons learned from 45 years of coding provers together with the much more robust ACL2 programming environment I enjoy today.

`books/projects/pltpa/`

subdirectory of
your ACL2 directory, and finally we tell you how to get ACL2 if you don't
have it.
To play with PLTP(A), do

(include-book "projects/pltpa/pltpa" :dir :system)and then select the PLTP symbol package and initialize the PLTP(A) logical world with

(pltpa)Then you can define PLTP functions and submit challenge theorems as with

(define (append ; define a PLTP function (lambda (x y) (if x (cons (car x) (append (cdr x) y)) y)))) (prove assoc-of-append ; name is irrelevant except for io; theorems are not ; stored in the PLTP world (equal (append (append a b) c) (append a (append b c))))For a complete list of available commands, e.g., to inspect PLTP's database, or undo a definition, see the comment at the top of the PLTP(A) source file, which may be found in

`"books/projects/pltpa/pltpa.lisp"`

of your local ACL2
installation.
You can find all of the PLTP(A) files on the `books/projects/pltpa/`

subdirectory of your ACL2 directory.

Among the interesting files after certification of the books there you will find:

`pltpa.lisp`

— the source file for PLTP(A)`pltpa.acl2`

— the definition of the PLTP symbol package`standard-proveall.lsp`

— the PLTP(A) commands in its standard proveall regression suite`chk-standard-proveall.lisp`

— an ACL2 book that is certifiable iff PLTP(A) passes the standard proveall`chk-standard-proveall.cert.out`

— PLTP(A)'s output on the standard proveall (see note below)`jacm-proveall.lsp`

— the PLTP(A) commands in its JACM proveall regression suite`chk-jacm-proveall.lisp`

— an ACL2 book that is certifiable iff PLTP(A) passes the JACM proveall`chk-jacm-proveall.cert.out`

— PLTP(A)'s output on the JACM proveall (see note below)

`.cert.out`

files above: Ordinarily, an
ACL2 book can be certified provided the ACL2 prover can successfully prove
every theorem in it. So how do we make a book that ACL2 can certify iff
PLTP(A) can prove the theorems in its own regression? That involves a
certain amount of infrastructure. The two `.cert.out`

files above
contain the output of ACL2's certification process. That output starts and
ends with ``noise'' related to ACL2's certification infrastructure.
PLTP(A)'s output (e.g., on its standard proveall) may be found between the
block of text that starts with
:STANDARD Proveall by PLTP(A)and the block that starts with

Successful :STANDARD Proveall by PLTP(A) completed onOf course, if you have a running PLTP(A) you can see the output of the provealls by evaluating

`(proveall :standard)`

and ```
(proveall
:jacm)
```

.
If you do not already have ACL2 then you must obtain and install it in order
to run PLTP(A). This also involves installing one of several available
Common Lisp implementations. Follow the installation instructions under
the *Obtaining, Installing, and License* link on the
ACL2 Home Page.

The *Easy Install for Unix-like Systems* instructions will fetch not
only the ACL2 system source code but also all of the Community Books. It is
easier to fetch them all than to fetch just what you need for PLTP(A). But
because PLTP(A) does not depend on any other books, you can skip the general
certification instructions (e.g., Step 5 of the *Easy Install for Unix-like
Systems*) if all you want to do is play with PLTP(A). Instead, just build your
`saved_acl2`

runnable image and then certify the PLTP(A) books by
connecting to `books/projects/pltpa`

in your local ACL2
installation and running:

../../build/cert.pl --acl2 ../../../saved_acl2 *.lisp