Bugs and Oddities in the 14 September, 1973, PLTP Source Code


Robert S. Boyer
J Strother Moore
August, 2017

Acknowledgement

We deeply appreciate the help we received from Grant Olney Passmore in assembling this website. He should by rights be a co-author of this ``bug report.'' But because the source being critiqued is entirely that of Boyer and Moore, the document is easier to write in the first person!

Background

Re-reading our source code with the benefit of 45 years of experience in coding theorem provers there are many things we'd do differently if we were doing it again! The PLTP code of 14 September, 1973, has a few bugs that could have caused it to ``prove'' non-theorems even when used in ways we imagined it should be used. In addition, it failed to check or enforce some implicit preconditions that, if violated, could cause other problems. And we find some of the design decisions and stylistic conventions perplexing and needlessly messy by our coding standards today.

These criticisms are somewhat mitigated by three observations.

Having said all that, we now critique the 14 September, 1973, version of PLTP. The definitive sources for that version of the code is provided in Listing-F and Listing-H. However, because those are hard-to-search OCR pdf files, we have transcribed the contents of those listings into a simple text file henceforth referred to as listings-f-and-h. We assume you can use your favorite text editor to search the listings and/or find the lines mentioned by number.

A Perplexing Design Choice that Permeates the Code

In PLTP's logical theory there were five primitives functions symbols, CAR, CDR, CONS, EQUAL, and COND, and there was one constant, NIL.

T abbreviated (CONS NIL NIL). 0 abbreviated NIL, 1 abbreviated (CONS NIL 0), 2 abbreviated (CONS NIL 1), etc.

PLTP did not check the well-formedness of input terms and did not expand abbreviations away before beginning the theorem proving process. Thus, for example, the code might encounter T, 1, (CONS NIL NIL), (CONS 0 NIL), etc., all of which are different concrete representations of the same logical term.

This convention of allowing multiple concrete representations of the same abstract term invites coding mistakes and several of the bugs below are due to this.

FUNCTION EVAL (line 128)

In the code for handling X = "CDR", where we say THEN Y - 1; (line 170) we are trying to implement such transformations as the evaluation of (CDR 3) to 2, by replacing (CDR y) by y-1 when y is a number. But we don't know that y is non-0. Thus, the definition of EVAL will evaluate (CDR 0) to -1, which is not actually a term. We should have considered the possibility that y is 0 here and returned NIL since 0 and NIL represent the same term and we see, from the Y=NIL case, that (CDR NIL) is NIL.

An oddity (but not bug) in EVAL is the code:


  RETURN;
  STEPCNT + 1 -> STEPCNT;
  GOTO CONDRULES;
Since the RETURN exits the function, the two lines after it are dead code.

FUNCTION ISREALLINK (line 424)

This is not so much a bug as an oddity. In ISREALLINK we use ISINTEGER to recognize numbers representing lists of NILs. But everywhere else (nine times) we use the more general ISNUMBER to recognize such terms. Note: There are ten occurrences of ISNUMBER in the code, but the occurrence in GENSYM is not concerned with a term.

It would have been nice to see the same idiom used throughout the code.

By the way, the name ``ISREALLINK'' stems from the POP-2 convention of calling a cons-pair a link. In the parlance of ACL2, this function is meant to recognize explicit values, variable-free terms using only constructors (i.e., CONS here) and constants (NIL here).

FUNCTION ISNUMSKO (line 438) and All Its Callers

All uses of ISNUMSKO are suspect because they implement the assumption that ``numeric variables'' may only be instantiated with numerically valued terms, but the logical language is untyped.

ISNUMSKO recognizes POP-2 words beginning which a letter between I and N. These are considered ``numeric'' variables. For example, if the system is trying to prove (p N) then it might do an induction suitable for natural numbers: (AND (p 0) (IMPLIES (p N) (p (ADD1 N)))). This would be ok if one thought of oneself as proving (IMPLIES (NUMBERP N) (p N)) which, indeed, we imagined.

However, because the language is untyped one can still imagine problems. Suppose one defined the successor function, (DEFINE (SUCC (N) (CONS NIL N))) with the numeric formal parameter N. Then SUCC would be considered always numerically valued and that fact would be stored on the property list of SUCC. It would have been possible to then prove (NUMBERP (SUCC (CONS T T))) which is unfortunate since it evaluates to NIL!

This contradiction would not be possible if one prohibited function definitions from having numeric formal parameters like N. Indeed, no definition in the standard proveall or the JACM paper used numeric formal parameters. Furthermore, PLTP never instantiated previously proved theorems. So we believe that from a practical perspective, this bug never manifested itself.

But we consider the whole idea of numeric variables suspect without a proper type system!

FUNCTION ISCONS (line 446)

This function is supposed to return 1 (POP-2's true truthvalue) if the term is a CONS, i.e., has CONS as its top-level function symbol, and 0 (false) otherwise. But this code treats all numbers as representing CONSes. But 0 does not represent a CONS.

FUNCTION APPSUB1 (line 480)

APPSUB is the function that that applies a substitution to a term and APPSUB1 does all the work. But this function does not just visit the subterms, it visits and possibly replaces the function symbols too.

This is not necessarily a problem and, indeed, problems can be avoided if (a) the set of symbols allowed as function symbols in the theory were disjoint from the set of symbols used as variables and (b) well-formedness were checked when new definitions and theorems are submitted. But PLTP does not make such checks.

Suppose the user defined a predicate named X, and imagine that some conjecture to be proved by induction involved both the function X and the variable X. The base case is formed by using APPSUB to replace X by NIL in the conjecture to be proved, and the induction conclusion is formed by replacing X by (CONS X1 X). But if the conjecture were (X X) this would create the nonsensical non-formula:


(AND (NIL NIL)
     (IMPLIES (X X) ((CONS X1 X) (CONS X1 X)))).

Fortunately, PLTP's provealls never introduced a function symbol whose name was ever used as a variable symbol. So this bug never manifested itself.

Furthermore, the induction mechanism actually exploits APPSUB's ability to replace function symbols. The FUNCTION PICKINDVARS uses APPSUB to rename all occurrences of CAR and CDR to new symbols to avoid confusing CAR- and CDR-terms in the input formula with CAR- and CDR-terms introduced by evaluation.

FUNCTION TYPEEXP1 (line 652)

This function takes a term, TERM, and returns an expression in the variable X that recognizes the output of the term. For example, if TERM is (CONS 1 V) the returned expression is a term equivalent to (AND X (EQUAL (CAR X) 1)). When TERM is the call of some function, fn, for which no type function has been stored, TYPEEXP1 recursively computes the type expression for the body of fn, stores an appropriate type function with name fnTYPE, and returns (fnTYPE X).

There are two problems.

The case ELSEIF FUNSYM = "CONS" dealing with TERM being a CONS-expression is incorrect when it checks NUMERIC(TERM) (line 665). It should also check that TERM is an explicit value, i.e., (ISREALLINK TERM). Otherwise, TERM might be a term like (CONS NIL (TIMES A B)) which is a numerically valued term beginning with CONS. Since TYPEEXP1 generates the definition of a type function whose formal parameter is X, this bug could result in a type function whose definition contains free variables, e.g., A and B in the example just given.

The second problem is that TYPEEXP1 does not check that the newly introduced fnTYPE is a new function symbol. It will inadvertently redefine that function (line 696) if the user has previously defined a function of that name. An obvious correction is to enforce a pre-condition that no user function name ends with TYPE. None of the definitions in the provealls violate this pre-condition.

FUNCTION GENSYM (line 779)

GENSYM is used to generate ``new'' variable symbols. It does so by taking an existing symbol (TOPWORD) and tacking on an integer ``suffix,'' as when it transforms the root (TOPWORD) symbol X to X1. For any given root symbol, the first suffix it uses in a given proof is 1, the next time in that proof it uses 2, etc.

The problem with this is that it assumes that the user did not included any suffixed variables in the original conjecture. That is, the code is written as though the only suffixed variables are those introduced by GENSYM. If the user submitted a conjecture containing the variable symbol X1 and GENSYM is called with root symbol X, it will generate X1 but that would not be a new variable.

No variable used in our provealls is suffixed. So this potential problem never manifested itself. The problem could be fixed by simply enforcing a pre-condition that no user-submitted term contained suffixed variables.

The Value of RIDCARCDR (line 1864)

The replacement of CAR and CDR mentioned in the above discussion of APPSUB1 is done by applying the substitution RIDCARCDR whose value is set by this assignment:

CONSPAIR("CAR","CARARG")::(CONSPAIR("CDR","CDRARG")::NIL)
 -> RIDCARCDR;
So CARARG and CDRARG are assumed to be function symbols of arity 1 that do not occur in any user formula. This is another implicit pre-condition that ought to be enforced: CARARG and CDRARG are never defined or used by user. While no change was caused by this, the following is worth noting: in the EQUAL case of FUNCTION EVAL there is an odd sequence: because the RETURN rips control out. Thus, the last two statements above are dead code.