Symbolic Simulation: An ACL2 Approach

J Strother Moore

FMCAD '98

The full paper refers to an ACL2 script containing an ACL2 model of a ``small machine.''

The ``small machine'' model discussed in this paper was first developed in 1991 for a course on how to use the Boyer-Moore theorem prover, Nqthm, to model microprocessors. The Nqthm model is a distillation of the microprocessor modeling approach developed by the Nqthm community at the University of Texas at Austin and Computational Logic, Inc., in the 1980s. The small machine model was transcribed to ACL2 in 1995 and described in the paper:

- R. S. Boyer and J S. Moore, Mechanized Formal
Reasoning about Programs and Computing Machines,
in R. Veroff (ed.),
*Automated Reasoning and Its Applications: Essays in Honor of Larry Wos*, MIT Press, 1996.

Thus, there are two ACL2 scripts formalizing the small machine, the 1996 one and the present one (1998). They are different! The differences stem from the fact that the 1996 model was not Common Lisp compliant. In order to do the performance measuring reported in the present paper, I decided to change the model to make (and prove) it compliant. I list the differences between the two models below.

- In the non-compliant script, the function
`statep`

merely recognizes true lists of length 5, whereas in the compliant script, it recognizes syntactically well-formed states:(defun statep (x) (declare (xargs :guard t)) (and (true-listp x) (equal (len x) 5) (pcp (pc x)) ; pc = (symbol . nat) (stkp (stk x)) ; stk = (pc ... pc) (memp (mem x)) ; mem = (int ... int) (codep (code x)))) ; code = ((symbol ins ... ins) ; ... ; (symbol ins ... ins))

where an`ins`

is a syntactically well-formed instruction. Note in particular that the 1998`statep`

requires that the memory contain integers. The main reason for this has to do with the guard for the arithmetic instructions in the instruction set. The`ADD`

instruction, for example, adds the contents of two memory locations. For that to be compliant, we must either*know*or*check*that the two values are numeric. If we check, we slow down the execution and we have to invent the semantics of`ADD`

on non-numeric values. We can avoid these two burdens by choosing the guard on memory. That, however, requires that we prove that every operation that changes memory writes integers into it. - Memory is written with the function
`put`

. What happens if we`put`

beyond the end of memory? In the non-compliant script,`put`

deposits`nil`

s into the slots between the end of represented memory and the location to be written. That behavior violates our new guard on memory, so`put`

was changed to deposit 0s into those slots. In a realistic processor we would enforce the additional guard that no instruction writes beyond the end of represented memory. This was done both in the CAP proofs by Brock and the Piton proofs by me. But I chose not to deal with such resource bounds in this small machine. Analogous to`put`

is`get`

. In the 1996 model, we used the Lisp function`nth`

to retrieve memory contents -- and`nth`

retrieves`nil`

s beyond the end of memory. In the new model we use`get`

and define it to return 0s in that situation. Occurrences of`nth`

used to access memory in the old model were replaced by`get`

in the new. - In the non-compliant script, the
`JUMPZ`

instruction uses the ACL2 test`zp`

which coerces the tested value to a natural. This is artificial in the new model, where the tested value is known to be an integer. So I changed`JUMPZ`

to use`zerop`

, which tests simple equality to 0. This, of course, changes the semantics of the instruction set, but makes it more realistic. - In a change unrelated to guards, I modified the way I handled the
clock function
`cplus`

and`ctimes`

. In the 1996 model they were defined recursively to be + and * on naturals, using the Peano definitions. That way, they would suggest inductions and expand in the way needed to unwind loops in induction arguments. I proved two lemmas ``revealing'' that they were actually + and *, but kept those lemmas disabled except where I wanted to eliminate the clock functions for closed arithmetic expressions. This approach suffers the problem that if you try to compute with`cplus`

and`ctimes`

on ``large'' input, e.g.,`(cplus 1000 1)`

, as done in the clock expressions for my performance testing, you get stack overflows because of the recursive definitions. I avoid this in the new script by defining these two functions to be their natural arithmetic expressions. I then prove rewrite rules (actually`:definition`

rules) that cause them to expand as though they were recursively defined. And I provide`:induction`

rules so they suggest the appropriate inductions. Then I disable them. Enabling them is analogous to enabling the ``revelation'' rules of old. The neat thing about this approach is that I can now compute with simple arithmetic rather than recursion. - Several of the correctness theorems were changed. The new version
of
`statep`

insures that memory contents are integral. It was therefore unnecessary to write certain`(natp i)`

expressions; they were replaced with`(<= 0 i)`

. Also, some theorems were generalized slightly. For example the`TIMES`

program multiplies a natural times an integer, not just a natural times a natural. The program was not changed; I just proved a stronger theorem. No new lemmas were necessary.