Major Section: ACL2 Documentation

Bob Boyer and Warren Hunt have developed a canonical representation for ACL2 data objects and a function memoization mechanism to facilitate reuse of previously computed results. This facility includes procedures to read and print ACL2 expressions in such a way that repetition of some ACL2 objects is eliminated, thereby permitting a kind of on-the-fly file compression. The implementation does not alter the semantics of ACL2 except to add a handful of definitions.

We given the name ``ACL2(h)'' to the resulting experimental extension of the ACL2 system, which includes hash cons, function memoization, and fast association lists (applicative hash tables). It is optimized for Clozure Common Lisp (CCL), but other ANSI-compliant host Lisp implementations may also work, provided there are sufficient machine resources.

An easy way is to build ACL2(h) is to include the following with a `make`

command:

ACL2_HONS=hSo for example, to make an executable image and also documentation (which will appear in subdirectories

`doc/EMACS`

and `doc/HTML`

):
make large DOC ACL2_HONS=h

### CLEAR-HASH-TABLES -- deprecated feature

### CLEAR-MEMOIZE-STATISTICS -- clears all profiling info displayed by

`(`

`memoize-summary`

`)`

### CLEAR-MEMOIZE-TABLE -- forget values remembered for the given function

### CLEAR-MEMOIZE-TABLES -- forget values remembered for all the memoized functions

### CONS-SUBTREES --

`(cons-subtrees x nil)`

builds a fast alist that associates each subtree### FAST-ALIST-FREE --

`(fast-alist-free alist)`

throws away the hash table associated with a fast### FAST-ALIST-FREE-ON-EXIT -- Free a fast alist after the completion of some form.

### FAST-ALIST-LEN --

`(fast-alist-len alist)`

counts the number of unique keys in a fast alist.### FAST-ALIST-SUMMARY --

`(fast-alist-summary)`

prints some basic statistics about any current fast### FAST-ALISTS -- alists with hidden hash tables for faster execution

### FLUSH-HONS-GET-HASH-TABLE-LINK -- deprecated feature

### HONS --

`(hons x y)`

returns a normed object equal to`(cons x y)`

.### HONS-ACONS --

`(hons-acons key val alist)`

is the main way to create or extend### HONS-ACONS! --

`(hons-acons! key val alist)`

is an alternative to`hons-acons`

that### HONS-ASSOC-EQUAL --

`(hons-assoc-equal key alist)`

is**not fast**; it serves as the logical`hons-get`

.### HONS-CLEAR --

`(hons-clear gc)`

is a drastic garbage collection mechanism that clears out### HONS-COPY --

`(hons-copy x)`

returns a normed object that is equal to X.### HONS-COPY-PERSISTENT --

`(hons-copy-persistent x)`

returns a normed object that is equal to X`hons-clear`

.### HONS-EQUAL --

`(hons-equal x y)`

is a recursive equality check that optimizes when parts of### HONS-EQUAL-LITE --

`(hons-equal-lite x y)`

is a non-recursive equality check that optimizes if### HONS-GET --

`(hons-get key alist)`

is the efficient lookup operation for### HONS-NOTE -- notes about HONS, especially pertaining to expensive resizing operations

### HONS-RESIZE --

`(hons-resize ...)`

can be used to manually adjust the sizes of the hash### HONS-SHRINK-ALIST --

`(hons-shrink-alist alist ans)`

can be used to eliminate "shadowed pairs"### HONS-SHRINK-ALIST! --

`(hons-shrink-alist! alist ans)`

is an alternative to`hons-shrink-alist`

### HONS-SUMMARY --

`(hons-summary)`

prints basic information about the sizes of the tables in### HONS-WASH --

`(hons-wash)`

is like`gc$`

but can also garbage collect normed### MAKE-FAST-ALIST --

`(make-fast-alist alist)`

creates a fast-alist from the input alist,`alist`

itself or, in some cases, a new object equal to it.### MEMOIZE -- turn on memoization for a specified function

### MEMOIZE-SUMMARY -- display all collected profiling and memoization table info

### MEMSUM -- display all collected profiling and memoization info

### NORMED -- Normed objects are ACL2 Objects that are "canonical" or "unique" in a

### NUMBER-SUBTREES --

`(number-subtrees x)`

returns the number of distinct subtrees of X, in the`equal`

### RESTORE-MEMOIZATION-SETTINGS -- restore the saved memoization settings

### SAVE-AND-CLEAR-MEMOIZATION-SETTINGS -- save and remove the current memoization settings

### SLOW-ALIST-WARNING -- warnings issued when

`fast-alists`

are used inefficiently### UNMEMOIZE -- turn off memoization for the specified function

### WITH-FAST-ALIST --

`(with-fast-alist name form)`

causes`name`

to be a fast alist for the`form`

.### WITH-STOLEN-ALIST --

`(with-stolen-alist name form)`

ensures that`name`

is a fast alist at the`form`

. At the end of execution, it ensures that`name`

is a fast alist if and only if it was originally. That is, if`name`

was not a fast alist originally, its hash table link is freed, and if`form`

, that table is restored. Note that any extended table created from`form`

must be manually freed.

Much of the documentation for the remainder of this topic is taken from the paper ``Function Memoization and Unique Object Representation for ACL2 Functions'' by Robert S. Boyer and Warren A. Hunt, Jr., which has appeared in the Sixth International Workshop on the ACL2 Theorem Prover and Its Applications, ACM Digital Library, 2006.

In the implementation of the ACL2 logic, ACL2 data objects are represented by
Common Lisp objects of the same type, and the ACL2 pairing operation is
internally implemented by the Common Lisp `cons`

function. In Common
Lisp, `cons`

is guaranteed to provide a new pair, distinct from any
previously created pair. We have defined a new ACL2 function `HONS`

that
is logically identical to the ACL2 `cons`

function, but whose
implementation usually reuses an existing pair if its components are
identical to the components of an existing pair. A record of ACL2 HONS
objects is kept, and when an ACL2 function calls `hons`

ACL2 searches for
an existing identical pair before allocating a new pair; this operation been
called ``hash consing''.

It appears that hash consing was first conceived by A. P. Ershov in 1957, to speed up the recognition of common subexpressions. Ershov showed how to collapse trees to minimal DAGs by traversing trees bottom up, and he used hashing to eliminate the re-evaluation of common subexpressions. Later, Eiichi Goto implemented a Lisp system with a built-in hash consing operation: his h-CONS cells were rewrite protected and free of duplicate copies, and Goto used this hash consing operation to facilitate the implementation of a symbolic algebra system he developed.

Memoizing functions also has a long history. In 1967, Donald Michie proposed using memoized functions to improve the performance of machine learning. Rote learning was improved by a learning function not forgetting what it had previously learned; this information was stored as memoized function values.

The use of hash consing has appeared many times. For instance, Henry Baker used hash consing to improve the performance of the well-known Boyer rewriting benchmark. Baker used both hash consing and function memoization to improve the speed of the Takeuchi function, exactly in the spirit of our implementation, but without the automated, system-wide integration we report here.

The ACL2 implementation permits memoization of user-defined functions. During execution a user may enable or disable function memoization on an individual function basis, may clear memoization tables, and may even keep a stack of memoization tables. This facility takes advantage of our implementation where we keep one copy of each distinct ACL2 data object. Due to the functional nature of ACL2, it is sufficient to have at most one copy of any data structure; thus, a user may arrange to keep data canonicalized. This implementation extends to the entire ACL2 system the benefits enjoyed by BDDs: canonicalization, memoization, and fast equality check.

We have defined various algorithms using these features, and we have observed, in some cases, substantial performance increases. For instance, we have implemented unordered set intersection and union operations that operate in time roughly linear in the size of the sets. Without using arrays, we defined a canonical representation for Boolean functions using ACL2 objects. We have investigated the performance of rewriting and tree consensus algorithms to good effect, and we believe function memoization offers interesting opportunities to simplify algorithm definition while simultaneously providing performance improvements.

We recommend that users focus at first on the logical definitions of
`hons`

and other primitives rather than their underlying Common Lisp
implementations. Integrated with this memoization system is a performance
monitoring system, which can provide real-time feedback on the operation and
usefulness of `hons`

and function memoization. For a more detailed
description of these tools, please see the ACL2 2006 workshop paper mentioned
above.

The Fibonacci function is a small example that demonstrates the utility of function memoization. The following definition exhibits a runtime that is exponential in its input argument.

(defun fib (x) (declare (xargs :guard (natp x))) (mbe :logic (cond ((zp x) 0) ((= x 1) 1) (t (+ (fib (- x 1)) (fib (- x 2))))) :exec (if (< x 2) x (+ (fib (- x 1)) (fib (- x 2))))))

Below we show how the ACL2 `time$`

utility can measure the time to
execute a call to the `fib`

function (with some editing to avoid noise from
the underlying Common Lisp implementation). `Memoize`

is actually an
ACL2 macro that expands to a call of the ACL2 function used to identify a
function for memoization; see memoize. This function also accepts a
well-formed term that must be true in order for the system to memoize a call
of the memoized function; to ensure that an instance of the term is safe for
evaluation in Common Lisp, a check is performed that if the guard of the
memoized function is satisfied, then this instance will execute without
error.

ACL2 !>(time$ (fib 40)) ; (EV-REC *RETURN-LAST-ARG3* ...) took ; 0.99 seconds realtime, 0.98 seconds runtime ; (1,296 bytes allocated). 102334155 ACL2 !>(memoize 'fib) Summary Form: ( TABLE MEMOIZE-TABLE ...) Rules: NIL Time: 0.01 seconds (prove: 0.00, print: 0.00, other: 0.01) Summary Form: ( PROGN (TABLE MEMOIZE-TABLE ...) ...) Rules: NIL Time: 0.01 seconds (prove: 0.00, print: 0.00, other: 0.01) FIB ACL2 !>(time$ (fib 40)) ; (EV-REC *RETURN-LAST-ARG3* ...) took ; 0.00 seconds realtime, 0.00 seconds runtime ; (2,864 bytes allocated). 102334155 ACL2 !>(time$ (fib 100)) ; (EV-REC *RETURN-LAST-ARG3* ...) took ; 0.00 seconds realtime, 0.00 seconds runtime ; (7,024 bytes allocated). 354224848179261915075 ACL2 !>(unmemoize 'fib)

We see that once the function `fib`

is identified as a function for which
function calls should be memoized, the execution times are substantially
reduced. Finally, we can prevent `fib`

from being further memoized; in
fact, `unmemoize`

erases the previously memoized values.

The implementation of hash consing, memoization, and applicative hash tables involves several facets: canonical representation of ACL2 data, function memoization, and the use of Lisp hash tables to improve the performance of association list operations. We discuss each of these in turn, and we mention some subtle interrelationships. Although it is not necessary to understand the discussion in this section, it may permit some users to better use this implementation. This discussion may be confusing for some ACL2 users as it makes references to Lisp implementation features.

The ACL2 system is actually implemented as a Lisp program that is layered on top of a Common Lisp system implementation. ACL2 data constants are implemented with their corresponding counterparts in Common Lisp; that is, ACL2 cons pairs, strings, characters, numbers, and symbols are implemented with their specific Common Lisp counterparts. This choice permits a number of ACL2 primitive functions to be implemented with their corresponding Common Lisp functions, but there are indeed differences. ACL2 is a logic, and as such, it does not specify anything to do with physical storage or execution performance. When ACL2 is used, the knowledgeable user can write functions to facilitate the reuse of some previously computed values.

Recall the three mechanisms under discussion: hash consing, function
memoization, and fast association list operations. The function memoization
mechanism takes advantage of the canonical representation of data objects
provided by the `hons`

operation as does the fast association list
operation mechanism. Each time `hons`

is invoked, its arguments are
themselves converted, if necessary, to uniquely represented objects.

The ACL2 universe is recursively closed under the `cons`

pairing operation
and the atoms. Hash consing (`hons`

) is logically identical to `cons`

,
but a set of tables is used to record each `hons`

pair. When a `hons`

pair is requested, the implementation checks, in the current set of tables,
whether the requested pair already exists. If not, a new pair is created and
a record of that pair is made; otherwise, a reference to the previously
created pair is returned. Thus, any data object created with `hons`

has a
unique representation, as does every subcomponent. We also arrange for
strings to have a unique representation -- only one copy of each different
string is kept -- and when any previously unseen string is an argument to
`hons`

, we add the string to a unique table of strings. A system-wide
benefit of having a canonical representation for data is that equality
comparisons between any two data objects can be done in constant time.

The definition of `hons`

in no way changes the operation of `cons`

--
`hons`

creates a `cons`

pair. When asked to create a `hons`

, the
implementation checks to see if there is a `cons`

with the same `car`

and `cdr`

as the `hons`

being requested. Thus, the only difference
between the results of a `hons`

call and a corresponding `cons`

call is a
notation in some invisible (to the ACL2 logic) tables where we record which
`cons`

elements are also `hons`

elements. Since a `hons`

is nothing
but a `cons`

, the operation of `car`

and `cdr`

is unchanged. In fact,
the implementation is designed so that at any time it is safe to clear the
table identifying which `cons`

elements are also considered `hons`

elements.

User-defined functions with defined and verified guards can be memoized.
When a function is memoized, a user-supplied condition restricts the domain
when memoization is attempted. When the condition is satisfied, memoization
is attempted (assuming that memoization for the function is presently
enabled); otherwise, the function is just evaluated. Each memoized function
has a hash table that is used to keep track of a unique list of function
arguments paired with their values. If appropriate, for each function the
corresponding table is checked to see if a previous call with exactly the
same arguments already exists in the table: if so, then the associated value
is returned; if not, then the function is evaluated and a new key-value pair
is added to the table of memoized values so that some future call will
benefit from the memoization. With ACL2 user functions memoization can be
dynamically enabled and disabled. There is an ACL2 function that clears a
specific memoization table. And finally, just as with the `hons`

table, a
stack of these function memoization tables is maintained; that is, it is
possible to memoize a function, use it a bit, set the memoized values aside,
start a new table, use it, and then return to the original table.

We next discuss the fast lookup operation for association lists. When a pair
is added to an association list using the functions `hons-acons`

or
`hons-acons!`

, the system also records the key-value pair in an
associated hash table. As long as a user only used these two functions when
placing key-value pairs on an association list, the key-value pairs in the
corresponding hash table will be synchronized with the key-value pairs in the
association list. Later, if `hons-get`

is used to look up a key, then
instead of performing a linear search of the association list we consult the
associated hash table. If a user does not strictly follow this discipline,
then a linear search may be required. In some sense, these association lists
are much like ACL2 arrays, but without the burden of explicitly naming the
arrays. The ACL2 array `compress1`

function is analogous to the
functions `hons-shrink-alist`

and `hons-shrink-alist!`

. There are
user-level ACL2 functions that allow the associated hash tables to be cleared
and removed.

Finally, we would advise anyone who is using CCL in a research environment to
stay plugged into the ``trunk'' or ``bleeding edge'' of CCL development.
This is very easy to do by typing a few commands to a shell, for example
standing above the target directory as follows, provided one has `svn`

working.

For linux: rm -rf ccl svn co http://svn.clozure.com/publicsvn/openmcl/trunk/linuxx8664/ccl For an x86 Macintosh running the Darwin OS: svn co http://svn.clozure.com/publicsvn/openmcl/trunk/darwinx8664/ccl To keep up to date, you may find it sufficient to do: cd ccl svn update Whether obtaining a fresh CCL or just updating, finally issue these commands. ./lx86cl64 (rebuild-ccl :full t) (quit) ./lx86cl64 (rebuild-ccl :full t) (quit)

REFERENCES

Baker, Henry G., The Boyer Benchmark at Warp Speed. ACM Lisp Pointers V,3 (Jul-Sep 1992), pages 13-14.

Baker, Henry G., A Tachy 'TAK'. ACM Lisp Pointers Volume 3, July-September, 1992, pages 22-23.

Robert S. Boyer and Warren A. Hunt, Jr., Function Memoization and Unique Object Representation for ACL2 Functions, in the Sixth International Workshop on the ACL2 Theorem Prover and Its Applications, ACM Digital Library, 2006.

A. P. Ershov. On Programming of Arithmetic Operations. In the Communications of the ACM, Volume 118, Number 3, August, 1958, pages 427-430.

Eiichi Goto, Monocopy and Associative Algorithms in Extended Lisp, TR-74-03, University of Toyko, 1974.

Richard P. Gabriel. Performance and Evaluation of Lisp Systems. MIT Press, 1985.

Donald Michie. Memo functions: a Language Feature with Rote Learning Properties. Technical Report MIP-R-29, Department of Artificial Intelligence, University of Edinburgh, Scotland, 1967.

Donald Michie. Memo Functions and Machine Learning. In Nature, Volumne 218, 1968, pages 19-22.