A Gentle Introduction to ACL2 Programming

J Strother Moore

This document, which accompanies the ACL2 Hyper-Card, is meant as a quick introduction to Lisp programming in the ACL2 dialect. The ACL2 Hyper-Card is available at

http://www.cs.utexas.edu/users/moore/publications/hyper-card.html

The Hyper-Card contains links into ACL2's huge hypertext documentation. This document doesn't, because it is meant to be read mainly on paper. After you've read this, look over the Hyper-Card and follow some of the links to get more information.

ACL2 is a mechanical theorem prover for reasoning about Lisp. The Lisp that ACL2 supports is a side-effect free subset of Common Lisp. If all you want to do is define Lisp functions and run them, you don't need to know how to use ACL2 as a theorem prover but merely as a syntax checker and execution engine.

I assume you have got ACL2 running on your computer. If you don't, see the Hyper-Card.

Here is an ACL2 program that copies a linear list replacing every integer element by the symbol I.

(defun replace-ints (x)

; Copy x and replace integer elements by symbol I.

  (cond ((endp x) nil)                    ; x is empty: return the empty list.

        ((integerp (car x))               ; The first element is an integer:
         (cons 'I                         ; add the symbol I to the front of
               (replace-ints (cdr x))))   ; the result of replacing the rest.

        (t                                ; The first element not an integer:
         (cons (car x)                    ; add the first element to the front
               (replace-ints (cdr x)))))) ; of the result of replacing the rest.

If you aren't sure what a ``linear list'' is or what we mean by the ``symbol'' I, just hang in and we'll get to it.

To program in ACL2 it is convenient first to enter :PROGRAM mode (by typing that key word to the prompt, followed by a return)

ACL2 !>:PROGRAM

In this mode ACL2 is not a theorem prover but just a syntax checker and Lisp evaluation engine. The prompt is changed when you enter :program mode. Then, if you type the defun expression above to the prompt (shown here without the comments):

ACL2 p!>(defun replace-ints (x)
  (cond ((endp x) nil)
        ((integerp (car x))
         (cons 'I (replace-ints (cdr x))))
        (t (cons (car x) (replace-ints (cdr x))))))

ACL2 will accept it and print a brief (and pointless!) summary.

Then you can call the function and ACL2 will print the resulting value:

ACL2 p!>(replace-ints '(A B C 1 2 I J K (4 . 5) 6))
(A B C I I I J K (4 . 5) I)

Note that the function did not replace the 4 and the 5 because they did not occur as elements of the list. They occur as components of an element, namely (4 . 5).

Data Types

ACL2 supports five very simple but rather abstract types of data:

numbers,
characters,
strings,
symbols (which includes the two Booleans T and NIL), and
conses (or ``lists'' or ``binary trees'').

All of these objects are ``first class'' in the sense that you can pass them in as values to functions, return them as values from functions, and store them in lists and trees.

Many operators are provided for each type. Each operator is a function.

No ACL2 operator or function can modify, mutate, or otherwise alter the objects passed in as arguments.

See the Hyper-Card for details on the operators provided for each type.

ACL2 expressions are just nests of function calls. For example (factors (+ x (* 3 y))) calls the function factors on the value of x+3y. (We define factors below to return the list of prime factors of its input.) The functions + and * are examples of functions that operate on numbers.

Before we discuss ACL2 expressions further we must discuss the important issue of how you write down constants of each of the five types.

A few examples suffice to illustrate the common ways to enter numeric constants: 123, 22/7 and #c(3 5). Note that ACL2 provides ``infinitely precise'' rational numbers like 1/3 and 22/7. These are not floating point numbers; we don't provide floating point numbers. The rational 2/3 could also be written 4/6 -- they are two different ways of writing the same number. The ACL2 number #c(3 5) is more conventionally written as the complex number 3+5i. ACL2 also supports writing numeric constants in binary, octal and hexadecimal notation, but we do not present that here.

Here are examples of how to write characters objects: #\A and #\a and #\Space. Character objects are used to build up strings. But very few ACL2 programmers ever manipulate character objects since strings can be entered directly as constants and new strings are seldom created in ACL2 applications.

We assume you know how to write down strings, e.g., "This is an error message". Strings are most often just returned as values to explain ``errors''.

This leaves us with symbols and conses. Two symbols are easy to understand because they have counterparts in most programming languages: T is a symbol but it is also the ACL2 object that is considered to represent the ``true'' truthvalue. NIL is also a symbol and represents the ``false'' truthvalue. When an ACL2 predicate is said to be false, we mean the predicate returns NIL. When an ACL2 predicate is said to be true, it usually means it returns T.

The expression (IF x y z) is the basic conditional operator of ACL2. It returns y if x is not NIL and z if x is NIL. Note that in this sense NIL is ``more important'' than T. An ACL2 predicate can indicate ``true'' by returning anything but NIL. T is just a convenient choice.

Nested IF expressions can be written with the COND notation:

(COND (p1 x1)
      (p2 x2)
      ...
      (T y))

is just shorthand for

(IF p1 
    x1
    (IF p2
        x2
        (IF ...
            ...
            y)))

The expression (EQUAL x y) is the basic equality operator in ACL2. It returns T or NIL according to whether x and y are equal, i.e., the same object.

Before we discuss other ACL2 functions however we must deal at length with symbols and conses. Many first-time Lisp programmers do not really understand the idea of symbols as data or the idea of lists and trees as data.

Symbols

Symbols are sequences of alphabetic letters, signs and digits. So X, X*, and ABC23 are symbols. So are CAR, IF, and TRUE-LISTP. Note that signs do not ``break'' symbols into pieces. ``TRUE-LISTP'' is one symbol, not two symbols separated by a minus sign.

Technically, a symbol can contain any sequence of characters and there are rules for writing symbols down so as to distinguish them from constants and other expressions. You can actually write symbols containing spaces and parentheses and other ``weird'' characters, using certain ``escape'' conventions. But we will not go into these conventions. We will limit ourselves to symbols that ``look like'' what you probably expect. Most often, symbols start with an alphabetic character and then contain just alphabetic characters, signs and digits. The first space, newline or parenthesis you encounter while reading a symbol marks the end of the symbol and isn't part of it.

All programming languages (except binary machine code) have symbols at the syntactic level. Symbols are used to denote variables, functions, operations, keywords, etc., at the syntactic level of most programming languages. This is true in ACL2 too. But ACL2 is a little different in that it also provides symbols as data objects. That is, in addition to familiar data objects like numbers and strings, ACL2 provides symbols.

For example, it is not surprising to think of X as a variable symbol and ABS as a function symbol in the language. But 'X is a constant expression denoting the symbol constant X, and 'ABS is the constant expression denoting the symbol constant ABS. Be careful! X is an entity at the syntax level of the language. 'X is an entity in the semantics. That is, 'X is a data object. You may pass it around, assign it to variables, compare it to other objects, store it in a list, etc. You may even tear it apart to obtain its constituent characters and use them to assemble a different object. We will almost never tear symbols apart or construct symbols from their constituent characters. So this document won't describe how to do that. The main thing you have to know about symbols is this: if you write a symbol you mean to use as data in an ACL2 program it must be quoted with the single quote mark. Thus, X is a variable; 'X is data.

Two special symbols are exceptions to this rule. The symbols T and NIL cannot be used as variables and always denote themselves. Thus, you can write 'T or T and 'NIL or NIL interchangeably when you are writing expressions to be evaluated.

Symbols are more complicated than described above because in addition to their names they belong to ``packages.'' For example, there might be a symbol named ABS in the package named "MATH" and a symbol named ABS in the package named "VECTORS". There is always one ``current package'' in ACL2 -- you can select any package as the current package. If "MATH" is the current package, then when you write ABS you denote the symbol ABS in the "MATH" package. If you mean the symbol ABS in the "VECTORS" package you have to write VECTORS::ABS. We will not make much use of packages.

There is a special package called the "KEYWORD" package. The symbol ABS in the "KEYWORD" package can, of course, be written KEYWORD::ABS. But special syntactic sugar also allows us to write :ABS for that symbol. Such keywords are special for another reason: they may not be used as variables and they always denote themselves. Thus :ABS is ':ABS.

When we write symbols, case is unimportant. Imagine that the characters are all read in uppercase. That is, 'x and 'X denote the same symbol, as do 'The and 'THE. Sometimes we write T and other times we write t. Similarly, we sometimes write nil for NIL, if for IF, equal for EQUAL, etc.

If you want to write down a symbol containing weird characters you should delimit the symbol with vertical bars. Thus, |A (weird) Symbol| is a symbol that contains two spaces, an open and close parenthesis, and some lower case characters in its name. We will not use such symbols here.

ACL2 provides ways to construct symbols from strings, ways to construct strings from lists of character objects, and ways to construct character objects from integers. But elementary ACL2 programs rarely construct new symbols, strings and characters dynamically. The symbols, strings and characters manipulated during most program executions were entered initially in the input data (e.g., by reading a command line or input file). Therefore, we do not bother to describe how to ``create'' symbols, strings and characters. The Hyper-Card provides pointers to the relevant functions.

Lists

We first discuss how to write down ``list constants'' and then we discuss how to ``create'' lists from other objects, including other lists. There are many ways to write the same list constant. This should not be surprising; the same thing is true of numeric constants. Consider the fact that 123, 000123, +0123, 246/2 and #b1111011 are all ways to write down the same numeric constant. The form you choose in which to write down a numeric constant is up to you and is often chosen to emphasize some aspect of the data, e.g., that it fits in 6 digits or is unusually signed or that most of the other data involved is rational instead of integral, etc. Lists are so common in ACL2 that there are many ways to write the same list constant but the choice of ``style'' is just that: stylistic.

After you've learned how to write down constants, we will introduce the common operations on them. In the numeric arena this is akin to introducing operators such as + and *. With such operators you can ``construct'' new numeric constants from old ones, e.g., (+ (* 12 10) 3) is a way to ``create'' 123.

List Constants

Most programming languages support the idea of some kind of record structure with fields containing other objects. The only such record structure in ACL2 is the ordered pair or cons pair. A cons pair is just a pair of two objects. Sometimes we call a cons pair a ``list pair'' or a ``dotted pair'' or simply a ``list.''

Any two objects may be put together into a cons pair. The cons pair containing the integer 1 and the integer 2 might be drawn as:

           *
          / \
         1   2

or might be written in Cartesian coordinate notation as <1,2>. But in ACL2 it is written as the constant '(1 . 2). This is a single cons object in ACL2, with two integer objects as constituents. The left-hand constitutent is called the car of the pair. The right-hand constituent is called the cdr.

The tree we might draw as

or write in coordinate notation as <1,<2,3>>, in ACL2 is written '(1 . (2 . 3)). The car of this object is the integer 1. The cdr of this object is the constant

'(2
. 3)

Similarly,

           *
          / \
         /   \
        *     *
       / \   / \
      1   2 3   4

or <<1,2>,<3,4>> is written '((1 . 2) . (3 . 4)). Suppose x is the cons object just mentioned. Then the car of x is the constant '(1 . 2) and the cdr is the constant '(3 . 4). Obviously, the car of the car of x is 1. The cdr of the car of x is 2, etc.

The notation we are using to write list constants is called dot notation. It is a straightforward translation of the familiar coordinate notation in which parentheses replace the brackets and a dot replaces the comma.

But there are other ways to write these conses. This should not be surprising. How many ways can you think of to write the number 5? There are an infinite number! 5, 05, 005, +000005, etc., not to mention possibly #B101 and others. Those are all just ``syntactic sugar'' for denoting one familiar constant. The sugars above might be described as ``you can add a leading 0'' and ``you can add a single leading plus sign on a positive number.''

Similarly, ACL2 has two rules that allow us to write cons pairs in a variety of ways. The first rule provides a special way to write trees like:

          *
         / \
        1   nil

namely, you can write '(1 . nil) as you would have guessed, or you can write '(1). That is: if a cons pair has NIL as its cdr, you can drop the ``dot'' and the NIL.

The second rule provides a special way to write a cons pair that contains another cons pair in the cdr. The rule allows us to write (x . (...)) as (x ...). That is, when the cdr is a cons, you can drop the dot and the balanced parentheses following it.

Thus, the tree

can be written in any of the following ways.

'(1 . (2 . (3 . nil)))
'(1 . (2 . (3)))
'(1 . (2 3))
'(1 2 3)

It can also be written in many other ways, e.g., '(1 2 . (3 . nil)) and '(1 . (2 3)).

Binary trees that terminate in nil on the right-most branch, such as the one above, are often called linear lists. The elements of a list are the successive cars along the ``cdr chain.'' That is, the elements are the car, the car of the cdr, the car of the cdr of the cdr, etc. The elements of the list above are 1, 2, and 3, in that order. If we let x denote that tree, i.e., let x be '(1 2 3), then the car of x is 1. The cdr of x is the linear list '(2 3). The car of the cdr of x is thus 2. The cdr of the cdr of x is the linear list '(3). The car of the cdr of the cdr of x is 3. And the ``third cdr'' of x is nil.

Of course, the elements of a list may be lists. For example,

     *
    / \
   *   \
  / \   \
 A   1   *
        / \
       *   \
      / \   \
     B   2   *
            / \
           *   \
          / \   \
         C   3   nil

This list can be written '((A . 1) (B . 2) (C . 3)). In full dot notation it is written

'((A . 1) . ((B . 2) . ((C . 3) . NIL))).

By the way, lists such as the one above are so common they have a name. They are called association lists or alists and are used as keyed tables. The list above associates the key A with the value 1, the key B with the value 2, etc. Later we show functions for manipulating alists.

Recall that we have to be careful with symbol constants so as not to confuse them with expressions in the syntax: X is a variable, 'X is data.

We also have to be careful with list constants. Consider

     *
    / \
   ABS \
        *
       / \
      X  nil

We can write this '(ABS X). It is a list whose car is the symbol constant 'ABS and whose cdr is the list constant '(X).

If we were to drop the single quote mark in front it we would get (ABS X), which is a completely different entity! (ABS X) is how we write the application of the function ABS to the value of the variable X.

Operations on Lists

The fundamental operations on lists are:

(cons x y) -- constructs a cons pair with x in the car and y in the cdr:
```
    *
   / \
  x   y
```
(car x) -- returns the car of the pair x
(cdr x) -- returns the cdr of the pair x
(consp x) -- returns t if x is a pair and nil if it is not.
(atom x) -- returns nil if x is a pair and t if it is not.
(endp x) -- returns nil if x is a pair and t if it is not.

Note that atom and endp are just different names for the same thing and are just the negation of consp. (endp x) is generally used to suggest that x is a linear list: it is true when x is nil and false when x is a cons pair.

Nests of car and cdr operations are so common that ACL2 gives special names to many of them. For example, (cadr x) is the same as (car (cdr x)) and (caddr x) is the same as (car (cdr (cdr x))).

   *
  / \
car  * cdr
    / \
 cadr  * cddr
      / \
  caddr  * cdddr
        / \
   cadddr   cddddr

The names car, cadr, caddr, and cadddr are especially useful because they return the first, second, third and fourth elements of a linear list (assuming the list has that many elements).

In fact ACL2 provides all the combinations of cars and cdrs up to nests of depth four. Thus, for example, caar is the car of the car. However, aside from a few ``personal favorites'' like cadr and caar most ACL2 programmers tend to define and use more nmemonic names, like (fourth x) or even the more general (nth 3 x) -- nth counts elements starting at 0.

Lists are often used to represent more elaborate data structures. For example, an ``account'' might consist of a name, an id number, a balance and a list of transactions, laid out like this:

        *
       / \
      /   \
     /     \
    *       *
   / \     / \
  /   \   /   \
name  id bal  trans

An example of an account object might be

'(("John Smith" . 123456789)    ; Name and id
  .
  (1254 .                       ; balance in cents
    ((DEPOSIT (1 21 1999) 1000) ; transactions
     (WITHDRL (1 30 1999)  850)    
     (WITHDRL (2  5 1999)  215))))

Most programmers, rather than refer to these four components as caar, cdar, cadr and cddr, would define functions with application-specific names like account-name, account-id, account-balance and account-transactions. Instead of building each account object with a nest of cons expressions they would define (make-account name id bal trans) to do the consing.

Also note above the use of symbols to name the various transaction types. Chances are most programmers would also define transaction objects, with their own component accessors like transaction-type, transaction-date, etc.

Two other useful operations are

(LIST x1 x2 ... xn) -- constructs a linear list with elements x1, x2, ..., xn:

(LIST* x1 x2 ... xn y) -- constructs a list with elements x1, x2, ..., xn and with y as the final cdr:

Common Recurences

Most of your function definitions in ACL2 will fall into certain common schemes. Here is an informal description of the three most common schemes. We define the function visit several different ways to illustrate common recursions. (In ACL2 you are not allowed to define a function more than one way; there is no ``overriding,'' ``hiding,'' or ``overloading.'' These multiple definitions are just a device for illustrating a lot of definitions without burdening you with many different names.)

Visiting every element of a linear list

(defun visit (lst ...)
  (cond ((endp lst) ...)              ; No elements left to process.
        (t .                          ; Some elements to process:
            .
             .(car lst)               ; do something with this element
                            
              (visit (cdr lst) ...)   ; and visit the rest.
             .
            .
           .)))

We illustrate this scheme in the next section.

Visiting every node and leaf in a binary tree

(defun visit (tree ...)
  (cond ((atom tree) ...)             ; The tree is a leaf: do something.

        (t .                          ; The tree is an interior node:
            . tree                    ; do something with it,
             .
               (visit (car tree) ...) ;  visit nodes in left branch, and

             . (visit (cdr tree) ...) ;  visit nodes in right branch.
            .
           .)))

Visiting every element of a linear list is just the special case of exploring a binary tree in which we visit only the successive subtrees along the cdr chain and consider the car of each such subtree as the ``data,'' or element, to be processed. We illustrate this recursion below.

Visiting all the natural numbers from n to 0

(defun visit (n ...)
  (cond ((zp n) ...)                  ; N ``is'' 0: nothing to do.
        (t .                          ; N is a positive integer:
            . n                       ; do something with it, and
             .
              (visit (- n 1) ...)     ; visit the naturals below it.
             .
            .
           .)))

The reason ``is'' is in quotation marks above is that (zp n) is true if either n is 0 or n is not a natural number. For example, (zp -3) and (zp 1/3) both return true! So when (zp n) is true you don't really know n is 0 but you do know there is nothing more to do, if you're thinking of n as a natural number. When (zp n) is false, you know n is a positive integer. The predicate zp is just a convenient way to recur through natural numbers and guarantees to halt the recursion even if you call the function on ``unexpected'' input, whether numeric or not.

Computing Results

When you define a recursive function you often have the choice of computing the results ``on the way up'' or ``on the way down.''

Suppose x is a linear list of numbers and you wish to sum them. Here are two ways:

(defun sum-up (x)
  (cond ((endp x) 0)
        (t (+ (car x) (sum-up (cdr x))))))

(defun sum-down (x temp)
  (cond ((endp x) temp)
        (t (sum-down (cdr x) (+ (car x) temp)))))

For example (sum-up '(1 2 3 4)) is 10 and (sum-down '(1 2 3 4) 7) is 17. Both definitions fit within the ``visiting every element of a linear list'' recurrence scheme.

Note that sum-up takes a list of numbers and returns their sum while sum-down takes a list of numbers and some initial value and adds the numbers in the list to the initial value. Of course, (sum-up x) is equal to (sum-down x 0) -- but only because the operation of addition is insensitive to the order in which the additions are done.

In many applications, the sum-up style is clearer, as it is obvious how each element of the list is contributing to the final value. In particular, each element is handled in exactly the same way regardless of what elements came before. In the sum-down style, the value of the temporary variable can, in principle, be used to affect the processing of subsequent elements. Information about the previous elements is being passed down. In more complicated functions written in the sum-down style, one must carefully inspect how temp is used to determine if it is just the final answer or is being used to direct the computation.

Of course, sometimes information about previous elements is necessary to decide how to process subsequent ones, in which case the sum-down style might be the clearest choice.

Finally, when these two functions are compiled, the recursion in sum-up requires a stack to execute while the recursion in sum-down does not: sum-down is tail recursive. This means that the value of the recursive call is returned as the final answer and there is no need for the execution engine to ``remember'' to come back after the call to ``do something.'' In sum-up the execution engine must ``remember'' to add the two intermediate results together. Tail recursive calls are compiled as simple jumps or ``gotos'' and result in faster executions that require less stack space.

Saving Intermediate Results

ACL2 is applicative so it does not have assignment statements. But it is possible to use ``local variables'' to save intermediate results. Here is a function that computes x^4 + y^4, i.e., the sum of the fourth powers of x and y.

(defun x4y4 (x y)
  (let ((x2 (* x x))            ; Let x2 be x^2 and, ``simultaneously,''
        (y2 (* y y)))           ; let y2 be y^2.
    (+ (* x2 x2)                ; Compute x^4 by squaring x^2 
       (* y2 y2))))	        ; and add it y^4 computed similarly.

For example, (x4y4 2 3) is 16 + 81 or 97. The let expression has two parts, a list of variable bindings and a body. The bindings are written as a list of elements. Each element gives a variable and a term. The terms are evaluated in parallel and then all the variables are bound to the values of the corresponding terms. Then the body is evaluated. The value of the body is the value of the let.

While let binds its variables in parallel, let* binds variables sequentially. The expression

(let* ((x2 (* x x))             ; Let x2 be x^2, and then
       (x4 (* x2 x2))           ; let x4 be x2^2, and then
       (x8 (* x4 x4)))          ; let x8 be x4^2.
      x8)                       ; Return x8.

computes x^8.

Here is an example of a common use of let. Suppose you wish to find the length of the longest branch in a binary tree. The function depth does that. It is an example of the ``visit every node in a binary tree'' scheme.

(defun depth (x)
  (cond ((atom x) 0)
        (t (let ((left-depth (depth (car x)))
	         (right-depth (depth (cdr x))))
              (if (< left-depth right-depth)
                  (+ 1 right-depth)
                  (+ 1 left-depth))))))

Just to illustrate the definition and list notation, consider the tree

          *
         / \
        /   \
       *     *
      / \   / \
     a   * e   *
        / \   / \
       b   * f   g
          / \
         c   d

The longest branch in this tree has length four and terminates in c (or d). This tree can be written

'((a . (b . (c . d))) . (e . (f . g)))

or equivalently as '((a b c . d) e f . g).

The expression (depth '((a b c . d) e f . g)) has value 4.

We could have defined this function this way:

(defun depth (x)
  (cond ((atom x) 0)
        (t (if (< (depth (car x))
                  (depth (cdr x)))
               (+ 1 (depth (cdr x)))
               (+ 1 (depth (car x)))))))

But this would execute less efficiently because after recurring into both the car and cdr of x to determine which is deepest it recurs into one of them again to return the answer. The use of let eliminates this duplication of effort by saving the results of the earlier calls.

Still a third way to define depth might be

(defun depth (x)
  (cond ((atom x) 0)
        (t (+ 1 (max (depth (car x))
                     (depth (cdr x)))))))

where (max i j) is defined to be

(if (< i j) j
i)

. This definition executes almost as efficiently as the first one. The values of the two recursive calls are passed to the subroutine max. Max uses i twice and j twice but now that ``duplication of effort'' only requires looking up the values of variables, just as in the let case. The only inefficiency in the max approach is that a function call (of max) is used to allocate the two local variables, while in the let case it is done ``inline.''

Sample Definitions

Functions on Linear Lists

The next several functions all illustrate the ``visiting every element of a linear list'' recurrence scheme.

(defun mem (e x)

; Return t or nil according to whether e is an element of x.

  (cond ((endp x) nil)
        ((equal e (car x)) t)
        (t (mem e (cdr x)))))

Equivalently:

(defun mem (e x)
  (if (endp x)
      nil
      (if (equal e (car x)) 
          t
          (mem e (cdr x)))))

or even

(defun mem (e x)
  (and (not (endp x))
       (or (equal e (car x))
           (mem e (cdr x)))))

(Recall our caveat about multiple definitions of the same function. If you want to try one of these definitions in ACL2, fine. If you want to try all three, you'll have to give them distinct names.)

For all three of the above definitions, the following examples hold:

(mem 'd '(a b c d e f g)) = t
(mem 'd '(a b c   e f g)) = nil

Here is a function to concatenate two lists together:

(defun app (x y)
  (if (endp x)       ;;; If x is exhausted,
      y              ;;;    then return y
      (cons (car x)  ;;;  else, cons the first element of x onto
            (app     ;;;     the list obtained by recursively appending
             (cdr x) ;;;     the rest of x
	     y))))   ;;;     to y.

For example, (app '(1 2 3) '(a b c d)) = '(1 2 3 a b c d).

Here is a function to find the pair in an alist that binds a given key.

(defun lookup (key alist)
  (cond ((endp alist) nil)             ;;; If alist empty, return nil,
        ((equal key (car (car alist))) ;;; elseif the first pair contains key,
         (car alist))                  ;;;    then return the first pair,
        (t (lookup key (cdr alist))))) ;;; else look in the rest of alist.

For example, (lookup 'b '((a . 1)(b . 2) (c . 3))) = '(b . 2). Thus, if alist is a list of pairs and (lookup key alist) returns non-nil, then the car of the answer is key and the cdr of the answer is the value of that key in alist. If there is no such pair in the alist, lookup returns nil.

Here is a function to ``change'' the value of a key in an alist. Note that it does not modify the alist but produces a new alist.

(defun store (key val alist)
  (cond
    ((endp alist)                 ;;; If alist is empty,
     (list (cons key val)))       ;;;  then return an alist with one binding,
    ((equal key (caar alist))     ;;; elseif the first pair binds key,
     (cons (cons key val)         ;;;  then add the new binding to
           (cdr alist)))          ;;;  the rest of the alist,
    (t (cons (car alist)          ;;; else cons this pair onto the result
             (store               ;;;  of recursively putting a new binding
               key val            ;;;  of key to val in
               (cdr alist))))))   ;;;  the rest of alist.

For example,

(store 'b 7 '((a . 1) (b . 2) (c . 3)))
  = '((a . 1) (b . 7) (c . 3))

(store 'x 26 '((a . 1) (b . 2)))
  = '((a . 1) (b . 2) (x . 26))

(store 'c 3 '((a . 1) (b . 2) (x . 26)))
 = '((a . 1) (b . 2) (x . 26) (c . 3))

It is important to recall that ACL2 is applicative. There are no side-effects! Despite its name, store does not change the alist it is given but returns a new one. Consider

(let* ((alist1 '((a . 1) (b . 2) (c . 3)))
       (alist2 (store 'b 7 alist1)))
  (cons (lookup 'b alist1)
        (lookup 'b alist2)))

The above expression binds alist1 to an alist in which b is associated with 2. Then it binds alist2 to an alist obtained by using store to ``change'' the association of b to 7. Then it looks up b in both alists and returns the pair of answers. You might think the answer would be

((b
. 7) . (b . 7))

and it would be if store actually changed alist1 to create alist2. But in fact alist1 is not changed. The answer is

((b . 2) . (b
. 7))

Functions on Binary Trees

Here is a function to count the number of tips of a binary tree. This function illustrates the ``visiting every node of a binary tree'' scheme.

(defun count-tips (x)
  (cond ((atom x) 1)                   ;;; X is a tip: it counts 1.
        (t (+ (count-tips (car x))     ;;; X is a node: sum the counts
              (count-tips (cdr x)))))) ;;; of the left and right subtrees.

The function above does the computation ``on the way up.''

Here is a version that does it ``on the way down.''

(defun count-tips (x temp)
  (cond ((atom x) (+ 1 temp))          ;;; X is a tip: add its count to temp.
        (t                             ;;; Otherwise, x is a node:  Read the
                                       ;;; nest of two calls below inside-out:
                                       ;;; Count the tips in (car x) and add
                                       ;;; that into temp.  Use the result as
                                       ;;; temp when you count the tips in
                                       ;;; (cdr x).
         (count-tips (cdr x)
                     (count-tips (car x)
                                 temp)))))

This function has one tail recursive call.

Let x be the tree

          *
         / \
        /   \
       *     *
      / \   / \
     a   * e   *
        / \   / \
       b   * f   g
          / \
         c   d

which we can write as '((a b c . d) e f . g). Then (count-tips x) is 7 (here we mean to use the first version of the definition). Similarly, (count-tips x 0) is also 7 (here we mean to use the second version).

Mutual Recursion

ACL2 requires that subroutines be defined before they are used in other definitions. So what do you do if you have two functions, f and g, and f calls g and g calls f? You define them together. This is called mutual recursion.

The example below shows a very inefficient way to factor a number into a list of primes.

(mutual-recursion

(defun factors (n)

; Collect the prime factors of n.

  (factors-below n (- n 1)))

(defun factors-below (n m)

; Return the prime factors of n that are less than or equal to m.

  (cond ((or (zp m)
             (equal m 1))
         (list n))
        ((integerp (/ n m))
         (app (factors (/ n m))
              (factors m)))
        (t (factors-below n (- m 1)))))
)

We can read this as follows. To determine the factors of n, call factors-below on n and n-1 to collect all the factors of n less than or equal to n-1. To find all the factors of n less than or equal to m, consider three cases. First, if m is 0 or 1, return the list containing just n. Second, if n/m is an integer (i.e., m divides n), then use app to concatenate the factors of n/m to the factors of m. Third (and otherwise), find the factors of n below m-1.

Compiling and Stack Overflow

We can run factors. For example (factors 12) returns (2 2 3). However, if we do

ACL2 p!>(factors 1404)

Error: Invocation history stack overflow.
Fast links are on: do (si::use-fast-links nil) for debugging
Error signalled by OR.
Broken at COND.  Type :H for Help.
ACL2>>:q

we get a stack overflow. Why? Because factors-below tries to factor 1404 by recursively considering all the natural numbers from 1403 on down. The error state we just entered is not ACL2 but the underlying Common Lisp engine that ACL2 is using. To get out of this error state we have to abort from the ``break.'' In Gnu Common Lisp, which we typically use, we type :q to the ``ACL2>>'' prompt above. This returns us to ACL2's command loop.

This raises another question. How can we compile this function so that it uses less stack space to execute?

The answer is to issue the command

ACL2 p!>:comp t

This causes ACL2 to compile all uncompiled functions.

Now if we do (factors 1404) it runs for a while and eventually returns (2 2 3 3 3 13).

Of course, factoring numbers is hard. If we type (factors 2775318) it runs for a long time, calls the garbage collector about two hundred times, and eventually returns (2 3 7 13 13 17 23). The algorithm we have coded up is not very efficient.

Returning and Using Multiple Values

Here is an example of a function that produces and uses multiple values. The function explores a binary tree and counts (in its second argument, nodes) how many cons nodes it sees and counts (in its third argument, tips) how many tips it sees.

(defun node-and-tip-count (x nodes tips)
  (cond
   ((atom x)                ;;; X is a tip.  Return a vector (list) of
    (mv nodes               ;;; two results: the number of nodes seen
        (+ 1 tips))         ;;; and the number of tips seen (plus 1 for x).
    )
   (t                       ;;; X is a cons node.
      
      (mv-let (nodes-in-car tips-in-car)
              (node-and-tip-count (car x) nodes tips)

              ;;; The two lines above locally bind the variables
              ;;; nodes-in-car and tips-in-car to the number of
              ;;; nodes and tips, respectively, computed by the
              ;;; recursive call of this function on (car x)
              ;;; and the current running counts of nodes and tips.

              ;;; Then, below, we count the number of nodes and
              ;;; tips in (cdr x), starting with the counts for
              ;;; the car (and adding one for the node x).

              (node-and-tip-count (cdr x)
                                  (+ 1 nodes-in-car)
                                  tips-in-car)))))

For example,

(node-and-tip-count '((a . b) . (c . d)) 0 0)
= (3 4)

Obviously, if you had a tree x and you wished to let n be the number of nodes in it and m be the number of tips, you could write:

(mv-let (n m)
        (node-and-tip-count x 0 0)
        ...)

and fill in the ellipses with whatever you want to do with n and m.