Search-engine friendly clone of the ACL2 documentation.

Start-here

Gentle-introduction-to-ACL2-programming

A Gentle Introduction to ACL2 Programming

by

J Strother Moore

Abstract

ACL2 is a logic and programming language in which you can model computer systems, together with a tool to help you prove properties of those models. “ACL2” denotes “A Computational Logic for Applicative Common Lisp”. The ACL2 programming language is a subet of side-effect free Common Lisp. Mathematically speaking, all ACL2 programs are functions. When two calls of a function supply the same objects as inputs the calls return the same results. If all you want to do is define ACL2 functions and run them, you don't need to know how to use ACL2 as a theorem prover. This introduction is all you need.

Getting Started

If you haven't installed ACL2 on your machine, do so, by following the instructions at “Obtaining, Installing, and License” on the ACL2 Home Page. As those instructions make clear, to run ACL2 you'll need a Common Lisp implementation. The instructions name several suitable ones.

In addition to the installation instructions, the Home Page has a wealth of documentation. You should visit “The User's Manuals”. There you'll see several versions of the manual, depending on whether you want information about utilities developed by users. But we recommend that newcomers just look at the basic “ACL2 User's Manual”. Explore it briefly just so you know how to find more information. For example, type “defun” into the Jump to box and hit return. You'll see details of how to define functions and there are many links in that discussion to related topics. But don't try to understand defun by reading all that. The manual is basically for users who are looking for technical information about features they basically know how to use. Having learned how to explore the manual, put it aside for now and focus on this document!

The interface to the ACL2 system is a read-eval-print loop. It prints a prompt. You type some command and hit return. It reads and evaluates your command. It prints the results. And then it prompts you for your next input.

You could use ACL2 with no other interface. But most users prefer an enviroment in which they can prepare a command by editing the text before submitting it. We use an Emacs shell buffer for that. Other users use ACL2s, which is an Eclipse plug-in. This guide doesn't discuss the interface further and we just assume you can submit commands and see the output. The examples we provide are from our Emacs interface.

ACL2 is implemented (largely) in ACL2. That is, almost all the system code is written in the ACL2 subset of Common Lisp. The examples provided here were prepared under ACL2 Version 8.5. Your output should be logically equivalent though the text may look a little different.

Now fire up your ACL2. You should see a prompt that looks like this

ACL2 !>

Type “:program” and a return.

ACL2 !>:program 
ACL2 p!>

The :program command tells ACL2 that subsequent definitions should merely be treated as ACL2 programs and not admitted into the logic. (To admit a program into the logic requires proving that the program terminates. We do not want to deal with proofs of any sort in this guide, so we'll stay in :program mode.)

Now type “(+ 2 2)”. You should see the answer, 4 printed on the next line. This illustrates the basic behavior of ACL2's read-eval-print loop. The display below shows several other examples too.

ACL2 !>:program 
ACL2 p!>(+ 2 2) 
4 
ACL2 p!>(+ 1 2 3 4) 
10 
ACL2 p!>(car '(hello world)) 
HELLO 
ACL2 p!>(cdr '(hello world)) 
(WORLD) 
ACL2 p!>(cons 'hello '(world)) 
(HELLO WORLD) 
ACL2 p!>

Sometimes computations cause errors, most commonly due to type violations or resource limitations. See the sections Type Errors and Stack Overflow below.

Here is an ACL2 program that takes a linear list of numbers and constructs a linear list of their squares. We've annotated it with Lisp comments. ACL2 ignores the comments. We'll explain the syntax later. For now we'll just interpret what this whole definition means.

(defun square-all (x) 
 
; Make a list of the squares of the elements of the list x. 
 
  (cond 
   ((endp x)                       ; If x is empty, 
    nil)                           ; return the empty list. 
 
   (t                              ; Otherwise, 
    (cons (* (car x) (car x))      ; square the first element and 
          (square-all (cdr x)))))) ; cons it onto the front of 
                                   ; the squares of the rest.

If you aren't sure what a “linear list” is just hang in and we'll get to it. But you've probably figured out that nil is the empty list, that cons constructs a list by “adding” an element to the front of another list, that car returns the first element of a non-empty list, and that cdr returns all but the first element of a non-empty list. Those deductions are right, as far as they go.

Here is what it looks like if you submit the square-all definition to the read-eval-print loop. We have omitted most of the previous commands and their output, and we have omitted the comments we included in the display above.

ACL2 !>:program 
... 
ACL2 p!>(cons 'hello '(world)) 
(HELLO WORLD) 
ACL2 p!>(defun square-all (x) 
  (cond ((endp x) nil) 
        (t (cons (* (car x) (car x)) 
                 (square-all (cdr x)))))) 
 
Summary 
Form:  ( DEFUN SQUARE-ALL ...) 
Rules: NIL 
Time:  0.00 seconds (prove: 0.00, print: 0.00, other: 0.00) 
 SQUARE-ALL 
ACL2 p!>

Once square-all is defined we can run it. Here are some tests.

ACL2 p!>(square-all '(1 2 3 4 5)) 
(1 4 9 16 25) 
ACL2 p!>(square-all nil) 
NIL 
ACL2 p!>(square-all '(-9 3/4 #c(0 5))) 
(81 9/16 -25) 
ACL2 p!>

Data Types

ACL2 supports five very simple but rather abstract types of data:

numbers,
characters,
strings,
symbols (which includes the two Booleans T and NIL), and
pairs or “conses”.

Objects of each type are distinct. That is, no number is a character or a string or a symbol or a pair. Etc.

All of these objects are “first class” in the sense that you can pass them in as values to functions, return them as values from functions, and store them in lists and trees.

Many operators are provided for each type. Each operator is a function.

No ACL2 operator or function can modify, mutate, or otherwise alter the objects passed in as arguments.

Since you're probably more familiar with conventional programming languages than functional (side-effect free) ones, let's talk operationally for a moment. (This is a disservice to you! The operational view is more complicated than the abstract mathematical view!) ACL2 is implemented on top of a von Neumann architecture. So what happens “behind the scenes” when the ACL2 function cons is called on 4 and 5? Mathematically it returns the pair < 4, 5 > which is printed in ACL2 as (4 . 5). But operationally, some memory is allocated for a new object, then the representations of 4 and 5 are stored in two fields (called the car and cdr fields) of that object, that region of memory is effectively tagged to indicate that it represents a cons pair, and a pointer to the newly allocated object is returned as the result of the call of cons. When that result is printed by ACL2 it appears as (4 . 5). The italicized sentence above means nothing you can do in ACL2 will change the contents of that newly allocated memory. In most programming languages you could invoke some procedure or method to overwrite the contents of the car field of that object, changing it from 4 to 3. No such procedure or method exists in ACL2. The object returned by that call of cons will print as the pair (4 . 5) — will be the pair (4 . 5) — for as long as you have access to the pointer returned by that call of cons.

ACL2 expressions are just nests of function calls. For example (factors (+ x (* 3 y))) calls the function factors on the value of x+3y. (We define factors below to return the list of prime factors of its input.) The functions + and * are examples of functions that operate on numbers.

Before we discuss ACL2 expressions further we must discuss the important issue of how you write down constants of each of the five types.

A few examples suffice to illustrate the common ways to enter numeric constants: 123, 22/7 and #c(3 5). Note that ACL2 provides “infinitely precise” rational numbers like 1/3 and 22/7. These are not floating point numbers; we don't provide floating point numbers. The rational 2/3 could also be written 4/6 — they are two different ways of writing the same number. The ACL2 number #c(3 5) is more conventionally written as the complex number 3+5i. ACL2 also supports writing numeric constants in binary, octal and hexadecimal notation, but we do not use that here.

We will list some operations on numbers later.

Here are examples of how to write character objects: #\A and #\a and #\Space. Character objects are primarily used to build up strings. While ACL2 provides functions for manipulating character objects (e.g., to recover their ASCII codes) we won't be using them here.

Strings are delimited by the double quote mark, e.g., "This is an error message." Again, there are functions more manipulating strings but we won't use them in this guide.

This leaves us with symbols and conses. Two symbols are easy to understand because they have counterparts in most programming languages: T is a symbol but it is also ACL2's Boolean “true” object. NIL is a symbol and is ACL2's Boolean “false” object. When an ACL2 predicate is said to be false, we mean the predicate returns NIL. When an ACL2 predicate is said to be true, it usually means it returns T, but more precisely it means it returns anything but NIL. Oddly, Lisp and ACL2 also use the symbol NIL as the “empty list”.

Before we discuss other ACL2 functions however we must deal at length with symbols and conses. Many first-time Lisp programmers do not really understand the idea of symbols as data or the idea of lists and trees as data.

Symbols

Symbols are sequences of alphabetic letters, signs and digits. So X, X*, and ABC23 are symbols. So are CAR, IF, and TRUE-LISTP. Note that dashes and other signs do not “break” symbols into pieces. TRUE-LISTP is one symbol, not two symbols separated by a minus sign.

Technically, a symbol can contain any sequence of characters and there are rules for writing symbols down so as to distinguish them from other constants. Using certain “escape” conventions you can actually write symbols containing spaces and parentheses, symbols composed entirely of digits, and symbols with other “weird” characters. But we will not go into these conventions in this introduction. We will limit ourselves to symbols that “look like” what you probably expect. Most often, symbols start with an alphabetic character and then contain just alphabetic characters, signs and digits. The first space, newline or parenthesis you encounter while reading a symbol marks the end of the symbol and isn't part of it.

All programming languages (except binary machine code) have symbols at the syntactic level. Symbols are used to denote variables, functions, operations, keywords, etc., in the syntax of most programming languages. This is true in ACL2 too. But ACL2 is a little different in that it also provides symbols as data objects. That is, in addition to familiar data objects like numbers and strings, ACL2 provides symbols as a kind of data structure that can be manipulated. For example, a symbol can be converted to a string object, or decomposed into its constituent characters, or several symbols can be concatenated to create a longer symbol. But in this introduction we will just pass symbols around, compare them to other objects, and create lists (conses) containing symbols as elements.

As noted, T and NIL are symbols with special significance. They are our “Booleans” data objects and NIL is used as the “empty list”.

Symbols are more complicated than described above because in addition to their names they belong to “packages.” For example, there might be a symbol named ABS in a package named "MATH" and a symbol named ABS in a package named "VECTORS". There is always one “current package” in ACL2 — you can select any package as the current package. If "MATH" were the current package, then when you write ABS you denote the symbol ABS in the "MATH" package. If you mean the symbol ABS in the "VECTORS" package you would have to write VECTORS::ABS. We will not use packages here.

There is a special package called the "KEYWORD" package. The symbol ABS in the "KEYWORD" package can, of course, be written KEYWORD::ABS. But special syntactic sugar also allows us to write :ABS for that symbol.

When we write symbols, case is unimportant (unless that previously mentioned “escape” convention is used). When reading a symbol all characters are all uppercased. That is, ACL2 reads the symbol x as X. The symbol The is read as THE and True-Listp is read as TRUE-LISTP. Thus, the choice of whether to write a symbol in uppercase, lowercase, or some mixture of cases is just a stylistic one.

ACL2 p!>'(Here is a list of SymBols!) 
(HERE IS A LIST OF SYMBOLS!) 
ACL2 p!>'(This list contains a |Weird Symbol|) 
(THIS LIST CONTAINS A |Weird Symbol|) 
ACL2 p>

ACL2 provides ways to construct symbols from strings, ways to construct strings from lists of character objects, and ways to construct character objects from integers. But elementary ACL2 programs rarely construct new symbols, strings and characters dynamically. The symbols, strings and characters manipulated during most program executions are entered initially in the input data (e.g., by reading a command line or input file). Therefore, we do not bother to describe how to “create” symbols, strings and characters and none of our examples create these objects dynamically.

Lists

We first discuss how to write down “cons pairs” and “ list objects”. Later we will describe the operations on them. There are many ways to write the same list constant. This should not be surprising; the same thing is true of numeric constants. Consider the fact that 123, 000123, +0123, 246/2 and #b1111011 are all ways to write down the same numeric constant. The form you choose when writing a numeric constant is up to you and is often chosen to emphasize some aspect of the data. For example, I might pad an integer with insignificant 0s to emphasize that it fits in a field of six digits, or I might unnecessarily write a + sign on the only positive integer in a data set. Like numbers in other programming languages, lists are so common in ACL2 that there are many ways to write the same list constant but the choice of “style” is just that: stylistic.

Most programming languages support the idea of some kind of record structure with fields containing other objects. The only such record structure in ACL2 is the ordered pair or cons pair. A cons pair is just a pair of two objects. Sometimes we call a cons pair a “list pair” or a “dotted pair” or simply a “list.”

Any two objects may be put together into a cons pair. The cons pair containing the integer 1 and the integer 2 might be drawn as:

  * 
 / \ 
1   2

or might be written in Cartesian coordinate notation as <1,2>. But in ACL2 it is written as the object (1 . 2). This is a single cons object in ACL2, with two integer objects as constituents. The left-hand constitutent is called the car of the pair. The right-hand constituent is called the cdr.

The tree we might draw as

or write in coordinate notation as <1, <2, 3>>, in ACL2 is written the object (1 . (2 . 3)). The car of this object is the integer 1. The cdr of this object is the object (2 . 3).

Similarly,

     * 
    / \ 
   /   \ 
  *     * 
 / \   / \ 
1   2 3   4

or <<1, 2>,<3, 4>> is written ((1 . 2) . (3 . 4)). Suppose x is the cons object just mentioned. Then the car of x is the object (1 . 2) and the cdr is the object (3 . 4). Obviously, the car of the car of x is 1. The cdr of the car of x is 2, etc.

The notation we are using to write list objects is called dot notation. It is a straightforward translation of the familiar coordinate notation in which parentheses replace the brackets and a dot replaces the comma.

But there are other ways to write these conses. This should not be surprising. How many ways can you think of to write the number 5? There are an infinite number! 5, 05, 005, +000005, etc., not to mention possibly #B101 and others. Those are all just “syntactic sugar” for denoting one familiar object. The sugars above might be described as “you can add a leading 0” and “you can add a single leading plus sign on a positive number.”

Similarly, ACL2 has two rules that allow us to write cons pairs in a variety of ways. The first rule provides a special way to write trees like:

  * 
 / \ 
1   nil

namely, you can write (1 . nil) or you can write (1). That is: if a cons pair has NIL as its cdr, you can drop the “dot” and the NIL.

The second rule provides a special way to write a cons pair that contains another cons pair in the cdr. The rule allows us to write (x . (...)) as (x ...). That is, when the cdr is a cons, you can drop the dot and the pair of balanced parentheses following it.

Thus, the tree

  * 
 / \ 
1   * 
   / \ 
  2   * 
     / \ 
    3   nil

can be written in any of the following ways.

(1 . (2 . (3 . nil))) 
(1 . (2 . (3))) 
(1 . (2 3)) 
(1 2 3)

It can also be written in many other ways, e.g., (1 2 . (3 . nil)) and (1 . (2 3)).

Binary trees that terminate in nil on the right-most branch, such as the one above, are often called linear lists. The elements of a list are the successive cars along the “cdr chain.” That is, the elements are the car, the car of the cdr, the car of the cdr of the cdr, etc. The elements of the list above are 1, 2, and 3, in that order. If we let x denote that tree, i.e., let x be (1 2 3), then the car of x is 1. The cdr of x is the linear list (2 3). The car of the cdr of x is thus 2. The cdr of the cdr of x is the linear list (3). The car of the cdr of the cdr of x is 3. And the “third cdr” of x is nil.

The linear list (A (B C D) E) has three elements. B is not an element of that list, it is an element of an element.

The following is a linear list of pairs.

    * 
   / \ 
  *   \ 
 / \   \ 
A   1   * 
       / \ 
      *   \ 
     / \   \ 
    B   2   * 
           / \ 
          *   \ 
         / \   \ 
        C   3   nil

This list can be written ((A . 1) (B . 2) (C . 3)). In full dot notation it is written

((A . 1) . ((B . 2) . ((C . 3) . NIL))).

By the way, lists such as the one above are so common they have a name. They are called association lists or alists and are used as keyed tables. The list above associates the key A with the value 1, the key B with the value 2, etc. Later we show functions for manipulating alists.

Lists are often used to represent more elaborate data structures. For example, an “account” might consist of a name, an id number, a balance and a list of transactions, laid out like this:

        * 
       / \ 
      /   \ 
     /     \ 
    *       * 
   / \     / \ 
  /   \   /   \ 
name  id bal  trans

An example of an account object might be

(("John Smith" . 123456789)    ; Name and id 
 . 
 (1254 .                       ; balance in cents 
   ((DEPOSIT (1 21 1999) 1000) ; transactions 
    (WITHDRL (1 30 1999)  850) 
    (WITHDRL (2  5 1999)  215))))

Rather than refer to these four components in terms of the car and cdr of the account object most programmers would define functions with application-specific names like account-name, account-id, etc. Instead of building each account object with a nest of cons expressions they would define (make-account name id bal trans) to do the consing.

Expressions

We've seen that the basic idiom for defining a new function in ACL2 is to write

(defun f (v1 ... vn) body)

where f is the name of the function, the vi are its input parameters, and body is an expression that computes the value of the function in terms of the parameters. The n parameters must be distinct variable names and we say that n is the arity of the new function f. But what are expressions?

With a few exceptions, an expression is either

a variable symbol,
a constant expression,
a call of a function of arity n on n argument expressions (sometimes called the actuals).

Variable symbols are written in the same way that symbol objects are written, except T, NIL, and keywords may not be used as variable symbols. Examples of variable symbols are thus X, k, temp3, and init-val.

There are two kinds of constant expressions: those that can just be written as ordinary objects and those that must be quoted. Character objects, strings, numbers, T, NIL, and keywords are of the first kind. They may be used as constant expressions without any additional signifying mark. All other symbols and all list objects must be preceded by a single quote mark (') to be used as constants in an expression. Examples of constant expressions of the first kind are thus #\Space, "EOF Error", 128, T, NIL, and :element. Constant expressions of the second kind include 'X, 'HELLO, and '(MON WED FRI).

Function calls are written by writing an open parenthesis, the name of the function and the n argument expressions, all separated by whitespace, followed by a close parenthesis. For example (cons (* (car x) (car x)) (square-all (cdr x))) is a function call of the function cons on two argument expressions. The first argument is a call of * on two expressions, both of which are calls of car on the variable symbol x. The second argument to the call of cons is (square-all (cdr x)), which is a call of square-all on a call of cdr on x.

Consider the list object (CAR X), which we can draw as

  * 
 / \ 
CAR \ 
     * 
    / \ 
   X  NIL.

This object can be written as (CAR X). But that is also the way we write the expression that denotes a call of the function CAR on X. So if we want to use that object in an expression we have to quote it to make it clear we mean the object rather than the expression.

We have said that an expression is either a variable, a constant, or a function call of a function symbol of arity n on n argument expressions. But there are exceptions to this simple characterization of expressions. One exception is that some functions appear to take varying numbers of arguments. For example, ACL2 allows + and * to take more than two arguments. So (+ (* x x) (* 2 x y) (* y y)) is allowed. We'll note other such functions when we use them here. Another exception is a special notation for nested if-then-else expressions described below. These exceptions are actually facilitated by a macro facility allowing the user to define new notation, but we will not discuss macros here even though we will use and explain some macros in our examples. Finally, ACL2 permits calls of unnamed functions, called lambda expressions, but we won't use them explicitly in this introduction. The reason we say “explicitly” is that we will use so-called let expressions to introduce some local temporary variables and ACL2 understands let expressions as applications of lambda expressions. We'll explain when we introduce let.

An example expression is

(cons (cons 'square-all (cons x nil)) 
      (cons '= 
            (cons (square-all x) nil)))

provided square-all has been defined as a function of arity 1. Note that case is actually unimportant in symbols, whether they are function or variable names or data objects. Unless special conventions are employed all symbols are read in uppercase.

Expressions have values computed with respect to an environment assigning values to variable symbols. For example, if square-all is defined as previously shown and x has the value (1 2 3) then the value of the expression above is

((SQUARE-ALL (1 2 3)) = (1 4 9)).

Here is the algorithm for determining the value of an expression in an environment. The value of a variable is just looked up in the environment. The value of a constant expression is the obvious object. The value of a call of a defined function is computed by (a) evaluating the n argument expressions to obtain a sequence of n objects, (b) creating a new environment in which the function's n parameters are assigned those respective objects, and then (c) evaluating the body expression of the function in that new environment. The value of a call of a primitive function, like * or cons, is determined by running special code on the values of the arguments.

For example, suppose we have added this definition

(defun sq (x) (* x x))

and we are in an environment where x has value 5. Then here is an annotated computation of the value of (sq (+ 1 x)).

(sq (+ 1 x)) 
=                ; evaluate argument of sq, when x = 5 
(sq 6) 
=                ; evaluate body of sq, when x = 6 
36.

Primitive Functions

The ACL2 system starts up with over eight thousand known functions, 32 of which are primitive (defined by special code rather than definitions). The rest are defined. Here is a brief summary of a few useful primitives. We have biased this list to encourage merely integer arithmetic and list processing. Some of the functions mentioned below are not actually primitive but are defined as simple compositions of primitives. For example, (natp x) is defined to be (and (integerp x) (<= 0 x)). Other “functions” mentioned below are actually macros that expand into primitive expressions.

(EQUAL x y) — t if x and y are the same object and nil otherwise
(IF x y z) — y if x is non-NIL and z if x is NIL. Put more colloquially, (IF x y z) means “if x is true, return y, otherwise return z”

(COND (p1 x1) (p2 x2) ... (T xn)) — an abbreviation for

(IF p1 
    x1 
    (IF p2 
        x2 
        (IF ... 
            ... 
            y)))

(AND x1 ... xn) — xn if each xi is non-nil, and nil otherwise
(OR x1 ... xn) — xi for the first xi that is non-nil, nil otherwise
(NOT x) — t if x is nil and nil otherwise
(NATP x) — t if x is a non-negative integer and nil otherwise
(INTEGERP x) — t if x is an integer and nil otherwise
(RATIONALP x) — t if x is a rational number and nil otherwise
(ZP x) — nil if x is a positive integer and t otherwise
(+ x1 ... xn) — sum of the xi
(* x1 ... xn) — product of the xi
(- x y) — x minus y
(/ x y) — quotient of x divided by y
(< x y) — t if x is less than y and nil otherwise
(<= x y) — t if x is less than or equal to y and nil otherwise
(CONSP x) — t if x is a cons pair and nil otherwise
(ATOM x) — t if x is a not cons pair and nil otherwise
(ENDP x) — t if x is nil or a not cons pair and nil otherwise
(CONS x y) — the ordered pair whose left-hand component is x and whose right-hand component is y
(CAR x) — left-hand component of x if x is a cons pair and nil otherwise
(CDR x) — right-hand component of x if x is a cons pair and nil otherwise
(CADR x) — (CAR (CDR X))
(CDDR x) — (CDR (CDR X))
(LIST x1 x2 ... xn) — (CONS x1 (CONS x2 ... (CONS xn NIL)))
(LIST* x1 x2 ... xn) — (CONS x1 (CONS x2 ... xn))
(SYMBOLP x) — t if x is a symbol and nil otherwise
(STRINGP x) — t if x is a string and nil otherwise
(CHARACTERP x) — t if x is a character and nil otherwise

You can learn more about these functions by executing them on some sample data. For example, the value of (AND T 23 33) is 33, (+ 1 2 3) is 6, (ZP 0) is T but (ZP 3) is false (i.e., NIL). (LIST 1 (+ 2 2) (+ 3 3)) has the value (1 4 6) but (LIST* 1 (+ 2 2) (+ 3 3)) has the value (1 4 . 6).

Perhaps more interestingly, the value of (EQUAL (CONS (- 6 1) 7) (CONS 5 (+ 5 2))) is T. “Behind the scenes” those two calls of CONS return two different pointers to two different newly allocated regions of memory, but the car and cdr fields of both objects are the same, namely 5 and 7 respectively. ACL2 cannot tell those two pointers apart. Operationally, EQUAL checks equality recursively down to the tips of both objects. But logically, both CONS expressions here return the pair (5 . 7).

Common Patterns of Recursion

Most of your function definitions in ACL2 will fall into certain common schemes. Here is an informal description of the three most common schemes. We define the function visit several different ways to illustrate common recursions. (In ACL2 you are not allowed to define a function more than one way; there is no “overriding,” “hiding,” or “overloading.” These multiple definitions are just a device for illustrating a lot of definitions without burdening you with many different names.)

Visiting every element of a linear list

(defun visit (lst ...) 
  (cond ((endp lst) ...)              ; No elements left to process. 
        (t .                          ; Some elements to process: 
            . 
             . 
              (car lst)               ; do something with this element 
 
              (visit (cdr lst) ...)   ; and visit the rest. 
             . 
            . 
           .)))

We illustrate this scheme in the next section.

Visiting every node and leaf in a binary tree

(defun visit (tree ...) 
  (cond ((atom tree) ...)             ; The tree is a leaf: do something. 
 
        (t .                          ; The tree is an interior node: 
            . tree                    ; do something with it, 
             . 
               (visit (car tree) ...) ;  visit nodes in left branch, and 
 
             . (visit (cdr tree) ...) ;  visit nodes in right branch. 
            . 
           .)))

Visiting every element of a linear list is just the special case of exploring a binary tree in which we visit only the successive subtrees along the cdr chain and consider the car of each such subtree as the “data,” or element, to be processed. We illustrate this recursion below.

Visiting all the natural numbers from n to 0

(defun visit (n ...) 
  (cond ((zp n) ...)                  ; N “is” 0: finish up! 
        (t .                          ; N is a positive integer: 
            . n                       ; do something with it, and 
             . 
              (visit (- n 1) ...)     ; visit the naturals below it. 
             . 
            . 
           .)))

The reason “is” is in quotation marks above is that (zp n) is true if either n is 0 or n is not a natural number. For example, (zp -3) and (zp 1/3) both return true! So when (zp n) is true you don't really know n is 0 but you do know there is nothing more to do, if you're thinking of n as a natural number. When (zp n) is false, you know n is a positive integer. The predicate zp is just a convenient way to recur through natural numbers and guarantees to halt the recursion even if you call the function on “unexpected” input, whether numeric or not.

Computing Results

When you define a recursive function you often have the choice of computing the results “on the way up” or “on the way down.”

Suppose x is a linear list of numbers and you wish to sum them. Here are two ways:

(defun sum-up (x) 
  (cond ((endp x) 0) 
        (t (+ (car x) (sum-up (cdr x)))))) 
 
(defun sum-down (x temp) 
  (cond ((endp x) temp) 
        (t (sum-down (cdr x) (+ (car x) temp)))))

For example (sum-up '(1 2 3 4)) is 10 and (sum-down '(1 2 3 4) 7) is 17. Both definitions fit within the “visiting every element of a linear list” recurrence scheme.

Note that sum-up takes a list of numbers and returns their sum while sum-down takes a list of numbers and some initial value and adds the numbers in the list to the initial value. Of course, (sum-up x) is equal to (sum-down x 0) — but only because the operation of addition is insensitive to the order in which the additions are done.

In many applications, the sum-up style is clearer, as it is obvious how each element of the list is contributing to the final value. In particular, each element is handled in exactly the same way regardless of what elements came before. In the sum-down style, the value of the temporary variable can, in principle, be used to affect the processing of subsequent elements. Information about the previous elements is being passed down. In more complicated functions written in the sum-down style, one must carefully inspect how temp is used to determine if it is just the final answer or is being used to direct the computation.

Of course, sometimes information about the elements already visited is necessary to decide how to process subsequent ones, in which case the sum-down style might be the appropriate choice.

Finally, when these two functions are compiled in a straightforward way, the recursion in sum-up may a call stack while the recursion in sum-down does not: sum-down is tail recursive. This means that the value of the recursive call is returned as the final answer and there is no need for the execution engine to “remember” to come back after the call to “do something.” In sum-up the execution engine must “remember” to add the two intermediate results together. Tail recursive calls can be compiled as simple jumps or “gotos” and often result in faster executions and use of no additional stack space. But all such considerations depend critically on the sophistication of the compiler.

Saving Intermediate Results

ACL2 is applicative so it does not have assignment statements. But it is possible to use “local variables” to save intermediate results. Here is a function that computes x^4 + y^4, i.e., the sum of the fourth powers of x and y.

(defun x4y4 (x y) 
  (let ((x2 (* x x))            ; Let x2 be x^2 and, “simultaneously,” 
        (y2 (* y y)))           ; let y2 be y^2. 
    (+ (* x2 x2)                ; Compute x^4 by squaring x^2 and 
       (* y2 y2))))             ; add it to y^4 computed similarly.

For example, (x4y4 2 3) is 16 + 81 or 97. The let expression has two parts, a list of variable bindings followed by a body. The bindings are written as a list of elements. Each element lists a variable name and a term. The terms are evaluated in parallel and then all the variables are bound to the values of the corresponding terms. Then the body is evaluated. The value of the body is the value of the let.

Let expressions are formalized in ACL2 as applications of anonymous functions. For example, (let ((v1 a1) (v2 a2)) (g v1 v2)) is actually transformed into ((lambda (v1 v2) (g v1 v2)) a1 a2) where the lambda expression is an anonymous function with two parameters, v1 and v2, and body (g v1 v2). The terms a1 and a2 that determine the values of the local variables in the let become the argument expressions on which the anonymous function is called.

While let binds its variables in parallel, let* binds its variables sequentially. The expression

(let* ((x2 (* x x))             ; Let x2 be x^2, and then 
       (x4 (* x2 x2))           ; let x4 be x2^2, and then 
       (x8 (* x4 x4)))          ; let x8 be x4^2. 
      x8)                       ; Return x8.

computes x^8.

Here is an example of a common use of let. Suppose you wish to find the length of the longest branchs in a binary tree. The function depth does that. It is an example of the “visit every node in a binary tree” scheme.

(defun depth (x) 
  (cond ((atom x) 0) 
        (t (let ((left-depth (depth (car x))) 
                 (right-depth (depth (cdr x)))) 
              (if (< left-depth right-depth) 
                  (+ 1 right-depth) 
                  (+ 1 left-depth))))))

Just to illustrate the definition and list notation, consider the tree

     * 
    / \ 
   /   \ 
  *     * 
 / \   / \ 
a   * e   * 
   / \   / \ 
  b   * f   g 
     / \ 
    c   d

The longest branches in this tree have length four and terminate in either c or d. This tree can be written

((a . (b . (c . d))) . (e . (f . g)))

or equivalently as ((a b c . d) e f . g).

The expression (depth '((a b c . d) e f . g)) has value 4.

We could have defined this function this way:

(defun depth (x) 
  (cond ((atom x) 0) 
        (t (if (< (depth (car x)) 
                  (depth (cdr x))) 
               (+ 1 (depth (cdr x))) 
               (+ 1 (depth (car x)))))))

But this would execute less efficiently because after recurring into both the car and cdr of x to determine which is deepest it recurs into one of them again to return the answer. The use of let eliminates this duplication of effort by saving the results of the earlier calls.

Still a third way to define depth might be

(defun depth (x) 
  (cond ((atom x) 0) 
        (t (+ 1 (max (depth (car x)) 
                     (depth (cdr x)))))))

where (max i j) is defined to be (if (< i j) j i). This definition executes almost as efficiently as the first one. The values of the two recursive calls are passed to the subroutine max. Max uses i twice and j twice but now that “duplication of effort” only requires looking up the values of variables, just as in the let case.

Sample Definitions

Functions on Linear Lists

The next several functions all illustrate the “visiting every element of a linear list” recurrence scheme.

(defun mem (e x) 
 
; Return t or nil according to whether e is an element of x. 
 
  (cond ((endp x) nil) 
        ((equal e (car x)) t) 
        (t (mem e (cdr x)))))

Equivalently:

(defun mem (e x) 
  (if (endp x) 
      nil 
      (if (equal e (car x)) 
          t 
          (mem e (cdr x)))))

or even

(defun mem (e x) 
  (and (not (endp x)) 
       (or (equal e (car x)) 
           (mem e (cdr x)))))

(Recall our caveat about multiple definitions of the same function. If you want to try one of these definitions in ACL2, fine. If you want to try all three, you'll have to give them distinct names.)

For all three of the above definitions, the following examples hold:

ACL2 p!>(mem 'd '(a b c d e f g)) 
T 
ACL2 p!>(mem 'd '(a b c   e f g)) 
NIL 
ACL2 p!>

Here is a function to concatenate two lists together:

(defun app (x y) 
  (if (endp x)       ;;; If x is exhausted, 
      y              ;;;    then return y 
      (cons (car x)  ;;;  else, cons the first element of x onto 
            (app     ;;;     the list obtained by recursively appending 
             (cdr x) ;;;     the rest of x 
             y))))   ;;;     to y.

Here are some examples of app.

ACL2 p!>(app '(1 2 3) '(a b c d)) 
(1 2 3 A B C D) 
ACL2 p!>(app '(1 2 3) nil) 
(1 2 3) 
ACL2 p!>(app (app '(1 2 3) '(4 5 6)) '(7 8 9)) 
(1 2 3 4 5 6 7 8 9) 
ACL2 p!>

Here is a function to find the first pair in an alist (if any) that binds a given key.

(defun lookup (key alist) 
  (cond ((endp alist) nil)             ;;; If alist empty, return nil, 
        ((equal key (car (car alist))) ;;; elseif the first pair contains key, 
         (car alist))                  ;;;    then return the first pair, 
        (t (lookup key (cdr alist))))) ;;; else look in the rest of alist.

Here are some examples.

ACL2 p!>(lookup 'b '((a . 1)(b . 2) (c . 3))) 
(B . 2) 
ACL2 p!>(lookup 'd '((a . 1)(b . 2) (c . 3))) 
NIL 
ACL2 p!>(lookup 'b '((a . 1)(b . 2) (b . 3))) 
(B . 2) 
ACL2 p!>

Note that only the first binding of the key matters.

Here is a function to “change” the value of a key in an alist. Note that it does not modify the alist but produces a new alist.

(defun store (key val alist) 
  (cond 
    ((endp alist)                 ;;; If alist is empty, 
     (list (cons key val)))       ;;;  then return an alist with one binding, 
    ((equal key (caar alist))     ;;; elseif the first pair binds key, 
     (cons (cons key val)         ;;;  then add the new binding to 
           (cdr alist)))          ;;;  the rest of the alist, 
    (t (cons (car alist)          ;;; else cons this pair onto the result 
             (store               ;;;  of recursively putting a new binding 
               key val            ;;;  of key to val in 
               (cdr alist))))))   ;;;  the rest of alist.

ACL2 p!>(store 'b 7 '((a . 1) (b . 2) (c . 3))) 
((A . 1) (B . 7) (C . 3)) 
ACL2 p!>(store 'x 26 '((a . 1) (b . 2))) 
((A . 1) (B . 2) (X . 26)) 
ACL2 p!>(store 'c 3 '((a . 1) (b . 2) (x . 26))) 
((A . 1) (B . 2) (X . 26) (C . 3))

It is important to recall that ACL2 is functional (side-effect free)! Despite its name, store does not change the alist it is given but returns a new one. Consider

ACL2 p!>(let* ((alist1 '((a . 1) (b . 2) (c . 3))) 
               (alist2 (store 'b 7 alist1))) 
         (list (lookup 'b alist1) 
               (lookup 'b alist2))) 
((B . 2) (B . 7)) 
ACL2 p!>

The above expression binds alist1 to an alist in which b is associated with 2. Then it uses store to bind b in alist1 to the new value 7 and calls the result alist2. Then it returns the list containing the values of looking up b in each of the alists. You might think the answer would be ((B . 7) (B . 7)) and it would be if store actually changed alist1 to create alist2. But in fact alist1 is not changed. The answer is ((B . 2) (B . 7)).

Functions on Binary Trees

Here is a function to count the number of tips of a binary tree. This function illustrates the “visiting every node of a binary tree” scheme.

(defun count-tips (x) 
  (cond ((atom x) 1)                   ;;; X is a tip: it counts 1. 
        (t (+ (count-tips (car x))     ;;; X is a node: sum the counts 
              (count-tips (cdr x)))))) ;;; of the left and right subtrees.

The function above does the computation “on the way up.”

Here is a version that does it “on the way down.”

(defun count-tips (x temp) 
  (cond ((atom x) (+ 1 temp))          ;;; X is a tip: add its count to temp. 
        (t                             ;;; Otherwise, x is a node:  Read the 
                                       ;;; nest of two calls below inside-out: 
                                       ;;; Count the tips in (car x) and add 
                                       ;;; that into temp.  Use the result as 
                                       ;;; temp when you count the tips in 
                                       ;;; (cdr x). 
         (count-tips (cdr x) 
                     (count-tips (car x) 
                                 temp)))))

This function has one tail recursive call.

Let x be the tree

     * 
    / \ 
   /   \ 
  *     * 
 / \   / \ 
a   * e   * 
   / \   / \ 
  b   * f   g 
     / \ 
    c   d

which we can write as '((a b c . d) e f . g). Then (count-tips x) is 7 (here we mean to use the first version of the definition). Similarly, (count-tips x 0) is also 7 (here we mean to use the second version).

Mutual Recursion

ACL2 requires that subroutines be defined before they are used in other definitions. So what do you do if you have two functions, f and g, and f calls g and g calls f? You define them together. This is called mutual recursion.

The example below shows a very inefficient way to factor a number into a list of primes.

(mutual-recursion 
 
(defun factors (n) 
 
; Collect the prime factors of n. 
 
  (factors-below n (- n 1))) 
 
(defun factors-below (n m) 
 
; Return the prime factors of n that are less than or equal to m. 
 
  (cond ((or (zp m) 
             (equal m 1)) 
         (list n)) 
        ((integerp (/ n m)) 
         (app (factors (/ n m)) 
              (factors m))) 
        (t (factors-below n (- m 1))))) 
)

We can read this as follows. To determine the factors of n, call factors-below on n and n-1 to collect all the factors of n less than or equal to n-1. To find all the factors of n less than or equal to m, consider three cases. First, if m is 0 or 1, return the list containing just n. Second, if n/m is an integer (i.e., m divides n), then use app to concatenate the factors of n/m to the factors of m. Third (and otherwise), find the factors of n below m-1.

We can run factors. For example (factors 12) returns (2 2 3). If we call (factors 123456789) the computation takes a while (27 seconds on my laptop) and returns (3 3 3607 3803). (Factors (- (expt 2 31) 1)) takes about 500 seconds and returns (2147483647).

Returning and Using Multiple Values

It is sometimes useful to define functions that return more than one result. Here is an example of a function that produces and uses multiple values. The function explores a binary tree and counts (in its second argument, nodes) how many cons nodes it sees and counts (in its third argument, tips) how many tips it sees. It returns both numbers as a list of length 2 (without actually creating the intermediate lists).

(defun node-and-tip-count (x nodes tips) 
  (cond 
   ((atom x)                ;;; X is a tip.  Return a vector (list) of 
    (mv nodes               ;;; two results: the number of nodes seen 
        (+ 1 tips))         ;;; and the number of tips seen (plus 1 for x). 
    ) 
   (t                       ;;; X is a cons node. 
 
      (mv-let (nodes-in-car tips-in-car) 
              (node-and-tip-count (car x) nodes tips) 
 
              ;;; The two lines above locally bind the variables 
              ;;; nodes-in-car and tips-in-car to the number of 
              ;;; nodes and tips, respectively, computed by the 
              ;;; recursive call of this function on (car x) 
              ;;; and the current running counts of nodes and tips. 
 
              ;;; Then, below, we count the number of nodes and 
              ;;; tips in (cdr x), starting with the counts for 
              ;;; the car (and adding one for the node x). 
 
              (node-and-tip-count (cdr x) 
                                  (+ 1 nodes-in-car) 
                                  tips-in-car)))))

For example,

(node-and-tip-count '((a . b) . (c . d)) 0 0) 
= (3 4)

Thus, the tree in question has 3 cons nodes and 4 (non-cons) tips.

If you want to use multiple values you should read the ACL2 documentation on that topic.

Type Errors

The ACL2 programming language is untyped. Objects have types that can be checked at runtime with predicates like integerp, consp, etc., but there is no syntactic typing. Any function can be applied to any object and ACL2 comes up with some value. It does this by defaulting ill-typed inputs to default values of the expected type. For example, + expects its arguments to be numbers. If you supply a non-numeric input to + it will use 0 instead. Thus, the value of (+ 3 T) is 3. Car and cdr expect their arguments to be either a cons pair or nil. (A special case in Lisp is that car and cdr return nil on nil.) If called on objects of the wrong type, car and cdr default the argument to nil.

But when you start up ACL2, the read-eval-print loop is configured to check types as it evaluates. Errors are signaled if types are violated. However, if you want it to just plow into ill-typed data and get whatever result the defaults produce, you can use the :set-guard-checking nil command as shown in the session below.

ACL2 p!>(+ 2 2) 
4 
ACL2 p!>(+ 2 T) 
 
 
ACL2 Error in TOP-LEVEL:  The guard for the function call (BINARY-+ X Y), 
which is (AND (ACL2-NUMBERP X) (ACL2-NUMBERP Y)), is violated by the 
arguments in the call (BINARY-+ 2 T). 
See :DOC set-guard-checking for information about suppressing this 
check with (set-guard-checking :none), as recommended for new users. 
To debug see :DOC print-gv, see :DOC trace, and see :DOC wet. 
 
ACL2 p!>:set-guard-checking nil 
 
Masking guard violations but still checking guards except for self- 
recursive calls.  To avoid guard checking entirely, :SET-GUARD-CHECKING 
:NONE.  See :DOC set-guard-checking. 
 
ACL2 p>(+ 2 T) 
2 
ACL2 p>(CAR 'ABC) 
NIL 
ACL2 p>

Note that the prompt changes after the :set-guard-checking command above. The exclamation mark (!) in the standard prompt indicates that types are being checked.

Stack Overflow

Sometimes a computation will consume more space than your machine has. For example, the following function constructs a list of n nils.

(defun nils (n) 
  (if (zp n) 
      nil 
      (cons nil (nils (- n 1)))))

We can run it on small values of n.

(nils 10) = (nil nil nil nil nil nil nil nil nil nil)

So that we don't have to read long lists of nils below we just take the length of the result with the ACL2 function len.

ACL2 p!>(len (nils 10)) 
10 
ACL2 p!>(len (nils 100)) 
100 
ACL2 p!>(len (nils 1000)) 
1000 
ACL2 p!>(len (nils 10000)) 
10000 
ACL2 p!>(len (nils 100000)) 
100000 
ACL2 p!>(len (nils 1000000)) 
1000000 
ACL2 p!>(len (nils 10000000)) 
*********************************************** 
************ ABORTING from raw Lisp *********** 
********** (see :DOC raw-lisp-error) ********** 
Error:  Stack overflow on value stack. 
... 
:q 
ACL2 p!>

But that last test, where we attempt to produce a list of ten million nils, causes a stack overflow in the Common Lisp configuration I'm running. (If your Lisp causes a Lisp error and enters an interactive break you'll need to exit the break to return to ACL2. Exactly how you do that depends on what Common Lisp you're running. For example, in GCL and CCL, you type :q followed by return, but in Allegro CL you type :reset followed by return, and in CMU CL you type q followed by return.)

If you are interested in doing ``large'' calculations with ACL2 you will have to learn about how to compile definitions and how to use declarations to optimize code. Those topics are beyond the scope of this document.

However, often you can just code up a more efficient algorithm. For example, the following tail-recursive version of nils won't overflow the stack.

(defun nils (n ans) 
  (if (zp n) 
      ans 
      (nils (- n 1) (cons nil ans))))

So now

ACL2 p!>(len (nils 10000000 nil)) 
10000000 
ACL2 p!>(len (nils 100000000 nil)) 
100000000 
ACL2 p!>(len (nils 1000000000 nil)) 
1000000000 
ACL2 p!>

Of course, it will eventually use up all the memory.

The tests above sort of suggest we're interested in the question “Does (nils n nil) always return a list of length n?” If that is the question, then (a) you'll never answer it by running all possible tests, and (b) you are using exactly the right tool!

The ACL2 system includes a theorem prover. Below we show the commands you need to execute to answer the question in the affirmative. We've elided the output from these commands.

ACL2 p!>:logic 
ACL2 !>(verify-termination nils) 
... 
ACL2 !>(defthm lemma-about-nils 
        (implies (natp n) 
                 (equal (len (nils n ans)) 
                        (+ n (len ans))))) 
... 
ACL2 !>(defthm main-theorem-about-nils 
        (implies (natp n) 
                 (equal (len (nils n nil)) n))) 
...

That is, switch back into logic mode, admit nils to the logic by proving termination, prove the general theorem that (nils n ans) returns a list whose length is n plus the length of ans, when n is natural number, and then prove the main theorem. Your job as an ACL2 “theorem prover driver” is to think of the commands. The prover does the work to complete each of those steps.

So maybe you should learn to prove theorems about your functions? If so, see “Recursion and Induction”.

Conclusion

You should be ready at this point to use ACL2 as a programming language to implement lots of simple list processing algorithms. Remember to go to the ACL2 User's Manual on the ACL2 Home Page to learn about other functions and features. If you need advice, check out the ACL2 Help list under the Mailing Lists link. The users who have volunteered to help are very responsive.

One final piece of advice: it is always best to code the simplest algorithm you can. Do not needlessly complicate your code. Optimize for performance only after confirming that the straightforward implementation is too inefficient for your application.