Lisp Style and Efficiency

Gordon S. Novak Jr., Department of Computer Sciences, Univ. of Texas at Austin

Introduction

Suppose that you were taking a group of people to dinner at a restaurant and were given a long wine list from which to choose wine for the dinner. Suppose also that no prices were listed for any of the wines. You might be able to guess that the Gallo Hearty Burgundy would probably be affordable, while the Chateau Lafite Rothschild 1961 might do serious damage to your budget when bought in quantity. Someone who knew nothing about wines, however, might order the most expensive wines without knowing it, and then leave with the impression that the restaurant was overpriced.

Lisp is somewhat like the restaurant just described. The Lisp manual presents a long list of system functions, but without any performance information. The Lisp programmer who is inexperienced with efficiency issues may choose ways of implementing a program that sound reasonable, even elegant, but are breathtakingly expensive. This individual may then abandon the use of Lisp, condemning it as elegant perhaps, but too inefficient for ``real-world'' use. The author has often heard, ``I like Lisp, but I'm writing my application in C for efficiency.''

The problem is compounded by the fact that many of the Lisp functions needed for writing more efficient code are not taught in beginning Lisp courses, or are labelled as ``dangerous'' or in poor taste, or all of the above.

The goals of this short introductory exercise are to present some basic rules about efficiency of programs and to present some standard Lisp idioms that are used in writing efficient code. Readability and elegance of structure are worth more than ``saving a few microseconds'' in many cases. On the other hand, coding style that increases the computational complexity of a program can make it impossible to solve the desired problem, or make the program so slow that it is not usable in practice. For motivation, we have provided some small functions, written with only minor differences in the ways they are coded, but with striking differences in performance. After reviewing this chapter, we hope the student will be better able to choose from the long wine list presented by Lisp.

Basic Concepts of Program Performance

The basic concepts of program performance are simple: The total time required to execute a program is the sum of the times required to execute its parts. When a calculation that takes a fixed amount of time is repeated, the total time required is the time for one calculation multiplied by number of repetitions.

In general, the majority of the time required for execution of a program is spent in loops. Efforts to improve efficiency should therefore be concentrated on the code that executes inside loops, especially the innermost loops.

Computational Complexity of an Algorithm

Computer scientists speak of the computational complexity of an algorithm. This is usually expressed as the order of magnitude of the function f(n) describing the number of computation steps required to process an input of size n, written O(f(n)). O(f(n)) tells how the cost of executing the algorithm grows with the size of the data set to which it is applied, as that size becomes large; conventionally, n denotes the size of the data set, and the growth is considered as n approaches infinity. The notation omits any constant multiplier and lower-order terms. Many simple computations are O(n), or ``order n''; these take a fixed amount of time per data item. More complex computations are O(n * log(n)), ``order n-log-n''; sorting a set of n items by comparisons, for example, takes at least n-log-n time. Computations that are O(n^2), O(n^3), and so forth, are called polynomial time algorithms. Some of the worst problems are those that are NP-complete; the best algorithms known for these take exponential time. Unfortunately, A.I. involves many such problems; many search procedures, for example, solve NP-complete problems. While it may take exponential time to find a solution (say, a proof of Fermat's Last Theorem), it may take only polynomial time to check a solution (verify that a given proof is correct).
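
To see how quickly these functions grow, compare the approximate number of steps for a few values of n (logarithms base 2):

   n           n log n          n^2            2^n
   10               33            100          about 10^3
   100             664         10,000          about 10^30
   1,000         9,966      1,000,000          about 10^301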

Importance of Computational Complexity

Sometimes programmers fail to appreciate the importance of efficiency. The attitude might be ``So my program is a little less efficient than it could be. So what? This is only a prototype.'' or ``Isn't elegance more important than efficiency?'' If the kind of inefficiency one is talking about is a constant factor, e.g., if Program A always takes twice as long to run as Program B, these attitudes may be justified. However, what if Program A is O(n^2) , while Program B is O(n) ? This is a very different matter. If n will always be small, it may not make much difference, but if n is large, Program B might finish in minutes, while Program A might not finish in a lifetime. Clearly, then, the complexity of a computation is a kind of efficiency we must worry about.

A complaint that is often heard about A.I. programs is that ``they don't scale''. That is, a program runs fine on a ``toy'' test case, but fails to run on the big case that is really of interest. Such complaints are often justified. What they mean, in many cases, is that the complexity of the computation is high; this can cause the behavior of working on small data sets but not on large ones. To write successful A.I. programs in Lisp, it is necessary to know how to avoid increasing the complexity of a computation unnecessarily; fortunately, this is something that can easily be learned.

The Bermuda Triangle

The Bermuda Triangle area off the coast of Florida has been the subject of many legends regarding ships that sailed off into the Triangle and never returned. Perhaps, then, this is a good metaphor for a kind of triangle that can cause a program to sail off and never return.

Suppose that there is a program loop that is repeated n times; within the loop is a computation that grows linearly with each pass through the loop. What is the order of the resulting computation? Consider the following functions that make a list of n integers from 0 to n - 1 ; copies of these functions can be found in the file bermuda.lsp .

(defun list-of-nums1 (n)
  (let (l)
    (dotimes (i n) (setq l (append l (list i))))
    l ))

(defun list-of-nums2 (n)
  (let (l)
    (dotimes (i n) (setq l (nconc l (list i))))
    l ))
(defun list-of-nums3 (n)
  (let (l)
    (dotimes (i n) (setq l (cons i l)))
    (nreverse l)))
These functions look much the same. The first version may be easiest to understand: it produces a list l that at any point in time is a list of numbers in the desired order. The second version is the same, but uses the ``destructive'' function nconc instead of append. The third version makes the list backwards using cons and nreverses it just before leaving; this sounds like extra effort! Load the file bermuda.lsp (compile it if that isn't hard to do) and try these functions with the argument n = 10. We can see that all versions produce the same result:
(list-of-nums1 10)   =   (0 1 2 3 4 5 6 7 8 9)

(list-of-nums2 10)   =   (0 1 2 3 4 5 6 7 8 9)

(list-of-nums3 10)   =   (0 1 2 3 4 5 6 7 8 9)

Now let's try some timings with a larger figure for n; we take the car of the result to avoid printing out the long list that is produced. Try these examples on your computer. (If your machine is not so powerful, or if you have a limited enthusiasm for learning things the hard way, you may wish to use a value smaller than 5000 for n.)
(time (car (list-of-nums1 5000)))   =   0
Run time:   319.20 s

(time (car (list-of-nums2 5000)))   =   0
Run time:    18.76 s

(time (car (list-of-nums3 5000)))   =   0
Run time:     0.14 s
Amazing! list-of-nums1 is some 2,280 times slower than list-of-nums3 on the test case where n is 5000! (The astute observer may note that this ratio is about half the size of n.) What about the case where n is 20 times as large?
(time (car (list-of-nums3 100000)))   =   0
Run time:     2.46 s
We see that list-of-nums3 takes about 20 times as long when n is 20 times as large; that is, list-of-nums3 is a linear or O(n) computation. We did not try list-of-nums1 with a value of 100000 for n: since list-of-nums1 is an O(n^2) computation, making n 20 times as large should make it about 400 times as slow, and 400 times 319 seconds is roughly 128,000 seconds, an estimated 35 hours. This is a serious difference in efficiency: one version runs in seconds, while another takes more than a day! Even if computer time were free, who wants to wait that long?

If we can get into this much trouble with a four-line function, imagine what could happen in a large A.I. application!

Now let's analyze the Bermuda Triangle and see what is happening. list-of-nums1 uses append, which appends lists to form a single list containing all of their elements. append is a safe function in the sense that it can never mess up any existing data structures; it achieves this safety by copying all of its arguments except the last, as illustrated in the following recursive version:

;; A two-argument sketch of append (illustration only: the real append
;; takes any number of arguments and may not be redefined in Common Lisp).
(defun append (a b)
  (if a
      (cons (car a) (append (cdr a) b))   ; copy each cell of a
      b))                                 ; reuse b without copying
Thus, if we call append with arguments '(A B) and '(C), it makes a copy of the list (A B) and reuses the list (C). We can see, therefore, that append contains a loop (whose length is equal to the length of the first list) and that it does a cons on each pass through the loop. After append makes a copy of its first argument, what happens to the original list? If nothing else points to it, it becomes garbage (unused storage) and eventually gets garbage collected. The Garbage Collector is a highly paid worker who charges for each cons cell collected! Let us analyze what happens as we execute the loop in list-of-nums1:
i     l := (append l (list i))

0     ()  +   (0)
1     (0)   +   (1)
2     (0 1)   +   (2)
3     (0 1 2)   +   (3)
4     (0 1 2 3)   +   (4)
5     (0 1 2 3 4)   +   (5)
6     (0 1 2 3 4 5)   +   (6)
7     (0 1 2 3 4 5 6)   +   (7)
8     (0 1 2 3 4 5 6 7)   +   (8)    the whole triangle
9     (0 1 2 3 4 5 6 7 8) ---------> becomes garbage
       |               |
       |    copied     |   +
       V               V
      (0 1 2 3 4 5 6 7 8)---->(9)
This diagram illustrates what happens in successive iterations of the program. Each time, append is called with the existing list l and a new list of the single item i. append makes a new copy of the existing list l, adding the new item at the end. The old list, since nothing points to it (l gets reset to the new list) becomes garbage. Only the last list that is made is used; it is returned as the value of list-of-nums1. All of the storage contained in the triangle above the last line is garbage. Since this triangle has a height of n and a base of n - 1, its area is n*(n-1)/2. We see that list-of-nums1 is of order O(n^2) (we don't consider the constant by which the n^2 is multiplied, or any lower-order terms, in determining its order) and that it should be about n/2 times as costly as list-of-nums3 if the cons operation is the major cost; our experimental results seem to bear this out.

What about list-of-nums2, which uses nconc instead of append? (See Winston & Horn, Chapter 17 for a discussion of nconc.) Its performance is better than list-of-nums1, but still much worse than list-of-nums3. What is its computational complexity? Let's make a diagram for list-of-nums2:

   i     l := (nconc l (list i))

   0     ()  +   (0)
   1     (0)   +   (1)
   2     (0 1)   +   (2)
   3     (0 1 2)   +   (3)
   4     (0 1 2 3)   +   (4)
   5     (0 1 2 3 4)   +   (5)
   6     (0 1 2 3 4 5)   +   (6)
   7     (0 1 2 3 4 5 6)   +   (7)
   8     (0 1 2 3 4 5 6 7)   +   (8)
   9     (0 1 2 3 4 5 6 7 8)

  start      walk to end, |  and rplacd
   here,                  V   +
    ---->(0 1 2 3 4 5 6 7 8)----->(9)
Since nconc is used, there is no garbage; no cons cells are wasted while list-of-nums2 runs. Why, then, does it take more time? The answer is that nconc must start at the front of its first argument, walk down to the end one cell at a time, and make a modification at the very end. Thus, list-of-nums2 must walk the triangle, and it too is of order O(n^2); even though it is more efficient than list-of-nums1, we still would not find it usable for large data sets. The lesson is that repeatedly walking down a growing list is itself a Bermuda Triangle: even without a single cons, the walking alone makes the computation O(n^2).

Finally, let's examine the diagram for list-of-nums3:

   i     l := (cons i l)

   0     0   +   ()
   1     1   +   (0)
   2     2   +   (1 0)
   3     3   +   (2 1 0)
   4     4   +   (3 2 1 0)
   5     5   +   (4 3 2 1 0)
   6     6   +   (5 4 3 2 1 0)
   7     7   +   (6 5 4 3 2 1 0)
   8     8   +   (7 6 5 4 3 2 1 0)
   9     9   +   (8 7 6 5 4 3 2 1 0)
We see a triangle here too. However, the program doesn't do anything with this growing list; it only conses a new element onto the front of it, without examining or manipulating the list. What the program does is represented by the leftmost part of the diagram, which is constant and linear. This produces the linear performance behavior that we observe. However, what about the backwards list that requires us to use nreverse at the end of the function? Isn't that extra work? In a sense, it is. However, nreverse is linear, that is, it takes time proportional to the size of its argument. Thus, consing the result onto the front of a list inside the loop and applying a single nreverse at the end keeps the whole computation O(n).

CONS is Expensive

cons (the ability to dynamically create new data structures, with automatic recycling of old, unused storage via garbage collection) is one of the most valuable features of Lisp, but it is costly in execution time. While there have been many attempts to reduce the cost of the Lisp garbage collector, the cost has stubbornly remained high. This means that cons, the function that allocates new storage that is likely to become garbage before long, must also be regarded as expensive. The true cost of cons must include not only the cost of the cons function itself, but also the ``back end'' cost of garbage collection later. As a rough estimate, on a machine on which a basic operation like integer addition takes one microsecond, think of cons as taking 100 microseconds.

Therefore, the ``eleventh commandment'' for Lisp programmers is: thou shalt not cons unnecessarily.

This does not mean that one should be paranoid about using cons; as in other areas of performance, the place to be careful is inside loops. This is especially true, as we have seen above, in the case of Lisp system functions that themselves contain loops in which conses are performed.

Backquote Does CONSes

The ``backquote'' facility is a shorthand way of specifying construction of list structure; it does conses each time the code is executed. Sometimes code that is ``more elegant'' requires more conses:
Two conses:                    No conses:

(equal x (list '- y))          (and (eq (car x) '-)
(equal x `(- ,y))                   (equal (cadr x) y))
The second form on the left, using backquote, is exactly equivalent to the first form, using list.

Functions that Copy

Sometimes conses are hidden inside other functions. append, for example, makes a new copy of all of its arguments except the last. reverse copies its argument. subst makes a new list structure, as needed. copy-tree makes a complete copy.

Many of the functions that copy list structure have counterparts that are ``destructive'', that is, they modify existing structure rather than making new structure. The motivation for using the ``destructive'' functions, of course, is that they do no conses. Often the ``destructive'' versions have names that begin with n, such as nreverse, nconc, and nsubst.
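
For reference, here are a few common copying functions and their destructive counterparts (an illustrative, not exhaustive, list):

;; Copying version             Destructive counterpart
(reverse lst)                  (nreverse lst)
(append a b)                   (nconc a b)
(subst new old tree)           (nsubst new old tree)
(remove item lst)              (delete item lst)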

Beginning Lisp courses often label these functions ``dangerous'' and discourage their use. However, there is a simple condition under which it is always safe to use these functions: it is safe to use a destructive function on a structure if there is only one pointer to that structure.

How can we know that there is only one pointer? The usual way is when the structure is constructed within a function and is assigned to a local temporary variable:
(defun list-of-nums3 (n)
  (let (l)
    (dotimes (i n) (setq l (cons i l)))
    (nreverse l)))
In this case, the variable l is local, that is, defined in a let or argument list of the function. Because of the lexical scoping of Common Lisp, no other function can access this variable; therefore, we can be assured that only the variable l in this function points to this list and that the use of nreverse is completely safe. consing a list in backwards order and nreverseing it when done is a standard Lisp idiom.
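
By contrast, here is a hypothetical sketch of an unsafe use: a second variable still points into the same cells, so destructively reversing the list leaves that variable looking at modified structure (the exact outcome is implementation-dependent).

(defparameter *data* (list 1 2 3))    ; hypothetical example variables
(defparameter *alias* *data*)         ; a second pointer to the same cells

(setq *data* (nreverse *data*))       ; destructive reverse
;; *data* is now (3 2 1), but in a typical implementation *alias*
;; still points at the old first cons cell and now sees only (1).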

Generate the Desired Result Directly

Sometimes beginning Lisp programmers find it easy to think about solving a problem in two stages: first, generate a data structure containing the desired result, but perhaps with some NIL values or extra structure (which when printed appears as extra parentheses); second, use another function to get rid of the unwanted NIL values and structure. However, this is not a very good approach: generation of two copies of the structure requires extra conses, filtering the structure takes extra time, and an extra function has to be written. Clearly, it is better to generate exactly the desired structure the first time. Two idioms can be helpful.

mapcan is a mapping function well suited to filtering a list: it maps a function over a given list and concatenates the resulting lists as if nconc were used. Suppose that we want to make a list of things that are pretty:

Original:                    Better:

(mapcar                      (mapcan
 #'(lambda (x)                 #'(lambda (x)
     (if (pretty x) x))            (if (pretty x) (list x)))
 lst)                          lst)
The mapcar form produces a list containing the pretty items, but with a NIL in place of each item that is not pretty; we would still have to filter out those NILs, at the cost of extra work and extra conses. The mapcan version produces a list of only the pretty items, which is just what we want. The rules for using mapcan are simple: the function applied to each element should return a list of the results to contribute, typically (list x) to keep an item and nil to discard it, and the lists it returns should be freshly consed, since mapcan splices them together destructively.
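
As a small self-contained example (not from the original discussion), mapcan can keep only the even numbers in a list:

(mapcan #'(lambda (x) (if (evenp x) (list x)))
        '(1 2 3 4 5 6))                       ; => (2 4 6)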

The second useful idiom, used when one needs to process a tree structure rather than a linear list, requires a bit more thought. mapcan is used mainly for filtering linear lists; it could, however, be used in a recursive function that traverses a tree. We will use as an example the task of computing the ``fringe'' of a tree, that is, making a list of the atoms at its terminal nodes in order; think of picking the apples from a tree while leaving the branches behind. For example,

(fringe '(((a) b ((c d (e)) f) (g)) h))
      =  (A B C D E F G H)
We can do this using mapcan as follows:
(defun mfringe (tree)
  (mapcan #'(lambda (x)
              (if (atom x) (list x) (mfringe x)))
          tree))
mfringe would be fine for many applications. With a large tree, however, it would suffer from the same problem of ``walking'' across an intermediate result that list-of-nums2 encounters. We might wonder whether there is a way of implementing this function that is analogous to list-of-nums3 and is similarly efficient; in fact, there is:
(defun fringe (tree) (nreverse (fringeb tree nil)))

(defun fringeb (tree basket)
  (if tree
      (if (atom tree) (cons tree basket)
          (fringeb (cdr tree)
                   (fringeb (car tree) basket)))
      basket))
In this version, we are doing two things: traversing the tree in the usual order of car first, then cdr, and carrying our basket of apples with us as we go. We use an auxiliary function, fringeb, which has the basket as a second argument; since fringeb will cons atoms onto the basket in backwards order, we nreverse the result before leaving fringe. The outer if of fringeb takes care of the case where the tree is nil; in that case, the result is the original basket. If tree is not nil, fringeb tests whether the tree is an atom; in Common Lisp, an atom is anything other than a cons cell. If the tree is an atom, the result is simply to cons it onto the front of the basket. Otherwise, we need to get the apples from the car half of the tree, then from the cdr half. To do this, we call fringeb recursively on the car part of the tree, passing down the same basket we were given as input; the result will be a new basket with all the atoms from the car of the tree consed onto the front of it. We use this result as the basket argument in the outer recursive call to add the atoms from the cdr part of the tree. Although the call to fringeb to process the cdr appears on the outside, the inner call to fringeb that processes the car will be executed first because it is an argument of the outer call.

The use of an ``extra'' function argument to hold a result that is being constructed is a frequently used idiom that helps in writing efficient Lisp code.
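
For instance, the same idiom can collect the squares of the numbers in a list without building any intermediate copies; a minimal illustrative sketch:

(defun squares (lst) (nreverse (squaresb lst nil)))

(defun squaresb (lst acc)              ; acc accumulates results backwards
  (if (null lst)
      acc
      (squaresb (cdr lst)
                (cons (* (car lst) (car lst)) acc))))

;; (squares '(1 2 3 4))  =>  (1 4 9 16)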

Use Control to Avoid Materializing Sets

Religions often have legends of gods who may assume material form on occasion to visit the earth, but whose preferred form of existence is not material. Materialization of a data set, that is, the creation of a new data structure containing all the desired data in exactly the desired form, is an expensive operation, especially if the data set is large. Instead of making a huge data structure containing all the data to be processed, it is usually more efficient to use control to process it a little bit at a time.

For example, suppose we wanted to print the names of things that are red, fast, and expensive from a catalog. We could write various set filters using mapcan:

(defun red-ones (l)
  (mapcan #'(lambda (item)
              (if (eq (color item) 'red)
                  (list item)))
          l))
Using this and similar functions, we could print the desired items:
(mapc #'(lambda (item) (print (name item)))
      (expensive-ones (fast-ones (red-ones catalog))) )
This works. However, it materializes three sets (the red things, the subset of those that are fast, and the subset of those that are expensive); all of these sets become garbage immediately after execution of the code that uses them. Instead, we could do:
(mapc #'(lambda (item)
          (if (and (fast item)
                   (red item)
                   (expensive item))
              (print (name item))))
      catalog)
This version does no conses. We have also ordered the tests to do the most unlikely test first. If few items are fast, more are red, and most are expensive, we will do fewer tests this way.

The example above is a simple one, since it involves filtering a single set. The same principle also applies to more complicated examples.

One reason that using control is cheaper than materializing data sets is the way storage is managed. Recursive function execution uses a stack discipline for storage allocation: when a function is entered, storage for its arguments and local variables is allocated on a stack; when the function exits, that storage is popped off the top of the stack. This is a very efficient method of storage allocation and reclamation compared to the use of cons and garbage collection.
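
For instance, a recursive search can examine candidates one at a time, keeping its only state in a stack-allocated argument rather than in consed lists (an illustrative sketch reusing the hypothetical predicates from the catalog example above):

(defun find-good-item (items)              ; items is bound anew on each call
  (cond ((null items) nil)                 ; catalog exhausted: no match
        ((and (fast (car items))
              (red (car items))
              (expensive (car items)))
         (car items))                      ; return the first matching item
        (t (find-good-item (cdr items))))) ; examine the rest; no conses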

Lisp Style

There is no commonly accepted style for writing Lisp; even experienced Lisp programmers sometimes disagree sharply about matters of style. Moreover, many Lisp programmers have strong ``religious'' beliefs that certain styles are better than others.

Our suggestions are largely based on experience with student programs and are intended to help students acquire better Lisp style. Many of the comments are also motivated by concern for efficiency.

Simplify Code

Strunk and White's dictum that ``vigorous writing is concise'' applies to programming as well. Use the simplest form to express each part of a program. Shorter code is almost always better.
Original:                              Better:

(cond ((not (okay x)) nil)      (if (okay x) (fn x))
      (t (fn x)))
These two forms are identical when executed; the if form will return NIL if the (okay x) test fails. The if form is much shorter and clearer.

Think Recursively

Experienced Lisp programmers use nesting of function calls to make code shorter and cleaner:
Original:                              Better:

(if (> grade 70)            (print (if (> grade 70)
    (print 'ok)                        'ok
    (print 'failing))                  'failing))
Local variables that are only used once can often be eliminated by directly substituting the code that computes their values.
Original:                              Better:

(setq price                    (incf total
  (get item 'price))                 (* (get item 'price)
(setq subtotal                          quantity))
  (* price quantity))
(setq total (+ total subtotal))
The macros push, incf, and decf save code because the ``destination'' argument only needs to be mentioned once.
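
For simple variable destinations (the variable names here are placeholders), each of these macros has the same effect as the setq form shown in the comment:

(push item stack)        ; same effect as (setq stack (cons item stack))
(incf counter)           ; same effect as (setq counter (+ counter 1))
(decf total price)       ; same effect as (setq total (- total price))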

Use Symbolic Data

When blocks of similar code are repeated, take it as a sign that the code should be restructured. Often, symbolic data can substitute for a great deal of code:
Original:                     Better:

(cond ((eq word 'one)  1)     (or (get word 'numval)
      ((eq word 'two)  2)         'infinity)
      ...
      ((eq word 'nine) 9)
      (t 'infinity))
The second form is faster in execution (no need to make all those tests), is easier to change without changing code (we could, for example, add numbers in Spanish with no changes to the code), and is compact and easy to read.
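
The numval properties used above are set up once as data rather than as code; a hypothetical initialization, using the same property name as the example, might look like this:

(dolist (pair '((one 1) (two 2) (three 3) (four 4)
                (five 5) (six 6) (seven 7) (eight 8) (nine 9)))
  (setf (get (first pair) 'numval) (second pair)))

Adding Spanish number words, for instance, would then require only new entries in this table, with no change to the code that looks up numval.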

Use Canonical Forms

One way to avoid having to write many permutations of similar code is to use a canonical form: a convention that data will always appear in a certain way. For example, suppose that we wish to simplify an arithmetic expression of the form (+ lhs rhs). We could adopt the convention that if there is a numeric constant, it will always be on the left-hand side (lhs); this canonical form takes advantage of the fact that + is commutative.
Original:                       Better:

(defun splus (lhs rhs)          (defun splus (lhs rhs)
  (cond ((numberp lhs)            (cond ((numberp lhs)
          Lots of Code ...)               Lots of Code ...)
        ((numberp rhs)                  ((numberp rhs)
          Lots of Code ...)               (splus rhs lhs))
In this case, instead of writing the `Lots of Code' twice, we simply call splus again with the arguments reversed, so that the first set of code handles both cases. Don't just ``dive in'' and write vast amounts of code; stop and think whether there is an easy way to do it.
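
A minimal sketch of this pattern (assuming, purely for illustration, that we fold numeric constants and otherwise rebuild the expression):

(defun splus (lhs rhs)
  (cond ((numberp lhs)
         (cond ((numberp rhs) (+ lhs rhs))    ; fold two constants
               ((eql lhs 0) rhs)              ; (+ 0 x) simplifies to x
               (t (list '+ lhs rhs))))
        ((numberp rhs) (splus rhs lhs))       ; move the constant to lhs
        (t (list '+ lhs rhs))))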

References

  1. Strunk, W. Jr. and White, E. B., The Elements of Style, Macmillan, 1972.
  2. Kernighan, B. W. and Plauger, P. J., The Elements of Programming Style, McGraw-Hill, 1974.
  3. Abelson, H. and Sussman, G. J., Structure and Interpretation of Computer Programs, 2nd ed., MIT Press and McGraw-Hill, 1996.
  4. Harel, D., Algorithmics, Addison-Wesley, 1987.
  5. Winston, P.H. and Horn, B.K.P., Lisp, 3rd ed., Addison-Wesley, 1989, Chapter 17.
