Managing ACL2 Packages

Author: Jared Davis

Introduction

11/18/2004

At the 2004 ACL2 workshop, I gave a heartfelt endorsement of the ACL2 package system, describing it as "not actually that terrible." Some deemed my comments incredulous, perhaps having been burned by the package system before. And so, I have been conscripted to write something about how to use packages successfully.

So here goes something. It's mostly about packages, how to define them, how to use them in your own books, and how to certify books that use packages. I also mention briefly how to use local includes and redundant events to cleanly separate your library's interface from its implementation. This technique was shown to me by Eric Smith, so he deserves the credit here.

This guide is not comprehensive. If you want to go into more details, you should see the following documentation topics, among others.

Table of Contents

Creating Packages with DEFPKG

A new package is created by a defpkg event.

(defpkg name symbols)
Name is a string and MUST BE IN CAPS.
Symbols is a list of unique symbols.

You can also include a documentation string, but I won't try to fool you into believing that anyone uses this.

Picking a Name

The name is just a string and should be in caps. If it is not capitalized, bizarre incompatibilities can occur between various Lisp implementations. If I recall correctly, GCL is fine with lowercase package names, but really you should just make them uppercase to avoid these problems.

To access the symbols from package NAME, e.g., foo, you will generally need to write NAME::foo. As a result, you will want to pick a name which is short enough that you don't mind typing it. You can use hyphens in the name if you'd like to have more than a single word, but of course this is a lot of typing that you are committing to.

Generating the List of Symbols

The symbol list defines what symbols will be accessible from within the package. You certainly should be aware of the following constants:

*acl2-exports*
A list of 900+ function names, theorem names, macros, and commands such, which are ACL2 specific.
*common-lisp-symbols-from-main-lisp-package*
A list of 900+ function names from common lisp, including a lot of the functions which aren't part of ACL2.

You certainly want to include most of these symbols in your package. But, they have some overlap, so you cannot simply append them together. Instead, you should use the built-in function union-eq, which merges the lists without creating duplicates. The combined list has about 1700 symbols.

Of course, you might not want all of them. A lot of those symbols aren't useful to you; they are functions that are defined in Common Lisp but which don't exist in ACL2. And, they have nice names that you'd like to use, such as "pop", "union", etc.

One nice thing about packages is that you can explicitly exclude these names from your package, so that you are allowed to define them. Hence, it's possible to define "pop", "union", and so forth in your own package, even though you can't define them in the default ACL2 package.

To do this, simply identify the symbols you want to be able to use, and then use a set-difference-eq to remove them from your combined *acl2-exports* and *common-lisp-symbols-from-main-lisp-package* list.

Examples

Here is the command to define a very basic package that doesn't exclude any symbols.

(defpkg "VANILLA" 
  (union-eq *acl2-exports*
            *common-lisp-symbols-from-main-lisp-package*))

Here's a package that wants to have its own push and pop symbols.

(defpkg "STACK"
  (set-difference-eq
    (union-eq *acl2-exports*
              *common-lisp-symbols-from-main-lisp-package*)
    '(push pop)))

Managing DEFPKG Events with Package Files

It is going to become important that we can reuse these defpkg events. We will need to load the defpkg event whenever:

As a result, it seems best to put your defpkg event into its own file. I will call this file name.defpkg. Of course, this is just a Lisp file, the bizarre extension is just to remind us that it contains our package event.

The point of creating a whole file just for the defpkg event is that, by doing so, you avoid the need to replicate the defpkg event anywhere. (Instead, you can just ld the package file.) That means that if you ever want to change what symbols make up your package, you just need to change the package declaration in one place.

Working With Your Package Interactively

Suppose you are working on a library which is in the MYPKG package. Your library is not done yet, so you want to go ahead and do some work on it interactively. All you need to do is first issue the following commands:

(ld "mypkg.defpkg")
(in-package "MYPKG")

The ld command will load your defpkg event, and the in-package event will put you into your package.

From that point forward, everything should be pretty normal. The only problem that you will really run into is that you might try to use some symbol that isn't in *acl2-exports*. If you suspect this is the case, you might need to type ACL2:: in front of that symbol.

Now, almost all of the usual functions are already in *acl2-exports*, so this probably won't be a problem. And, if you think that the symbol you are using ought to be in *acl2-exports*, it's an easy thing for Matt to change in the next release. And in the meantime, you can always union-eq such a symbol into your defpkg event, and then you won't need to worry about it in the future.

Certifying Books In Your Package

Of course you would also like to certify books based on your package. This is easy if you are using the standard ACL2 Makefile system (which is pretty convenient anyway, so perhaps you'd like to switch to it, if you aren't using it yet). If we are working with the FOOBAR package, we just need a Makefile that looks like this:

include .../acl2-sources/books/Makefile-generic

# declaration of which books to certify

BOOKS = mybook1 mybook2 

# dependencies between books

mybook1.cert: foobar.defpkg  # require package at low levels!
mybook1.cert: mybook1.lisp

mybook2.cert : mybook1.cert
mybook2.cert : mybook2.lisp

I won't explain this in much detail. I will mention that you want all of your package's books to ultimately depend on the defpkg event, so that if you ever change the package, you will trigger a full recertification.

The second thing you need to do is copy the following into a file called cert.acl2.

(value :q)
(lp)
(ld "foobar.defpkg")

Now, all you have to do is have your (in-package "FOOBAR") command at the top of each file that should be part of package FOOBAR, and everything should work as when you are not certifying books.

In short, the only difference between working with your book interactively versus certifying it is that you need to run the ld command for your package file if you are going to work with it interactively. After you're ready to certify, just remove that line and you're good to go.

Including Books Based On Packages

Once books have been certified based on your package, you can include them into other ACL2 sessions. It is important to make a distinction between if you are including the book for interactive use, or if you are including the book for the certification of another book.

Inclusion for Interactive Use

If you simply want to include a package-based book while running ACL2 interactively, all you need to do is run the appropriate include-book command. You don't have to say anything about what packages are involved, and you don't need to load its defpkg events. Of course, to use your package's functions, you will need to qualify them with the package name. In other words, you can't just type union, you'll need to type SETS::union, and so forth.

If you find that you are working with those functions a lot, you might choose to work within the included book's package. To do this, you simply execute the form (in-package "SETS") or whatever your package is named, and then you do not have to quality the names any more. (Of course, now you might need to write ACL2:: and so forth.)

I'll make a few more comments about these sorts of issues when I talk about working with multiple packages at once.

Inclusion for Certification of Other Books

Now imagine that you have some package-based book, we'll say sets.lisp which is part of the SETS package. And, imagine that you want to certify some new book, graph.lisp, which is just part of the ACL2 package. Finally, imagine that graph.lisp is dependent upon sets.lisp.

Now, if you simply try to run (certify-book "graph" 0), you will get the following message:

Error: There is no package with the name SETS.

To fix this problem, we simply need to include the defpkg event for sets before we try to certify the book. There are two obvious places for this to go:

Now, if you are going to define your graph.lisp functionality to be in its own GRAPHS package, then this latter approach makes a great deal of sense. But, even if you are going to have GRAPHS be part of ACL2, there's no reason you can't still have a graph.defpkg file which simply loads the sets.defpkg file and doesn't actually create a new package for GRAPHS.

Example: Working with Multiple Packages at Once

I'll now give a slightly more complicated example, which is taken directly from my set theory library. The following is the contents of my sets.defpkg file, (which is actually called package.lisp in version 0.9 (the version that you'll find in the workshop supporting materials)).

(defpkg "INSTANCE"
  (union-eq '()
    (union-eq *acl2-exports*
              *common-lisp-symbols-from-main-lisp-package*)))


(defpkg "COMPUTED-HINTS"
  (union-eq '(mfc-ancestors 
              mfc-clause
              string-for-tilde-@-clause-id-phrase
              INSTANCE::instance-rewrite)
    (union-eq *acl2-exports*
              *common-lisp-symbols-from-main-lisp-package*)))

(defpkg "SETS"
  (set-difference-equal 
    (union-eq '(lexorder
                COMPUTED-HINTS::rewriting-goal-lit
                COMPUTED-HINTS::rewriting-conc-lit)
                (union-eq *acl2-exports*
                          *common-lisp-symbols-from-main-lisp-package*))
    '(union delete find map)))

As you can see, the set theory library is separated into three packages. The INSTANCE package constains a small rewriter which is used to create concrete instances of generic theories which are stored as constants. The COMPUTED-HINTS package contains support functions for creating and applying our style of computed hints. The SETS package is the main package which our users are interested in.

The INSTANCE package is "vanilla" in the sense that it isn't going to need to use any of the reserved function names -- it just wants the sort of "default" environment that we get by unioning the acl2 exports and the main list package symbols. The empty union-eq just lets me add new symbols to that, if I ever want to.

The computed hints package is a little more complicated. I want a few functions (mfc-ancestors, mfc-clause, and string-for-tilde-@-clause-id-phrase) which are not in *acl2-exports* because they are bizarre things and probably very few people want them. But, since I'd like to use them in this package without typing ACL2:: in front each time, I choose to import them explicitly. Finally, I also choose to import the instance-rewrite function from the INSTANCE package.

The sets package is a little more complex. Again I'm interested in lexorder, an obscure function which didn't make it to *acl2-exports*. But, I'm also interested in a couple of functions defined in the COMPUTED-HINTS package, so I go ahead and import those as well. Finally, I want my set library to go ahead and define the functions union, delete, find, and map, so I remove these from the list of imported symbols so that I am free to redefine them.

Now, these three packages are tightly related, and so I can go ahead and define all three defpkg events in this same file. But, I could make this better by creating an instance.defpkg file and a computed-hints.defpkg file separately, and then loading them as appropriate. If I were to do so, then anyone who wanted to include the INSTANCE package by itself would be able to do that, without needing to include the COMPUTED-HINTS and SETS packages. To really make this work, the computed-hints.defpkg file would need to (ld "instance.defpkg"), and similarly the sets.defpkg file would need to (ld "instance.defpkg"). As long as these paths are all relative, it seems like this should be a fine way of going about things.

In any event, after this is done, the only thing anyone needs to do is ld the sets.defpkg file whenever they need access to the defpkg events.

Should we use Exports List?

Imagine defining a constant, SETS::*exports*, that contained a list of function, theorem, and variable names that I expected users to encounter in their proofs. This could then be imported in the end-user's defpkg events. For example:

(ld "./xyz/osets/sets.defpkg")

(defpkg "FOOBAR"
  (union-eq SETS::*exports*
    (union-eq *acl2-exports*
              *common-lisp-symbols-from-main-lisp-package*)))

Given this, functions such as union, intersection, and so forth could perhaps be used in FOOBAR without having to explicitly say that they are from the sets package each time. This could save some typing, but it is perhaps a bad idea.

One problem with this is that union-eq is actually not what we want. I lied a little bit when I said that the list of symbols in a defpkg has to be unique: actually the list of symbols must not contain any two symbols which have the same symbol-name. As a result, COMMON-LISP::union and SETS::union are unfortunately incompatible in this list, so FOOBAR is probably not even a valid package! Realistically, we would need to either subtract from FOOBAR's symbol list the exact same functions we excluded from SETS (leading to an ugly duplication of code), or some kind of new functions, e.g., pkg-union, that would perform their equality tests simply using symbol name equality. These might be easy to add to the system, but I don't think they're available right now.

Even if we had this, would we want it? Right now, a user can still explicitly say which theorems and symbols they would like to import. In the example above, COMPUTED-HINTS does this for INSTANCE::instance-rewrite, and SETS does this for rewriting-goal-lit and rewriting-conc-lit from COMPUTED-HINTS. Although this might seem burdensome, it is safe! The client has to explicitly say what they want to take. In contrast, if the sets package publishes a list of exports, and I later add some symbol foo to that list, I might break your package which included all my exports but defined its own, different version of foo.

Local include-books With Redundant Events

This final topic isn't entirely related to packages, but relates to the discussion we had about conflicting versions required by different books and so forth.

Local Include Books

Imagine that you have constructed some large library of definitions and theorems about frobs, and suppose that for some of the properties you proved, you decided to use the arithmetic-3, data-structures, and the ordinals books.

Now, you really don't want all of these three books to be exported by your library. You're interested in frobs, not in arithmetic, data structures, and ordinals. Besides, you only needed these libraries to prove your theorems.

Just as you can make your lemmas local to an encapsulate or to a file, you can make your includes local as well. So, instead of just including the books verbatim, you can locally include them so that when other people include your book, those libraries won't even be seen. This lets us use your library without having to add all of the reasoning of arithmetic-3 and so forth.

Now, suppose someone else writes a theory about widgets, and they use the arithmetic-2 library to prove their properties. So long as they also used local includes, neither library exports any arithmetic properties and you will be able to include frobs and widgets in the same project, without worrying about these sorts of clashes.

The story is not quite so clean if I depend upon some definition. For example, if I include a sets library and use their notion of union to write my function, then my function depends on that definition of union. I don't have a good solution to this, but hopefully definitions will not change much from version to version, and will be kept in their own packages to avoid name clashes. This might be an argument for distributing libraries written as a set of definitions in one file, and a set of theorems in another. That way, people could publically include the definitions, but only locally include the theorems about them. This leads us nicely into our final topic.

Redundancy and Creating Nice Interfaces

Redundant definitions can be combined with local includes to produce some very nice effects. This technique is due to Eric Smith.

Imagine again that we are building our frobs library. We have a lot of definitions and a lot of theorems, and they are interleaved in a haphazard way that allows us to admit them all, verify their guards, prove some theorems, and so forth.

Just as you should think of a theorem in terms of its usage as a rewrite rule, we should think of our libraries in terms of the proof strategies they will engender. This haphazard and interspersed arrangement of definitions, lemmas, and theorems has allowed us to prove our main conjectures, but in the end, even the very order that our theorems appear will affect the "external" rewriting strategy to some degree. Unfortunately, this means that if we decide to change this internal order, our users will potentially be affected in some way.

So, what we would like to do is create a fiction for the user to believe. We create an interface file, and begin it by locally including all of our implementation books. We then redundantly define all of the events we would like to export.

This approach has a number of advantages. We hide the messy details of how we went about our proofs, we hide all of the internal definitions and induction schemes and what not which were needed. Because all the theorems have already been proven, we have no problems with changing the order to anything we would like. We have given ourselves the ability to clean up our implementation and otherwise rearrange it without impacting the reasoning strategy that we export. And, the end user can very rapidly include our file, because it has all of their definitions right there. (We're only talking about a few seconds of savings regardless, but this is still nice.)

This strategy creates an "impenetrable wall" of abstraction. This isn't just a disabled theorem that a user has to manually enable later, this is a theorem that the user never sees and never has the chance to enable. You effectively guarantee that nobody who is using your (unmodified) library can possibly be using your internal theorems, so you can feel free to axe or change them as you please.

Conclusions

Defpkg events really aren't too bad to work with. You do need to use them in a variety of contexts, so you probably want to make a file that contains them.

Once you have a defpkg event in its own file, it's quite easy to work interactively within the package (just ld the package file and then execute your in-package form), and to certify books based on the package (using the makefile system). Certified books can be easily loaded into interactive sessions, but it is messy to load them into other books which you intend to certify. This is really the most messy part of the whole deal, and you might need to explain it to your users somehow.

The defpkg event is pretty flexible, and lets you have nicely named functions like push, pop, union, etc, which wouldn't be allowed in the regular ACL2 package. You have a great deal of control over what symbols you choose to import, and you can work with several packages at once that pick and choose which symbols they would like to have.

Export lists are somewhat dangerous in the sense that they can break packages in the future, and they are somewhat painful to maintain. So, it seems easiest and safest to just let the people who are using your package import the symbols they intend to use directly.

Packages are, of course, intended to alleviate the namespace problem. The use of redundant events and local include books can also assist with this, and can keep your books from exporting more than you want. Ideally, libraries should be structured so that users include a public interface which is minimal and complete, while internally the implementation may include whatever it likes and do whatever is necessary to get the theorems proven.

Taken all together, hopefully these techniques will help us improve the modularity and reusability of our libraries.

PS - A Note on Generative Programming

By the way, a lot of us write a lot of macros that write code to do various and wonderful things. As part of that, we often have to generate our own symbols. This can be quite confusing because you now have to think about what package all of your symbols are in. Anyway, I don't know that there are any great solutions to this. I tend to add an argument :in-package-of to my macros and all of the new functions I create are created in that package. But, your needs may certainly vary. In any event, it's a bit harder, but still possible.