Major Section: INTRODUCTION-TO-THE-THEOREM-PROVER

You should design your rewrite rules to canonicalize the terms in your
problem, that is, your rules should drive terms into some normal form so
that different but equivalent terms are rewritten into the preferred shape,
making equivalent terms identical. You are very familiar with this idea
from algebra, where you learned to normalize polynomials. Thus, when you
see *(2x + 6)(3x - 9)* you automaticaly normalize it, by ``multiplying out
and collecting like terms,'' to get *(6x^2 - 54)*. This normalization
strategy allows you to recognize equivalent terms presented differently,
such as *6(x^2 - 9)*.

The ACL2 user is responsible for making up the rules. (Standard ``books''
-- files of ACL2 definitions and theorems -- can often provide rules for
some sets of functions, e.g., arithmetic.) This is a heavy burden on you
but it means you are in charge of your own normal forms. For example, if you
use the function `nthcdr`

, which returns the `n`

th `cdr`

of a list,
you might see both `(cdr (nthcdr i x))`

and `(nthcdr i (cdr x))`

.
These two expressions are equivalent but not identical. You will want to
decide which you want to see and prove the rewrite rule that converts the
other to that preferred form.

Most good users develop an implicit ordering on terms and rewrite ``heavy'' terms to ``lighter'' ones. This insures that there are no loops in their rewrite rules. But this ordering changes according to the user and the problem.

Generally, the lightest terms are primitives such as `IF`

, `EQUAL`

,
arithmetic, etc. Functions defined without explicit recursion tend to
be ignored because they are just expanded away (but see below). Recursively
defined functions tend to be heavier than any other recursive function used
in their definitions, so, for example, if `rev`

is defined in terms of `append`

,
`rev`

is heavier than `append`

. But the size and subtlety of recursively
defined functions also affects their place in the ordering.

But rewrite rules are surprisingly subtle. Recall that a rewrite rule can be made from a formula of this form:

(IMPLIES (ANDwherehyp1...hypk) (eqvlhsrhs))

`EQUAL`

or `IFF`

, and `IF`

. In such a rule, `IFF`

rule) the target occurs propositionally. There are other heuristic
restrictions that we won't discuss here.So how should you phrase a theorem in order to make it an effective rule?

General Principles:

* **Strengthen the Formula**: The fewer hypotheses a formula has the better
the rewrite rule formed from it. The more general the left-hand side the
better the rule. The fewer free variables in the hypothesis, the better.
The idea is to form a rule that fires predictably. Later in this tutorial
you'll get some practice formulating strong rules.

* **Choosing the Conclusion**: If a lemma is an implication, you have to
choose what the conclusion will be. (You'll also have to ``orient'' that
conclusion by choosing a left-hand side and a right-hand side, but we
discuss that below). You can swap the conclusion and any hypothesis by
negating both, producing a different conclusion. There are generally
two (possibly conflicting) heuristics for deciding which part of the formula
should be the conclusion:

*Choosing the Conclusion Heuristic 1*: Can you make the conclusion be an
`EQUAL`

or `IFF`

expression that has a ``heavy term'' on one side? That
will make a rule that replaces the heavy term with a lighter one. We
discuss this situation more below.

*Choosing the Conclusion Heuristic 2*: Can you make the conclusion be a
non-propositional term that contains all the variables mentioned in the
hypotheses? By ``non-propositional'' we mean a term that is not just the
propositional combination (e.g., with `AND`

or `OR`

) of other terms but
instead some call of some ``heavy'' function? If your conclusion contains
all the variables mentioned in the hypotheses, matching it will instantiate
all the variables in the hypotheses. That way ACL2 will not have to guess
instantiations of unbound variables when it tries to relieve the hypotheses.
It is not very good at guessing.

* **Orienting the Conclusion**: If the conclusion is an `EQUAL`

or an
`IFF`

, you have to decide which is the left-hand side and which is the
right. If the conclusion is `(NOT `

*lhs*`)`

, then the left-hand side
is *lhs* and the right-hand side is `NIL`

. If the conclusion is not an
`EQUAL`

, an `IFF`

, or a `NOT`

then the conclusion itself will be the
left-hand side and the right-hand side will be `T`

. If your lemma was
created by looking a Key Checkpoints while using The Method, the left-hand
side should match some term in that checkpoint.

Remember, the left-hand side is the ``trigger'' that will make the rule fire. It is the pattern that ACL2 will be looking for.

* **Pay attention to the free variables**: Look at the variables that occur
in the pattern (the left-hand side) and compare them to the variables that
occur in the hypotheses. Does some hypothesis contain a variable, say
*v*, that is not in the pattern? We call *v* a *free variable*
because it will not be assigned a value (``bound'') by the process of
pattern matching. ACL2 will have to guess a value for *v*. If some
hypothesis contains *v* as a free variable, ask whether more than one
hypothesis contains *v*? ACL2 uses the first hypothesis containing a free
*v* to guide its guess for *v*. To ``guess'' a value for *v*, ACL2
uses that hypothesis as a pattern and tries to match it against the
assumptions in the checkpoint formula being proved. This means that key
hypothesis must be in normal form, to match the rewritten assumptions of the
goal. It also means that you should reorder the hypotheses to put the most
unusual hypothesis containing a free *v* first in the list of conjuncts.
For example, if `v`

is free in two hypotheses, `(natp v)`

and
`(member (nthcdr v a) b)`

, then we recommend putting the `member`

term
first. There are likely to be many terms in the goal satisfying the `natp`

hypothesis -- or none if `natp`

has expanded to an integer inequality -- while
there are likely to be few terms satisfying the `member`

hypothesis, especially
if `a`

and `b`

are bound by the left-hand side of the rule.

Here are some (possibly conflicting) heuristics for choosing the left-hand side:

*Choose a Left-Hand Side that Occurs in a Key Checkpoint*: If you use the
Method you will tend to do this anyway, because you'll see terms in the Key
Checkpoints that you want to get rid of. But many moderately experienced
users ``look ahead'' to how the proof will go and formulate a few
anticipatory rules with the idea of guiding ACL2 down the preferred path to
the proof. When you do that, you risk choosing left-hand sides that won't
actually arise in the problem. So when you formulate anticipatory rules,
pay special attention to the functions and terms you put in the left-hand
sides. The next few paragraphs deal with specific cases.

*Avoid Non-Recursive Functions in the Left-Hand Side*: If the left-hand
side contains a call of a defined function whose definition is not
recursive, then it will almost never match any target in the formula being
rewritten unless the function is disabled. Suppose for example you have
defined `SQ`

so that `(SQ x)`

is `(* x x)`

. Suppose you considered
choosing a left-hand side like `(+ (SQ x) (SQ y))`

. Suppose you hoped it
would hit the target `(+ (SQ A) (SQ B))`

in some formula. But when ACL2
simplifies the formula, it will first rewrite that target to

(+ (* A A) (* B B))by expanding the definition of

`SQ`

, since it could do so without
introducing any recursive calls. But now the target won't match your rule.
By choosing a left-hand side that occurs in a Key Checkpoint (and is not one
of a handful of abbreviations ACL2 uses in its output like `AND`

,
`NOT`

), you'll avoid this problem since `SQ`

will have already been
expanded before the Key Checkpoint is printed.
*Disable Non-Recursive Functions*: If you insist on a left-hand side that
contains calls of non-recursive functions, remember to disable those
non-recursive functions after you've proved all the rules you want about them. By
disabling `SQ`

you can prevent ACL2 from expanding the definition as it
did above. Sometimes you will define a function non-recursively to
formalize some concept that is common in your application and you will want
to create a sort of algebra of rules about the function. By all means do
so, so you can conduct your formal reasoning in terms of the concepts you're
informally manipulating. But after proving the required laws, disable the
non-recursive concept so that ACL2 just uses your laws and not the messy
definition.

*Choose a Left-Hand Side Already in Simplest Form*: This is a
generalization of the advice above. If any rule will rewrite your left-hand
side, it will prevent your rule from matching any target. For example, if
you write a left-hand side like `(foo (car (cons x y)))`

then it would
never match any target! The reason is that even if `(FOO (CAR (CONS A B)))`

did occur in some goal formula, before ACL2 would try your rule about
`foo`

it will use the obvious rule about `CAR`

and `CONS`

to transform
your imagined target to `(FOO A)`

. Thus, your rule would not match.
So you have to keep in mind *all your other rules* when you choose
a left-hand side (and when you choose the hypotheses to guide free variable
selection). If you always choose a pattern that matches a term in a
Key Checkpoint, you avoid this problem.

*Make Sure the Left-Hand Side is ``Heavier'' than the Right*: Sometimes
this is obvious, as when you choose `(REV (REV x))`

for the left-hand side
and `x`

for the right. But what do you about `(REV (APPEND x y))`

versus `(APPEND (REV y) (REV x))`

? Most of the time we choose to drive
the heaviest function (in this case `REV`

) down toward the variables,
lifting the lighter function (`APPEND`

) up so that we can reason about the
lighter function's interaction with the surrounding ``matrix'' of the
formula. So we'd rewrite `(REV (APPEND x y))`

to `(APPEND (REV y) (REV x))`

,
not vice versa.

*Alternative Ways to Talk About the Same Thing*: If your problem and
specification use two different ways to talk about the same thing, choose
one form and rewrite the other into that form. For example, the ACL2
built-in `nth`

returns the nth element of a list, and the built-in
function `nthcdr`

returns the nth `cdr`

of a list. They are defined
independently. But `(nth n x)`

is the same thing as `(car (nthcdr n x))`

.
Since `nth`

can be expressed in terms of `nthcdr`

but not vice
versa, it is clear we should prove `(equal (nth n x) (car (nthcdr n x)))`

as a rewrite rule if both `nth`

and `nthcdr`

are involved in the
problem.

*Don't Let Computational Efficiency Dictate the Terms*: If you have two
functions that are equivalent (perhaps one was defined to be computationally
more efficient), prove their equivalence as a rewrite rule that eliminates
the more complicated function. An extreme example would be a model that
uses a sophisticated data structure (like a balanced binary tree, red-black
tree, ordered array, or hash table) to implement something simple like an
association of keys to values. By proving the equivalence as stated you can
eliminate the messy function early and do the bulk of your reasoning in
terms of its simple specification.

The best ACL2 users become very good at keeping all these things in mind when designing their rewrite rules. Practice makes perfect. Don't be afraid during your learning of ACL2 to undo the rules you first invented and try to make better ones.

Finally, be patient! There will be times when you think to yourself ``Why
should I spend my time thinking of rules that guide ACL2? I know the
proof!'' There are two reasons. First, you may ``know'' the proof but you
may well be wrong and part-way through this whole exercise you may realize
that you're missing a major hypothesis or special case that breaks your
whole conception of the problem. The proof is in the details. Second, most
of the time the library of rules you develop in this process will be used
over and over again on variants of the main problem in the months and years
ahead. This is sometimes called the *proof maintenance* problem.
Theorems don't suffer bit rot! But the artifacts you're modeling change and
you will need to prove new versions of old theorems. A good general purpose
library makes this much easier.

We now recommend that you practice inventing strong rules; see strong-rewrite-rules.

For advice on handling specific kinds of formulas and definitions, see specific-kinds-of-formulas-as-rewrite-rules.

For more information about the rewriter works and how rules are created, see further-information-on-rewriting.

If you are working your way through the tutorial introduction to the theorem
prover, use your browser's **Back Button** to return to
introduction-to-the-theorem-prover.