Search-engine friendly clone of the ACL2 documentation.

Basic-tutorial

4. Proving Theorems by Symbolic Execution

To see how symbolic execution can be used to prove theorems, let's return to the bit-counting example, where our goal was to prove

(implies (unsigned-byte-p 32 x)
         (equal (fast-logcount-32 x)
                (logcount x)))

The basic idea is to first symbolically execute the above formula, and then check whether it can ever evaluate to nil. But to do this symbolic execution, we need some symbolic object to represent x.

We want our symbolic execution to cover all the cases necessary for proving the theorem, namely all x for which the hypothesis (unsigned-byte-p 32 x) holds. In other words, the symbolic object we choose needs to be able to represent any integer from 0 to 2^32 - 1.

Many symbolic objects cover this range. As notation, let b0, b1, ... represent independent Boolean variables in our Boolean expression representation. Then, one suitable object is:

Xinit = (:g-integer b0 b1 ... b31 b32).

Why does this have 33 variables? The final bit, b32, represents the sign, so this object covers the integers from -2^32 to 2^32 - 1. We could instead use a 34-bit integer, or a 35-bit integer, or some esoteric creation involving :g-ite forms. But perhaps the best object to use would be:

Xbest = (:g-integer b0 b1 ... b31 false).

since it covers exactly the desired range using the simplest possible Boolean expressions.

Suppose we choose Xbest to stand for x. We can now symbolically execute the goal formula on that object.

What does this involve? First, (unsigned-byte-p 32 x) produces the symbolic result t, since it is always true of the possible values of Xbest. It would have been equally valid for this to produce (:g-boolean . true), but GL prefers to produce constants when possible.

Next, the (fast-logcount-32 x) and (logcount x) forms each yield :g-integer objects whose bits are Boolean expressions in the variables b0, ..., b31. For example, the least significant bit will be an expression representing the XOR of all these variables.

Finally, we symbolically execute equal on these two results. This compares the Boolean expressions for their bits to determine if they are equivalent, and produces a symbolic object representing the answer.

So far we have basically ignored the differences between using ubdds or aigs as our Boolean expression representation. But here, the two approaches produce very different answers:

Since UBDDs are canonical, the expressions for the bits of the two numbers are syntactically equal, and the result from equal is simply t.
With AIGs, the expressions for the bits are semantically equivalent but not syntactically equal. The result is therefore (:g-boolean . phi), where phi is a large Boolean expression in the variables b0, ..., b31. The fact that phi always evaluates to t is not obvious just from its syntax.

At this point we have completed the symbolic execution of our goal formula, obtaining either t in BDD mode, or this :g-boolean object in AIG mode. Recall that to prove theorems using symbolic execution, the idea is to symbolically execute the goal formula and then check whether its symbolic result can represent nil. If we are using BDDs, it is obvious that t cannot represent nil. With AIGs, we simply ask a SAT solver whether phi can evaluate to false, and find that it cannot. This completes the proof.

GL automates this proof strategy, taking care of many of the details relating to creating symbolic objects, ensuring that they cover all the possible cases, and ensuring that nil cannot be represented by the symbolic result. When GL is asked to prove a non-theorem, it can generate counterexamples by finding assignments to the Boolean variables that cause the result to become nil.