some subtle aspects of the counting done by accumulated-persistence

In this topic we cover the overcounting of ``useful'' and of recursive rune application attempts, and we describe how ``useless'' rune application attempts can actually be critical for a proof's success.

Overcounting of ``useful'' and of recursive rune application attempts. Not every rune application may be necessary for a proof's success. Consider for example:

(thm (equal (car (cons a (cdr (cons b x))))
Then show-accumulated-persistence will tell us that :rewrite rules car-cons and cdr-cons each had one useful application. However, the rule cdr-cons is used to simplify (cdr (cons b x)) to x, and this simplification is unecessary for the proof. Indeed, the proof succeeds even when preceded by the event: (in-theory (disable cdr-cons)). We thus see that a rune application labeled as ``useful'' may be simplifying a term that is not relevant to the proof.

As of this writing, we consider every :forward-chaining rule application to be ``useful'', for simplicity of the implementation. Moreover, our counting of these rules is such that a single rule may be counted more than once.

Next we show how recursive rule applications are overcounted. Consider the following example.

(defun mem (a x)
  (if (atom x)
    (or (equal a (car x)) (mem a (cdr x)))))
Now suppose we consider the sequence of theorems (mem a (list a)), (mem a (list 1 a)), (mem a (list 1 2 a)), (mem a (list 1 2 3 a)), and so on. We will see that the :frames reported for each increases quadratically, even though the :tries increases linearly; so in this case the :tries statistics are more appropriate. Each time the definition of mem is applied, a new stack frame is pushed (see accumulated-persistence), and all subsequent applications of that definition are accumulated into the :frames count for that stack frame. The final :frames count will be the sum of the counts for those individual frames, which form a linear sequence whose sum is therefore quadratic in the number of applications of the definition of mem.

How ``useless'' attempts can be critical for a proof's success. The command (accumulated-persistence :useless)] will list rules that did not contribute directly to the proof (see accumulated-persistence, in particular the discussion of ``useless'' there). However, a ``useless'' rule can on rare occasions be critical to the success of a proof. In the following example, we have a ``bad'' rule that can take the proof in the wrong direction, but a ``useless'' rule does a rewrite that prevents the succesful relieving of a hypothesis of the ``bad'' rule. In summary:

; Assume p0.  We want to prove p1.

; Key rule: p0 -> p1 = t

; Bad rule that could ruin the proof: p3 -> p1 = p2

; But unfortunately, we know p3: p0 -> p3

; Important ``useless'' rule, preventing ``bad rule'' above from firing: p3 = p4

The following event captures the rules described above.
 ((p0 (x) t)
  (p1 (x) t)
  (p2 (x) t)
  (p3 (x) t)
  (p4 (x) t))
 (local (defun p0 (x) x))
 (local (defun p1 (x) x))
 (local (defun p2 (x) x))
 (local (defun p3 (x) x))
 (local (defun p4 (x) x))

; Key rule: (defthm p0-implies-p1 (implies (p0 x) (p1 x)))

; Bad rule that could ruin the proof: (defthm p3-implies-p1-is-p2 (implies (p3 x) (equal (p1 x) (p2 x))))

; But unfortunately, we know p3: (defthm p0-implies-p3 (implies (p0 x) (p3 x)))

; Important ``useless'' rule, preventing p3-implies-p1-is-p2 from firing: (defthm p3-is-p4 (equal (p3 x) (p4 x))))

Now we can see that p3-is-p4 is labeled as ``useless'', by evaluating these commands.
(accumulated-persistence t)
(thm (implies (p0 x) (p1 x)))
If instead we first evaluate (in-theory (disable p3-is-p4)) before the thm above, then the proof fails, even though p3-is-p4 was labeled as ``useless''!

Nevertheless, in general it is probably safe to disable rules reported as ``useless'' by (show-accumulated-persistence :useless), and doing so may speed up a proof considerably.

Remark. The example above suggests a surprising fact: on rare occasions, a proof may fail when you give an :in-theory hint consisting of exactly the runes reported in a proof that succeeds. For, imagine a rule R that is needed in part of the proof but is ``bad'' in a second part, and that some other, ``useless'' rule prevents the application of R in that second part. The example above suggests that disabling this ``useless'' rule can allow the second application of R, thus preventing the proof.