Some Intentionally Provocative Remarks

Compared to the other ``Best Ideas'' I've mentioned, this one is really trivial, but I think it is has important practical value to people who want to design ad hoc programming languages and wish to verify applications written in those languages. And a consequence of this little idea is rather surprising to me:

Had there been decent theorem provers in 1960s, Floyd and Hoare would never have had to invent Floyd-Hoare semantics!

Below I give a personal view of our CS history to back them up. Note: This page has been translated into Romanian by Aleksandra Seremina. Her translation is here, however, be advised this link takes you out of my web space.

Operational semantics were understood before Floyd-Hoare semantics. For example, in 1962 in his seminal ``Towards a mathematical science of computation,'' McCarthy wrote ``the meaning of a program is defined by its effect on the state vector.''

Operational semantics have some strengths that Floyd-Hoare semantics don't.

In contrast to Floyd-Hoare semantics, operational semantics have the following properties:

They may be formalized in a classical mathematical framework, e.g., first order logic with induction, not a mix of logic and programming language syntax. In particular, they require no new rules of inference beyond those studied by logicians for over a hundred years.
One can use traditional mathematical techniques not only to prove total and partial correctness of specific programs, but to prove properties of the programming language itself, and to relate different languages.
One can write operational semantic models so that they are directly executable and can thus be used as interpreters for the given programming language.
It is generally straightforward to compare an operational semantic model to an implementation of the language because both are described in terms of operations on the data values of the language.

However, Floyd-Hoare semantics have a wonderful property: code proofs via the inductive assertion method are relatively simple. That is because all of the proof obligations are directly addressing properties of the user's code and all language issues have been distilled out of them.

But, as far as I can tell, nobody knew how to conduct a standard inductive assertion style proof directly in an operational semantics. The ``Best Idea'' I describe below shows how to do that! It's really trivial and just requires a good theorem prover to mechanize it.

The inductive assertion style of reasoning -- where one annotates the program text with assertions that are supposed to be true every time they are encountered during execution -- goes back to Goldstine and von Neumann (1946) and Turing (1949). Those early applications of the method were informal in the sense that no formally defined process was given to generate the formulas to prove; instead, the early practitioners just got on with their proofs. The point was not formality but ordinary mathematical precision and argument.

But all formal presentations of the method essentially describe a meta-logical transformation process, often called verification condition generation (VCG), which is a rule of inference allowing one to prove assertions from other assertions and the intervening program text. Technically, the VCG rule is a derived rule which can be shown to be valid from an operational semantic description of the programming language. Both Floyd and Hoare were quite explicit about the use of operational semantics to justify the validity of their techniques. I suspect that anyone familiar with their seminal papers would recognize the idea here as just the application of their soundness arguments to a given annotated program. So my idea isn't really new. That makes it all the more surprising that no one ever used it!

Programming logics were introduced to make the VCG process primitive with respect to the formal system and to unify programming and proof.

Nevertheless, when all was said and done, the implementation of programming logics generally consist of two parts. The first part is a VCG and the second part is a theorem prover for the verification conditions (VCs).

But defining a semantics for a modern programming language, like Java, is hard (i.e., error prone) because so much happens in typical language constructs like method invocation, building a new object, or writing to a field in an object. And Hoare semantics requires that the designer describe all of this at the meta level: as a formula transformation rather than as a data (or state) transformation. To make the process of writing a VCG even more error prone, it is widely agreed that some theorem proving must be done ``on the fly'' as formulas are generated, since otherwise the number and size of the VCs grow excessively.

Thus, to use a mechanized programming logic one generally has to trust two very large and complicated tools: a VCG and a theorem prover.

But it is trivial to use the inductive assertion method in an operational semantic setting. It just requires a decent theorem prover to do the symbol pushing. If you phrase the problem the ``right'' way, you can get the theorem prover to generate simplified verification conditions on the fly. And so you only have to trust one tool: the theorem prover.

Had we had decent theorem provers in the early 1960s, I believe this fact would have been as obvious as it is now and it would never have been necessary to invent programming logics.

Advice

Why wade into a swamp when there is a clear road around it?

Today, lots of people are designing programming languages, e.g., microcode engines for special-purpose devices. These are often safety-critical and there is understable interest in verifying the programs to be written. Such language implementors might be tempted to formalize their new language by writing a VCG. Don't! It's an error-prone activity. You're going to have to have a trusted theorem prover anyway. So why not write your semantics operationally and get on with your inductive assertion style proofs?

The Basic Idea

So imagine you have an operational semantics given by some single step ``next state'' function. Let's call the function step. I imagine that the program being executed, along with the program counter, pc, is part of the state. The semantics of the language is given by iteratively applying step to the initial state.

Now suppose further that you have an annotation of the program that associates assertions with specific pcs. To make the process work, one should ``cut every loop'' with an assertion, i.e., ensure that the code cannot execute forever without encountering an assertion. In a simple, one-loop program, one generally annotates the entry pc with an assertion called the pre-condition, the top of the loop with a loop invariant, and the exit with a post-condition. You want to prove that if the pre-condition holds on entry and execution reaches the exit, then the post-condition holds.

The problem with proving this is that you have to (symbolically) execute an arbitrary number of steps to get from one assertion to the next, to know what to prove.

The way around this is to complete the assertions so that they are a step-wise invariant. To complete them, just define a predicate, Inv, that takes a state and does a case split on what the pc is. If the pc is one of those annotated, Inv just checks the corresponding assertion. If the pc is not one of those annotated, Inv(s) is defined to be Inv(step(s)).

If you try to prove that Inv(s) implies Inv(step(s)) -- that is, if you try to prove that Inv is invariant under step -- you will just generate the standard verification conditions as subgoals!

To learn more, read the paper!

References

Inductive Assertions and Operational Semantics, CHARME 2003, D. Geist (ed.), Springer Verlag LNCS 2860, pp. 289-303, 2003.
Inductive Assertion Style Proofs via Operational Semantics -- A talk given at IFIP WG2.2, Udine, Italy, Sep 13, 2006. This talk is clearer, I think, than the one I gave at CHARME.
ACL2 demo session log - this is the session log of the demonstration I gave. The log has two parts. First I just prove a classic Boyer-Moore theorem -- the associativity of append -- to remind the audience of the theorem prover's basic style. Second, I show a complex operational semantics for the JVM, as described in the talk and use it to verify the partial correctness of a simple program that computes half of an even natural (and runs forever on odd naturals).
Java session log - shows the trivial program we will verify. To see the JVM model compute on these same examples and then see the partial correctness proof, see the second half of the ACL2 demo session log above.