Randy Johnson Position Statement

D. Randolph Johnson Position Statement

Senior Mathematician, National Security Agency, Arnold, MD USA
<drj@tarius.tycho.ncsc.mil>

Let me start by seconding some points that others have made. Please forgive me for saying some things that are blindingly obvious. First of all, I agree that formal methods should be integrated throughout the curriculum rather than isolated in one course.

The most basic ideas are so basic that they should be introduced early and repeated enough that they become second nature. The points I would include here are the obvious ones that it is good to think carefully about what you are doing, that clear specifications are helpful in writing good software, that preconditions, postconditions, and invariants are good thinks to think about, as are exceptions, error conditions and boundary conditions. These statements can, of course, be written in a natural language, but a suitable formal language and some discrete mathematics can add greatly to the precision and clarity of the statement.

I would recommend introducing at least one formal notation in a first course and then asking the students to use it in later courses. An algorithms course, for example, is full of things that students can be asked to express formally. I believe that practice in expressing properties of algorithms (or data structures, or systems, or ...) is of great value.

One of the princlples that becomes clear very soon is the value of abstraction. You really want to focus on the important aspects of the situation and omit unimportant details. Unfortunately, they don't come with labels telling you one from the other. I don't know of any magic way to make students good at writing appropriate abstract descriptions. People do get better with practice and feedback from an instructor can be very helpful. As Peter Gorm Larsen suggested, some industrial scale examples can be quite helpful along with the examples which arise naturally in their other courses.

The difficulties and the benefits of writing descriptions in a precise, unambiguous formal notation probably become clear rather early. More subtle benefits probably won't be seen at the beginning but will come out with more experience, especially on real world examples. I believe that the formal notation and vocabulary can add structure to the whole exercise in a way that natural language does not. The formalism can suggest questions and issues that might not even be considered otherwise. This capability to suggest the right questions is sometimes more valuable that the precision provided by the notation.

What I am stressing here is the value of lots of practice in expressing properties of various computing artifacts logically, in a formal notation. I was struck by the anecdote that David Garlan relates in his paper "Effective formal methods education for professional software engineers" in the book Teaching and Learning Formal Methods edited by Neville and Mike Hinchey. See pages 19-20 in which an older student describes his astonishment at finding himself using Z voluntarily and at finding it helpful.

My own choice and recommendation for a first formal notation is Z. I think that it is a good general purpose language usable at various levels of abstraction and for a wide range of problems. Most people with whom I have worked can start almost immediately using the elementary parts of Z and then pick up the more obscure and subtle aspects as the need arises. In many cases, the need never does arise. In other words, when the people involved in a project include some with little knowledge of Z (the usual situation in my work) it is sometimes better to express something using only elementary Z rather than in a more elegant way. In addition, because Z is so widely used, there is large amount of support materials available, including texts and published examples of specifications. Furthermore, good support tools for Z are available (see my comments below).

I think that tool support for formal methods is essential. I have used various Z tools, most recently version 1.5 and version 2.0 of Z/EVES. See Dan Craigen's position paper for more information on Z/EVES. Among its strong points are that it runs on a variety of platforms and is free for academic use. In my experience, the biggest drawback of version 1.5, at least for student use, is that you had to know LaTeX and emacs in addition to learning formal methods and Z. Version 2.0 added a GUI and eliminated the need to know LaTeX and emacs. This is MUCH better for beginners. There are some rough edges in version 2.0, but most of them should be fixed in version 2.1. Depending on when Dan and his colleagues release it to the world, I may be able to send some early user comments on 2.1 to Helsinki.

For educational use, maybe the biggest value of Z/EVES (and many other formal methods tools) is that it does syntax checking and type checking at the push of a button. Some students may never do much more than that. A nice feature of Z/EVES which I haven't seen in other tools is that it goes beyond type checking to automatically generate domain conditions. These state that the argument to which a partial function is applied in a spec is actually in the domain of the function. Not only does it generate the conditions, it tries and often succeeds in proving them. With no effort on the part of the instructor, the tool repeatedly draws the attention of the student to an aspect of their specification which is often ignored, even by very experienced Z specifiers.

At some opportune point in the curriculum, I think that students should be introduced to temporal logics and/or process algebras and model checkers so that they see some formal approaches to handling situations for which Z is not very well suited. It is common for students in computer courses to learn about a variety of programming languages and to learn that different languages and development methods are suited for different projects. A GUI is different from a web page which is different from an embedded automatic brake controller. Ideally in an undergraduate course and certainly in a graduate course, a student should also learn about a variety of formal notations and methods. Which one(s) to use depends upon the nature of the problem. We don't want a student to come out as a carpenter with only a hammer in his toolbox, so that every problem looks like a nail.

Another broad aspect of formal methods which students should understand is the economic dimension. This is becoming increasingly significant as formal methods move from academic theory and research prototypes into serious commercial practice. If we use the terminology that engineering = science + economics, I consider formal methods to be part of computer engineering, not computer science. Formal methods are based in mathematics and the technical aspects are clearly scientific. I think that the techniques chosen for a given problem should be efficient as well as effective. I think that the degree to which formal methods should be used is almost entire governed by economics. In many situations at least writing a formal description of a system or critical parts of a system will be worthwhile. In some cases, careful logical analysis using the formalism will be cost effective. For critical systems, for example, those with serious safety, security, or financial risks involved, full proofs in a theorem prover are justified. The general point is that the use of formal methods is not all or nothing. Formalisms can be used in varying degrees and the decision on how much to do should governed by the cost and benefits involved.