Email thread begun by Rob Arthan, Sat, 21 Aug 2010 12:06:02 +0100

I have a comment or two:

Shankar's comments on warranties are probably right in practice, but I still endorse J's encouragement to think about warranties as a thought experiment, as it focusses the mind on probabilities: that's why I said it's like a bet. Whether it is possible to come up with a really meaningful probabilistic theory for software failure rates is a moot point, but safety engineers would really love it if someone did. If you don't like to think about probabilities, just think about cost: what commercial value might your proofs have?

I am inclined to agree with Shankar that malicious subversion of assurance is a risk that can be mitigated in practice by processes agreed between proof suppliers and proof evaluators. If the two sides can't agree on a mutually acceptable process, then no-one will pay for a proof. For example, many years ago, when the Formal Methods Unit at ICL were doing security proofs for GCHQ with Classic HOL, we were required to tie off loopholes or known bugs with trust implications.

[From Roger 6 Sept.: Actually, this was our choice (probably mine), mainly motivated by the desire to dissuade evaluators from attempting to evaluate the proof scripts. In order to argue that inspecting proof scripts was unnecessary, we wanted it to be the case that there was no known way of proving a falsehood (and an argument that it could not be done). The case was more complex than that, but the requirement (for certification at the level sought) included that we formally verify the hardware, and the strategies for proof, and for persuading the evaluators that we had accomplished the proof, were ours.]

E.g., we had to include in our scripts the INRIA ML equivalent of:

  exception Cheat;
  fun mk_thm _ = raise Cheat;

so that we couldn't call the real mk_thm. (I was really popular at the time: I was trying to understand the Classic HOL code with a view to reimplementing it, and that involved finding new bugs that had to be resolved while the rest of the team were trying to meet deadlines for shipping proofs.) See the interview with Roger Jones in Donald MacKenzie's book Mechanizing Proof for more details on this.

Also, although it may lead to near-fatal torpor, I would advise tool developers to have a look at some of the relevant standards. DO 178B, the FAA standard for software that flies, recognised the existence of formal methods, but I don't believe any supplier ever took up the offer. The forthcoming new version, DO 178C, has a supplement intended to give realistic and practical guidelines for applying formal methods, and this should be very relevant.

> In particular, J Moore suggested that system developers contribute a
> paragraph each, describing their approach to trust. So I hope that
> the following invited speakers will each send me such a paragraph:
>   - Burkhart Wolff (Isabelle)
>   - J Moore (ACL2)
>   - Konrad Slind (HOL4)
>   - Laurent Thery (Coq)
>   - Shankar (PVS)

For the record, the ProofPower trust story is the LCF story, backed up by a formal specification (but no proof) of the HOL language, logic and critical properties of a proof tool.

[From Roger 6 Sept.: In addition to the points you mention here and in the sequel, your formal treatment, and the implementation, extend the usual LCF story (that only theorems can be constructed, because the rules of the logic correspond to constructors of an abstract datatype) to cover aspects of trust which relate to the management of the theory hierarchy.]
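For anyone not steeped in LCF lore, here is a minimal Standard ML sketch of the abstract datatype idea just mentioned. It is only an illustration, not the ProofPower kernel: the names (KERNEL, assume, discharge, dest_thm) and the string representation of terms are invented for the sketch.

  (* Minimal sketch of the LCF approach (not ProofPower's actual kernel).
     The opaque signature ascription hides the representation of thm, so
     outside the structure the only way to obtain a thm is via the
     exported inference rules: every thm is a theorem by construction. *)
  signature KERNEL =
  sig
    type term = string                       (* stand-in for a real term type *)
    type thm                                 (* abstract: no constructor exported *)
    val assume    : term -> thm              (* A |- A *)
    val discharge : term -> thm -> thm       (* from G, A |- B infer G |- A ==> B *)
    val dest_thm  : thm -> term list * term  (* theorems can be inspected, not forged *)
  end;

  structure Kernel :> KERNEL =
  struct
    type term = string
    datatype thm = Thm of term list * term   (* hidden by the opaque ascription *)
    fun assume a = Thm ([a], a)
    fun discharge a (Thm (hyps, c)) =
          Thm (List.filter (fn h => h <> a) hyps, a ^ " ==> " ^ c)
    fun dest_thm (Thm (hyps, c)) = (hyps, c)
  end;

Outside Kernel, the constructor Thm is invisible, so a proof script cannot manufacture a theorem directly; in Classic HOL the loophole was mk_thm, which is why it had to be disabled as above rather than simply hidden.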
The support for Z in ProofPower is an example of an extension in the sense of a semantic embedding of a new object language, with a trust story that piggy-backs on the trust story of ProofPower-HOL. The trust story for both ProofPower-HOL and ProofPower-Z includes the fact that the system is interactively programmable, so an evaluator who wants to can really dig in and pick the syntax to pieces, which, in principle, mitigates the risk of problems of the sort recently discussed in Freek Wiedijk's (somewhat light-hearted) paper on what he calls "Pollack Consistency"; see http://www.floc-conference.org/UITP-program-day195.html. In practice, the fact that you can easily code up things like reports on which definitions have been proved consistent and which have been given as axioms seems to be appreciated by commercial users. (Evaluators really like checklists!)

Finally, I agree with Shankar's suggestion that the report collapse the discussions into something more digestible, but as Matt took the trouble to type up the discussions more or less verbatim, I think it would be good to include his transcript as an annex (so that anyone who dissented strongly from something in the summary has their views recorded).

Regards,

Rob.