Code Verification for Practical Machine Architectures

This is the project page for the DARPA CRASH effort at the University of Texas at Austin.

People

Natahlie Beavers
Natahlie Beavers
(CS Undergraduate Student)
Soumava Ghosh
Soumava Ghosh
(CS Graduate Student)
Shilpi Goel
Shilpi Goel
(CS Graduate Student)
Marijn Heule
Marijn J. H. Heule
(Research Fellow)
Warren A. Hunt, Jr.
Warren A. Hunt, Jr.

(Professor, PI)
Matt Kaufmann
Matt Kaufmann
(Senior Research Scientist)
Robert Krug
Robert Krug
(Research Engineering/
Scientist Associate IV)
J Moore
J Moore
(Professor)
Ben Selfridge
Ben Selfridge
(CS Graduate Student)
Nathan Wetzler
Nathan Wetzler
(CS Graduate Student)

Recent additions to the group:

Cuong Kim Chau
Cuong Kim Chau
(CS Graduate Student)
Keshav Kini
Keshav Kini
(CS Graduate Student)
Corbyn Salisbury
Corbyn Salisbury
(CS Undergraduate Student)
Théo Zimmermann
(CS Graduate Student)

Problem Description

With pervasive use of software artifacts in every aspect of human life, the cost of malfunctioning software today is staggering. Formal methods have shown significant promise to improve software reliability dramatically. However, deductive or theorem proving solutions, based on human-guided, semi-automatic mechanical reasoning often involve prohibitive manual effort, while fully automatic approaches based on decision procedures can quickly run out of available time and memory. Better tool suites and methodologies are needed in order to achieve the promise of formal methods for achieving correct, resilient computing systems. This project focuses on the creation of formal tools for code verification, in particular for x86 machine code, with a related goal of improving the ACL2 theorem proving environment.

Research Goals

  1. Develop an executable, formal specification for a significant subset of the x86 instruction set architecture (ISA). Our current version of the model has a simulation speed of ~2.4 million instructions/second when paging is disabled and and ~560,000 instructions/second with a four-level page table configuration. We are working on further improving performance of the memory management implementation of our model.
  2. Construct a mechanized proof assistant for x86 binary code. The development of a formal specification, discussed above, is a critical step towards reaching this goal. The efficient executability of that specification is also critical, in order to validate the model by co-simulation against actual x86 execution. We will use the ACL2 theorem prover to carry out code proofs, both using BDD/SAT-based symbolic simulation and by developing methods based on general-purpose theorem proving techniques, especially rewriting and induction.
  3. Improve the ACL2 theorem proving environment. This is a long-term goal that our research group continues to pursue. ACL2 is our core formal verification technology, so improvements to ACL2 support the research goals discussed above, such as the ``abstract stobjs'' feature.
  4. Develop tools to gain confidence in satisfiability (SAT) solvers. Researchers develop high performance applications called SAT solvers to efficiently solve satisfiability problems, and any problem that can be encoded into a Boolean formula can be investigated with SAT solvers. It is easy to check a solution to a satisfiability problem if one can be found. However, if a solver outputs that a formula is unsatisfiable, we must trust that the solver exhausted all possible assignments. Existing verification tools cannot verify all techniques used in state-of-the-art solvers.
  5. Create automated, sophisticated satisfiability encodings with respect to software verification. The encoding of problems, such as software verification, into Boolean formulas has a huge impact on the performance of SAT solvers. We will develop techniques that, given a certain high-level description, will make a high quality translation of the problem into a Boolean formula.

Technical Approach

Our technical approach to x86 code verification is based on building a formal interpreter-style model for the x86 instruction set architecture (ISA) in the language of the ACL2 theorem prover. By making our model efficient to execute on concrete data, we are validating our model by co-simulation against execution on an actual x86 processor.

Our previous work suggests that we can build ACL2 proof infrastructure on top of that model, though the complexity of the x86 architecture presents new challenges. In support of this work, we will continue to improve the ACL2 proof environment. We will also employ symbolic simulation, which is an effective approach to software verification, and develop capabilities to apply, in a sound way, BDDs and SAT techniques, which together are the most used techniques for symbolic simulation.

Dissemination of Research Results

We intend to make our x86 ISA specification and associated proof technology available from this page as it matures. A much simpler model is based on the y86 ISA, developed in the textbook Computer Systems: A Programmer's Perspective, 2/E by Bryant and O'Hallaron (in particular, Chapter 4). We provide here an early version of our y86 model; for a recent version, see directory models/y86/ in the ACL2 Community Books.

SOFTWARE

The ACL2 theorem proving system has been publicly available for about two decades, and is currently available from the ACL2 home page. It is in regular use at several companies, including AMD, Centaur Technology (VIA Technology), IBM, and Rockwell Collins. It has and continues to be used by U.S. government personnel, some of whom have expressed interest in our x86 model in particular.

Visit the ACL2 home page to obtain the latest version of ACL2.

PUBLICATIONS

Authors from our CRASH group at UT Austin are shown in bold font.