Code Verification for Practical Machine Architectures

This is the project page for the DARPA CRASH effort at the University of Texas at Austin.

People

Cuong Kim Chau
Cuong Kim Chau
(PhD Student)
Shilpi Goel
Shilpi Goel
(PhD Student)
Marijn Heule
Marijn J. H. Heule
(Research Scientist)
Warren A. Hunt, Jr.
Warren A. Hunt, Jr.

(Professor, PI)
Matt Kaufmann
Matt Kaufmann
(Senior Research Scientist)
Keshav Kini
Keshav Kini
(PhD Student)
J Moore
J Moore
(Professor)
Ben Selfridge
Ben Selfridge
(PhD Student)
Nathan Wetzler
Nathan Wetzler
(PhD Candidate)

Previous Members of the CRASH Team:

Natahlie Beavers
Natahlie Beavers
(Undergraduate Student)
Soumava Ghosh
Soumava Ghosh
(Masters Student)
Robert Krug
Robert Krug
(Research Engineering/
Scientist Associate IV)
Corbyn Salisbury
Corbyn Salisbury
(Undergraduate Student)
Théo Zimmermann
(CS Graduate Student)

Problem Description

With pervasive use of software artifacts in every aspect of human life, the cost of malfunctioning software today is staggering. Formal methods have shown significant promise to improve software reliability dramatically. However, deductive or theorem proving solutions, based on human-guided, semi-automatic mechanical reasoning often involve prohibitive manual effort, while fully automatic approaches based on decision procedures can quickly run out of available time and memory. Better tool suites and methodologies are needed in order to achieve the promise of formal methods for achieving correct, resilient computing systems. This project focuses on the creation of formal tools for code verification, in particular for x86 machine code, with a related goal of improving the ACL2 theorem proving environment.

Research Goals

  1. Develop an executable, formal specification for a significant subset of the x86 instruction set architecture (ISA). Our focus is on the 64-bit mode of Intel's IA-32e mode. Our current version of the model has a simulation speed of ~3.3 million instructions/second in its programmer-level mode (where we operate at the level of linear memory) and and ~920,000 instructions/second with a two-level page table configuration in its system-level mode (where we operate at the level of physical memory). We support almost 120 general-purpose instructions (221 opcodes) and 80 SSE/SSE2 instructions.
  2. Construct a mechanized proof assistant for x86 binary code. The development of a formal specification, discussed above, is a critical step towards reaching this goal. The efficient executability of that specification is also critical, in order to validate the model by co-simulation against actual x86 execution. We will use the ACL2 theorem prover to carry out code proofs, both using BDD/SAT-based symbolic simulation and by developing methods based on general-purpose theorem proving techniques, especially rewriting and induction.
  3. Improve the ACL2 theorem proving environment. This is a long-term goal that our research group continues to pursue. ACL2 is our core formal verification technology, so improvements to ACL2 support the research goals discussed above, such as the ``abstract stobjs'' feature.
  4. Develop tools to gain confidence in satisfiability (SAT) solvers. Researchers develop high performance applications called SAT solvers to efficiently solve satisfiability problems, and any problem that can be encoded into a Boolean formula can be investigated with SAT solvers. It is easy to check a solution to a satisfiability problem if one can be found. However, if a solver outputs that a formula is unsatisfiable, we must trust that the solver exhausted all possible assignments. Existing verification tools cannot verify all techniques used in state-of-the-art solvers.
  5. Create automated, sophisticated satisfiability encodings with respect to software verification. The encoding of problems, such as software verification, into Boolean formulas has a huge impact on the performance of SAT solvers. We will develop techniques that, given a certain high-level description, will make a high quality translation of the problem into a Boolean formula.

Technical Approach

Our technical approach to x86 code verification is based on building a formal interpreter-style model for the x86 instruction set architecture (ISA) in the language of the ACL2 theorem prover. By making our model efficient to execute on concrete data, we are validating our model by co-simulation against execution on an actual x86 processor.

Our previous work suggests that we can build ACL2 proof infrastructure on top of that model, though the complexity of the x86 architecture presents new challenges. In support of this work, we will continue to improve the ACL2 proof environment. We will also employ symbolic simulation, which is an effective approach to software verification, and develop capabilities to apply, in a sound way, BDDs and SAT techniques, which together are the most used techniques for symbolic simulation.

Dissemination of Research Results

We intend to make our x86 ISA specification and associated proof technology available from this page by the end of 2014 (sooner upon request from DARPA). A much simpler model is based on the y86 ISA, developed in the textbook Computer Systems: A Programmer's Perspective, 2/E by Bryant and O'Hallaron (in particular, Chapter 4). We provide here an early version of our y86 model; for a recent version, see directory models/y86/ in the ACL2 Community Books.

SOFTWARE

The ACL2 theorem proving system has been publicly available for about two decades, and is currently available from the ACL2 home page. It is in regular use at several companies, including AMD, Centaur Technology (VIA Technology), IBM, and Rockwell Collins. It has and continues to be used by U.S. government personnel, some of whom have expressed interest in our x86 model in particular.

Visit the ACL2 home page to obtain the latest version of ACL2 and to view that system's documentation.

PUBLICATIONS

Authors from our CRASH group at UT Austin are shown in bold font.