Code Verification for Practical Machine Architectures
This is the project page for the DARPA CRASH effort at the University
of Texas at Austin.
People
Recent additions to the group:
Cuong Kim Chau
(CS Graduate Student) |
Keshav Kini
(CS Graduate Student) |
Corbyn Salisbury
(CS Undergraduate Student) |
Théo Zimmermann
(CS Graduate Student) |
Problem Description
With pervasive use of software artifacts in every aspect of human
life, the cost of malfunctioning software today is staggering. Formal
methods have shown significant promise to improve software reliability
dramatically. However, deductive or theorem proving solutions, based
on human-guided, semi-automatic mechanical reasoning often involve
prohibitive manual effort, while fully automatic approaches based on
decision procedures can quickly run out of available time and memory.
Better tool suites and methodologies are needed in order to achieve
the promise of formal methods for achieving correct, resilient
computing systems. This project focuses on the creation of formal
tools for code verification, in particular for x86 machine code, with
a related goal of improving the ACL2 theorem proving environment.
Research Goals
- Develop an executable, formal specification for a significant
subset of the x86 instruction set architecture (ISA). Our current
version of the model has a simulation speed of ~2.4 million
instructions/second when paging is disabled and and ~560,000
instructions/second with a four-level page table configuration. We
are working on further improving performance of the memory management
implementation of our model.
- Construct a mechanized proof assistant for x86 binary code. The
development of a formal specification, discussed above, is a critical
step towards reaching this goal. The efficient executability of that
specification is also critical, in order to validate the model by
co-simulation against actual x86 execution. We will use
the ACL2
theorem prover to carry out code proofs, both using BDD/SAT-based
symbolic simulation and by developing methods based on general-purpose
theorem proving techniques, especially rewriting and induction.
- Improve
the ACL2
theorem proving environment. This is a long-term goal that our
research group continues to pursue. ACL2 is our core formal
verification technology, so improvements to ACL2 support the research
goals discussed above, such as the
``abstract
stobjs'' feature.
- Develop tools to gain confidence in satisfiability (SAT) solvers.
Researchers develop high performance applications called SAT solvers
to efficiently solve satisfiability problems, and any problem that can
be encoded into a Boolean formula can be investigated with SAT
solvers. It is easy to check a solution to a satisfiability problem
if one can be found. However, if a solver outputs that a formula is
unsatisfiable, we must trust that the solver exhausted all possible
assignments. Existing verification tools cannot verify all techniques
used in state-of-the-art solvers.
- Create automated, sophisticated satisfiability encodings with
respect to software verification. The encoding of problems, such as
software verification, into Boolean formulas has a huge impact on the
performance of SAT solvers. We will develop techniques that, given a
certain high-level description, will make a high quality translation
of the problem into a Boolean formula.
Technical Approach
Our technical approach to x86 code verification is based on building a
formal interpreter-style model for the x86 instruction set
architecture (ISA) in the language of
the ACL2 theorem
prover. By making our model efficient to execute on concrete
data, we are validating our model by co-simulation against execution
on an actual x86 processor.
Our previous work suggests that we can build ACL2 proof infrastructure
on top of that model, though the complexity of the x86 architecture
presents new challenges. In support of this work, we will continue to
improve the ACL2 proof environment. We will also employ symbolic
simulation, which is an effective approach to software verification,
and develop capabilities to apply, in a sound way, BDDs and SAT
techniques, which together are the most used techniques for symbolic
simulation.
Dissemination of Research Results
We intend to make our x86 ISA specification and associated proof
technology available from this page as it matures. A much simpler
model is based on the y86 ISA, developed in the textbook Computer Systems: A Programmer's
Perspective, 2/E by Bryant and O'Hallaron (in particular, Chapter 4).
We provide here an early version of our y86
model; for a recent version, see
directory models/y86/
in the ACL2 Community
Books.
SOFTWARE
The ACL2 theorem proving system has been publicly available for about
two decades, and is currently available
from the ACL2
home page. It is in regular use at several companies, including
AMD, Centaur Technology (VIA Technology), IBM, and Rockwell Collins.
It has and continues to be used by U.S. government personnel, some of
whom have expressed interest in our x86 model in particular.
Visit the ACL2
home page to obtain the latest version of ACL2.
PUBLICATIONS
Authors from our CRASH group at UT Austin are shown in bold
font.
- Blocked Clause Decomposition.
Marijn J. H. Heule and Armin Biere.
LPAR-19, Logic for Programming Artificial Intelligence and Reasoning
(to appear).
- Trimming while Checking Clausal Proofs.
Marijn J. H. Heule, Warren A. Hunt, Jr., and Nathan Wetzler.
FMCAD 2013, 13th Conference on Formal Methods in Computer-Aided Design.
- Bridging the Gap Between Easy Generation and Efficient Verification of Unsatisfiability Proofs.
Marijn J. H. Heule, Warren A. Hunt Jr., and Nathan Wetzler.
Software Testing, Verification and Reliability (to appear).
- Guided Merging of Sequence Diagrams.
Magdalena Widl, Armin Biere, Petra Brosch, Uwe Egly, Marijn Heule, Gerti Kappel, Martina Seidl, and Hans Tompits.
Conference on Software Language Engineering 2012.
- Concurrent Cube-and-Conquer.
Peter van der Tak, Marijn J.H. Heule, and Armin Biere.
Pragmatics of SAT 2012.
- A Formal Model of a Large Memory that Supports Efficient Execution.
Warren A. Hunt, Jr. and Matt Kaufmann.
Formal Methods in Computer-Aided Design 2012.
- Towards a Formal Model of the X86 ISA.
Warren A. Hunt, Jr. and Matt Kaufmann.
Technical Report, Dept. of Computer Science, University of Texas at Austin.
- Automated Reencoding of Boolean Formulas.
Norbert Manthey, Marijn J. H. Heule and Armin Biere.
Haifa Verification Conference 2012.
- Revisiting Hyper Binary Resolution.
Marijn J. H. Heule, Matti Jarvisalo, and Armin Biere.
Conference on Integration of Artificial Intelligence and Operations Research techniques in Constraint Programming (2013).
- Enhancements to ACL2 in Versions 5.0, 6.0, and 6.1.
Matt Kaufmann and J Strother Moore.
ACL2 Workshop 2013, Electronic Proceedings in Theoretical Computer Science.
- A Parallelized Theorem Prover for a Logic with Parallel Execution.
David L. Rager (now at Oracle), Warren A. Hunt, Jr., and Matt Kaufmann.
ITP 2013, 4th Conference on Interactive Theorem Proving.
- Abstract Stobjs and Their Application to ISA Modeling.
Shilpi Goel, Warren A. Hunt, Jr., and Matt Kaufmann.
ACL2 Workshop 2013, Electronic Proceedings in Theoretical Computer Science.
- Automated Code Proofs on a Formal Model of the X86
Shilpi Goel and Warren A. Hunt, Jr.
VSTTE 2013 (Verified Software: Theories, Tools, and Experiments).
- Verifying Refutations with Extended Resolution.
Marijn J. H. Heule, Warren A. Hunt, Jr., and Nathan Wetzler.
Conference on Automated Deduction 2013.
- Mechanical Verification of SAT Refutations with Extended Resolution.
Nathan Wetzler, Marijn J. H. Heule, and Warren A. Hunt, Jr.
ITP 2013, 4th Conference on Interactive Theorem Proving.
- A SAT Approach to Clique-Width.
Marijn J. H. Heule and Stefan Szeider.
International Conference on Theory and Applications of Satisfiability Testing (2013).