CS378 A Formal Model of the Java Virtual Machine

Spring, 2012

Important Note: The tests are half of your grade and occur just before Spring Break and the end of semester. Do not miss the tests by leaving campus early!

Assignments and Supplemental Material

Summary

We will study a formal specification of the Java Virtual Machine (JVM). The JVM is a stack-based, object-oriented, type-safe bytecode (assembly language) interpreter on which compiled Java programs are executed.

But the focus of the course will be on teaching you how to formalize a comparably complicated computing artifact and how to subsequently use that formalization. That is, we will be more interested in formalization techniques than in the JVM specifically.

You will learn about four different things: how to make a mathematical model of a complicated digital artifact like the JVM, how to program in a simple functional language, how to reason about such models, and how to use a powerful automatic reasoning tool.

Textbooks

Other Useful Resources

Grades

You will note that 105% is accounted for above. The extra 5% may be considered “slack points” so that, for example, you may miss several classes and still make a perfect 100%.

Extra Credit: Extra credit will be given for projects presented at the end of the semester. Possibilities for projects will be discussed from time to time in class. If you have a project proposal, discuss it with me before you invest time in it. You may work with others on projects.

Pre-Requisites

Upper-division standing is required for all CS378 classes.

If the question is “What do I have to know in order to do well in this course?” as opposed to “What are the university rules?” the answer is: mathematical logic, including induction, and some experience programming in some language, preferably Java. You should be able to use Eclipse or Emacs. Experience with Lisp or ACL2 is helpful but the subset we use is relatively small and will be taught (quickly).

Tools

I will teach you how to define and run programs in the ACL2 programming language and to use the ACL2 theorem prover. You may use either the Eclipse or Emacs interface to ACL2. Both are available on the CS Department's public machines. You may also wish to install ACL2, Emacs, and/or Eclipse on your own machine.

See How to Use ACL2s to get started.

Lecture and Discussion Schedule

We'll approach the JVM model incrementally, starting with a very simple (suggestive but inaccurate) model. Then we will extend and revise it repeatedly toward a more accurate description of the JVM. We'll learn the necessary functional programming and proof techniques by building the simplest model. Most of the semester will be spent extending and exploring more elaborate models.

You will be expected to do much of the formalization work here, and extra-credit project ideas may come out of our discussions. For example, good projects might include the formalization or elaboration of features not discussed in class or the mechanized proofs of some of the properties discussed.

We will adhere pretty closely to the following sequence of topics. But since many classes will be presentations by students in answer to questions raised by the instructor, the pace may vary somewhat.

All dates below are speculative.

Wed, Jan 18  Introduction
Mon, Jan 23  building M1 -- functional programming in ACL2
Wed, Jan 25  building M1 -- functional programming in ACL2
Mon, Jan 30  building M1 -- functional programming in ACL2
Wed, Feb  1  reasoning about M1 “by hand”
Mon, Feb  6  reasoning about M1 “by hand”
Wed, Feb  8  quick introduction to how ACL2's prover works
Mon, Feb 13  mechanized proofs about M1
Wed, Feb 15  mechanized proofs about M1
Mon, Feb 20  mechanized proofs about M1
Wed, Feb 22  mechanized proofs about M1
Mon, Feb 27  the class table, the heap, and threads
Wed, Feb 29  macros for managing an elaborate state
Mon, Mar  5  M5 -- a fairly realistic JVM model
Wed, Mar  7  Midterm Test
Mon, Mar 12  Spring Break
Wed, Mar 14  Spring Break
Mon, Mar 19  object creation and manipulation
Wed, Mar 21  method resolution and invocation
Mon, Mar 26  threads and monitors
Wed, Mar 28  M5
Mon, Apr  2  mechanized proofs about M5
Wed, Apr  4  mechanized proofs about M5
Mon, Apr  9  mechanized proofs about M5
Wed, Apr 11  mechanized proofs about M5
Mon, Apr 16  extending M5
Wed, Apr 18  extending M5
Mon, Apr 23  extending M5
Wed, Apr 25  M6 -- an accurate JVM model
Mon, Apr 30  M6 -- an accurate JVM model
Wed, May  2  Last Test

More about Course Content

A mathematical logic is a formal system consisting of a precisely defined syntax, some axioms, and some rules of inference. The axioms are just formulas in the syntax: formulas that are taken to be “always true.” The rules of inference are formula transformers that preserve truth. A theorem is a formula that can be derived from the axioms by applying the rules of inference. A theorem is thus “always true.” By modeling a computing system in a mathematical logic we can prove theorems about it to establish its properties.

You studied formal mathematical logic in CS313K and in CS336. There you learned propositional calculus as a formal system. You also learned first order predicate calculus. You might have also learned set theory. So which mathematical logic do we use to describe the Java Virtual Machine?

The mathematical logic we use is a functional programming language, Pure Lisp. If you know anything at all about Lisp, you probably think of it as merely a programming language. But we cast it as a logic, with a precisely given syntax, some axioms, and some rules of inference. We will prove theorems in Lisp.

Put another way, in this course you will come to understand the JVM by studying a model of the Java Virtual Machine written in a functional programming language.

We will cover representatives of most of the JVM bytecodes, including IADD, ILOAD, ISTORE, IFGT, GOTO, NEW, PUTFIELD, INVOKEVIRTUAL, and MONITORENTER. We will not cover the entire JVM; for example, we will not deal with the details of arithmetic, arrays, class loading, or native methods. However, by the end of this course you will be able to write formal specifications of many of the omitted parts.
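To give a rough feel for what a functional model of a stack machine looks like, here is a minimal sketch in Python. It is not the course's M1 model, which is written in ACL2. The State layout, the step and run functions, and the ICONST and HALT instructions are all assumptions made for this illustration, branch offsets are counted in instructions rather than bytes, and integers are unbounded (the details of JVM arithmetic are ignored, as noted above).

```python
from typing import NamedTuple, Tuple


class State(NamedTuple):
    """Machine state: program counter, local variables, operand stack, program."""
    pc: int
    local_vars: Tuple[int, ...]
    stack: Tuple[int, ...]          # top of stack is the last element
    program: Tuple[tuple, ...]


def step(s: State) -> State:
    """Return the state reached by executing the instruction at s.pc.

    The function is pure: it never mutates s, it builds a new State.
    """
    op, *args = s.program[s.pc]
    if op == "ICONST":              # hypothetical: push a constant
        return s._replace(pc=s.pc + 1, stack=s.stack + (args[0],))
    if op == "ILOAD":               # push local variable args[0]
        return s._replace(pc=s.pc + 1, stack=s.stack + (s.local_vars[args[0]],))
    if op == "ISTORE":              # pop into local variable args[0]
        i = args[0]
        new_locals = s.local_vars[:i] + (s.stack[-1],) + s.local_vars[i + 1:]
        return s._replace(pc=s.pc + 1, local_vars=new_locals, stack=s.stack[:-1])
    if op == "IADD":                # pop two values, push their sum
        total = s.stack[-2] + s.stack[-1]
        return s._replace(pc=s.pc + 1, stack=s.stack[:-2] + (total,))
    if op == "GOTO":                # jump by a relative offset
        return s._replace(pc=s.pc + args[0])
    if op == "IFGT":                # pop; jump by the offset if the value is > 0
        target = s.pc + args[0] if s.stack[-1] > 0 else s.pc + 1
        return s._replace(pc=target, stack=s.stack[:-1])
    return s                        # HALT (hypothetical) or unknown: no change


def run(s: State, n: int) -> State:
    """Apply step n times."""
    for _ in range(n):
        s = step(s)
    return s


# Local 0 holds n, local 1 holds the running total n + (n-1) + ... + 1.
SUM_PROGRAM = (
    ("ILOAD", 0),    # 0: push n
    ("IFGT", 2),     # 1: if n > 0, jump to 3
    ("HALT",),       # 2: done
    ("ILOAD", 1),    # 3: push total
    ("ILOAD", 0),    # 4: push n
    ("IADD",),       # 5: total + n
    ("ISTORE", 1),   # 6: total := total + n
    ("ILOAD", 0),    # 7: push n
    ("ICONST", -1),  # 8: push -1
    ("IADD",),       # 9: n - 1
    ("ISTORE", 0),   # 10: n := n - 1
    ("GOTO", -11),   # 11: back to 0
)

final = run(State(0, (3, 0), (), SUM_PROGRAM), 100)
print(final.local_vars)  # (0, 6)
```

Because step is a pure function from states to states, a claim such as "running this program from n = 3 leaves 6 in local 1" is a statement about function application, which is the kind of statement one can try to prove rather than merely test.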

We will discuss the Java bytecode verifier; in particular, we will investigate its specification: what properties should it have?

The logic we use is supported by a mechanical theorem prover, ACL2. This theorem prover is in use in industry to verify properties of hardware, microcode, and software. In fact, its authors won the 2005 ACM Software System Award for the lasting influence their theorem provers have had on computer science.

This course is an unusual mixture of many CS courses. It is like CS307 in that we will be dealing with the Java programming language. It is like CS310 in that we will be looking at an assembly level language. It is like CS352 in that we will be considering the architectural features of the processor. It is like CS372 in that we will be considering process management, memory management, protection, thread scheduling, and concurrency. It is like CS313K in that we will be dealing with a formal logic. It is like CS336 in that we will be formally modeling and proving theorems about our programs and algorithms. It is like parts of CS343 in that we will be discussing mechanized reasoning.

Other Administrative Matters

Religious Holy Days: A student who is absent from an examination or cannot meet an assignment deadline due to the observance of a religious holy day may take the examination on an alternate day, submit the assignment up to 24 hours late without penalty, or be excused from the examination or assignment, if proper notice of the planned absence has been given. Notice must be given at least fourteen days prior to the classes scheduled on dates the student will be absent. For religious holy days that fall within the first two weeks of the semester, notice should be given on the first day of the semester. It must be personally delivered to the instructor and signed and dated by the instructor, or sent via certified mail, return receipt requested. Email notification will be accepted if received, but a student submitting such notification must receive email confirmation from the instructor. A student who fails to complete missed work within the time allowed will be subject to the normal academic penalties.

Disability Related Needs: Please notify me of any modification/adaptation you may require to accommodate a disability-related need. You will be requested to provide documentation to the Office of the Dean of Students in order that the most appropriate accommodations can be determined. Specialized services are available on campus through Services for Students with Disabilities, SSB 4th floor, A5800, 471-6259, TTY 471-4641.

Emergencies and Illness: Documented emergencies and illnesses will be dealt with by the instructor. For best results, communicate with me before you miss a midterm or the final and be prepared to supply written, verifiable evidence of the condition.

Code of Conduct: For important other advice about expectations and conduct, see The Computer Sciences Department Rules to Live By.