Search-engine friendly clone of the ACL2 documentation.

X86isa-state

X86isa-state-history

A short history of the definition of the x86 state

Old definition of the x86isa state

Before bigmems::bigmems and rstobj2::defrstobj were used to define the x86 state, the x86 state's definition was rather tedious. (For future reference, the following git revision has that old definition: dea40263247bd930077205526934bc596686bfb0).

This current file state.lisp replaces the following old files, all of which are deleted now.

concrete-state.lisp
concrete-memory.lisp
abstract-state.lisp
state-field-thms.lisp

The old x86isa memory model was based on the FMCAD'2012 paper "A Formal Model of a Large Memory that Supports Efficient Execution". The following comment from the old file concrete-state.lisp describes it briefly:

.... 
;; For our ACL2 model, we define a paging-like mechanism to 
;; model the complete x86 52-bit address space.  The memory is 
;; laid out in a flat array, to be viewed as back-to-back 
;; "pseudo-pages" each of size 2^27 bytes (128MB).  The address 
;; of a byte is split into two pieces: a 25-bit pseudo-page 
;; address and a 27-bit offset within a page.  The mem-table 
;; data structure is of size *mem-table-size* = 2^25 -- thus, 
;; indices are 25 bits -- and it maps these indices to 25-bit 
;; pseudo-page addresses.  However, the mem-table values are 
;; actually 26-bit values: the least significant bit is 
;; initially 1, but is 0 when the entry is valid, in which case 
;; the most significant 25 bits represent a pseudo-page address. 
;; The offset of a byte address is a 27-bit wide address, which 
;; when added to (ash pseudo-page-address 27), gives the "real" 
;; address of a byte stored in mem-array.  Mem-array-next-addr 
;; keeps track of the 25-bit index of the pseudo-page to be 
;; allocated next. 
 
;; Here is an example of how this works.  Suppose we have the 
;; following, where again, the least significant bit of 0 
;; indicates a valid entry, and where {i, 1'bx} denotes 
;; concatenation of the bit-vector i with the single bit x. 
 
;;   mem-table[#x0654321] = {0, 1'b0} 
;;   mem-table[#x16789ab] = {1, 1'b0} 
;;   mem-table[#x0111111] = {2, 1'b0} 
 
;; All additional values in mem-table are the initial value of 
;; 1, which means "page is not present". 
 
;; Then mem-array starts as follows. 
 
;;  2^27 bytes corresponding to byte addresses with top 25 bits = #x0654321 
;;  2^27 bytes corresponding to byte addresses with top 25 bits = #x16789ab 
;;  2^27 bytes corresponding to byte addresses with top 25 bits = #x0111111 
 
(mem-table :type (array (unsigned-byte #.*mem-table-size-bits+1*) 
                        ;; *mem-table-size-bits* of pseudo-page 
                        ;; address, followed by 1 extra bit 
                        ;; (LSB) to indicate the 
                        ;; absence/presence of pseudo-pages 
                        (#.*mem-table-size*)) 
           :initially 1 
           :resizable nil) 
 
(mem-array :type (array (unsigned-byte 8) 
                        (#.*initial-mem-array-length*)) 
           :initially 0 
           :resizable t) 
 
(mem-array-next-addr :type (integer 0 #.*mem-table-size*) 
                     :initially 0) 
 
...

The proof of correctness of this memory model was pretty involved (see the old file concrete-memory.lisp). Then there was the overhead of introducing an abstract stobj to logically view this memory as a record (see old file abstract-stobj.lisp).

All of this was finished before nested stobjs were supported in ACL2, so there was not much incentive to simplify the memory model or the state definition.

Non-interference Properties

We used a neat trick for all the accessors and updaters of the x86isa state's field that is worth mentioning here. Say RGFI* is the name of the corresponding exported accessor from defabsstobj; during execution, RGFI* calls are really RGF$CI calls, the concrete accessor from the corresponding concrete stobj.

 Signature: (RGFI* index x86)

We also defined a universal accessor function XR, meant to be used only for reasoning:

 Signature: (XR fld-name index x86)

XR case-split on fld-name and called the appropriate accessor function (e.g., if fld-name was :RGF, XR called RGFI*, and so on). We defined a similar universal updater function XW. We then proved read-over-write/write-over-write, etc. lemmas about just XR and XW, thereby getting these properties for all fields of the x86 state. This way, we avoid a blow-up in the number of such lemmas as the number of fields in the state increase.

We then defined another function, RGFI (inlined), with the same signature as RGFI*, whose body was an MBE like so:

(RGFI index x86) := 
 (mbe :logic (XR :RGF index x86) 
      :exec (RGFI* index x86))

We then used RGFI as the canonical accessor function for the RGF field --- we never used RGFI* or XR in our definitions from this point on, though XR was often used in definitions where we didn't care about the execution speed (e.g., non-executable functions used only for reasoning). We kept RGFI enabled and XR disabled.

The consequence of this was that whenever an RGFI call was encountered during reasoning, it quickly opened up to XR (about which we have all those nice theorems). During execution, RGFI was simply the efficient concrete accessor RGF$CI.