• Top
    • Documentation
    • Books
    • Boolean-reasoning
    • Projects
    • Debugging
    • Std
    • Proof-automation
    • Macro-libraries
    • ACL2
    • Interfacing-tools
    • Hardware-verification
    • Software-verification
      • Kestrel-books
        • Crypto-hdwallet
        • Apt
        • Error-checking
        • Fty-extensions
        • Isar
        • Kestrel-utilities
        • Set
        • Soft
        • C
          • Syntax-for-tools
          • Atc
            • Atc-implementation
            • Atc-tutorial
          • Language
          • Representation
          • Transformation-tools
          • Insertion-sort
          • Pack
        • Bv
        • Imp-language
        • Event-macros
        • Java
        • Bitcoin
        • Ethereum
        • Yul
        • Zcash
        • ACL2-programming-language
        • Prime-fields
        • Json
        • Syntheto
        • File-io-light
        • Cryptography
        • Number-theory
        • Lists-light
        • Axe
        • Builtins
        • Solidity
        • Helpers
        • Htclient
        • Typed-lists-light
        • Arithmetic-light
      • X86isa
      • Axe
      • Execloader
    • Math
    • Testing-utilities
  • C

Atc

ATC (ACL2 To C), a proof-generating C code generator for ACL2.

Introduction

This manual page contains user-level reference documentation for ATC. Users who are new to ATC should start with the tutorial, which provides user-level pedagogical information on how ATC works and how to use ATC effectively.

This manual page refers to the official C standard in the manner explained in the top-level XDOC topic of our C library.

General Form

(atc t1 ... tp
     :output-dir      ...  ; default "."
     :file-name       ...  ; no default
     :header          ...  ; default nil
     :pretty-printing ...  ; default nil
     :proofs          ...  ; default t
     :const-name      ...  ; default :auto
     :print           ...  ; default :result
  )

Inputs

t1, ..., tp

Names of ACL2 targets to translate to C.

Each ti must be a symbol that satisfies exactly one of the following conditions:

  • It is the symbol of an ACL2 function.
  • Its symbol-name is the name of a C structure type shallowly embedded in ACL2, introduced via defstruct.
  • Its symbol-name is the name of a C external object shallowly embedded in ACL2, introduced via defobject.

It is an error if two or more of the above conditions hold, because in that case the target does not unambiguously identify a function or a structure type or an external object.

The symbol-names of the targets must be all distinct. Even though C has different name spaces for (i) structure types and (ii) functions and external objects [C17:6.2.3], in C code generated by ATC these are treated as if they were in the same name space, so they must all have distinct names.

There must be at least one target function.

Each recursive target function must be called by at least one recursive or non-recursive target function that follows it in the list (t1 ... tp).

Each target function must satisfy the conditions discussed in the section `Representation of C Code in ACL2'.

:output-dir — default "."

Path of the directory where the generated C code goes.

This must be an ACL2 string that is a valid path to an existing directory in the file system; the path may be absolute, or relative to the connected book directory. The default is the connected book directory.

:file-name — no default

Name of the files that contain the generated C code, without the .h or .c extension.

This must be a non-empty ACL2 string that consists of ASCII letters, digits, underscores, and dashes. The full names of the generated files are obtained by appending the extension .h or .c to this string.

The files are generated in the directory specified by :output-dir. The files may or may not exist: if they do not exist, they are created; if they exist, they are overwritten.

:header — default nil

Specifies whether a header (i.e. a .h file) should be generated or not.

This must be one of the following:

  • t, to generate a header.
  • nil, to not generate a header.

A source file (i.e. a .c file is always generated; the :header input only affects the generation of the header.

If ATC is used to generate C code that is not standalone but is meant to be called by external C code, the :header input should be t, so that the external C code can include the header. If ATC is used to generate standalone C code, presumably including a function called main with appropriate types, then :header input should be nil.

:pretty-printing — default nil

Specifies options for how the generated C code is pretty-printed.

This must be a keyword-value list (opt-name1 opt-value1 opt-name2 opt-value2 ...) where each opt-namek is a keyword among the ones described below, and each opt-valuek is one of the allowed values for the corresponding keyword as described below.

The following pretty-printing options are supported:

  • :parenthesize-nested-conditionals t/nil, where t/nil is either t or nil. This option specifies whether a conditional expression that is the `then' or `else' branch of another conditional expression is parenthesized or not. The parentheses are not necessary according to the precedence rules of C, but may help make the code more readable. The default value of this option is nil, which means that no parentheses are added. If this option is t, then parentheses are added. For example, based on whether this option is nil or t, an expression may be printed as either
    a ? b ? c : d : e ? f g
    or
    a ? (b ? c : e) : (e ? f : g)
    .

This is currently the only supported pretty-printing option. More options, to control additional aspects of the pretty-print of the C code, may be added in the future.

:proofs — default t

Specifies whether proofs should be generated or not.

This must be one of the following:

  • t, to generate proofs.
  • nil, to not generate proofs.

While it is obviously recommended to generate proofs, setting this to nil may be useful in case proof generation is (temporarily) broken.

:const-name — default :auto

Name of the generated ACL2 named constant that holds the abstract syntax tree of the generated C program.

This must be one of the following:

  • :auto, to use the symbol *program* in the "C" package.
  • Any other symbol, to use as the name of the constant.

In the rest of this documentation page, let *program* be the symbol specified by this input, if applicable (i.e. when :proofs is t).

:print — default :result

Specifies what is printed on the screen.

It must be one of the following:

  • :error, to print only error output (if any).
  • :result, to print, besides any error output, also the results of ATC. This is the default value of the :print input.
  • :info, to print, besides any error output and the results, also some additional information about the internal operation of ATC.
  • :all, to print, besides any error output, the results, and the additional information, also ACL2's output in response to all the submitted events.

The errors are printed as error output. The results and the additional information are printed as comment output. The ACL2 output enabled by :print :all may consist of output of various kinds.

If :print is :error or :result or :info, ATC suppresses all kinds of outputs (via with-output) except for error and comment output. Otherwise, ATC does not suppress any output. However, the actual output depends on which outputs are enabled or not prior to the call of ATC, including any with-output that may wrap the call of ATC.

Representation of C Code in ACL2

Currently ATC supports the ACL2 representation of a single source file (i.e. a file with extension .c), optionally accompanied by a header (i.e. a file with extension .h), based on the :header input. If :header is nil, the source file consists of one or more C function definitions, zero or more C external object declarations, and zero or more C structure type declarations. If :header is t, the header consists of one or more function declarations, zero or more C external object declarations without initializers, and zero or more C structure type declarations, while the source file consists of one or more function definitions and zero or more C external object declarations with initializers, corresponding to the function declarations and the external object declarations in the header.

Each C structure type declaration is represented by a defstruct, whose name is passed as one of the targets ti to ATC. The symbol name, which is a portable ASCII C identifier (this is enforced by defstruct), represents the tag of the C structure type.

Each C external object declaration is represented by a defobject, whose name is passed as one of the targets ti to ATC. The symbol name, which is a portable ASCII C identifier (this is enforced by defobject), represents the name of the C external object.

Each C function definition (and declaration in the header, if present) is represented by an ACL2 function definition. These are the non-recursive target ACL2 functions ti passed to ATC; the recursive target ACL2 functions ti passed as inputs represent loops in the C functions instead, as explained below.

The order of the C structure types and external objects and functions in the files is the same as the order of the corresponding targets in the list (t1 ... tp) passed to ATC.

Each target function fn must be in logic mode and guard-verified. The function must not occur in its own guard, which is rare but allowed in ACL2.

The symbol name of each non-recursive function target fn must be a portable ASCII C identifier. That symbol name is used as the name of the corresponding C function. Therefore, the non-recursive target functions must have all distinct symbol names; even if they are in different packages, they must have distinct symbol names (the package names are ignored). There is no restriction on the symbol names of the recursive target functions: these represent C loops, not C functions; the names of the recursive target functions are not represented at all in the C code.

The symbol name of each formal parameter of each function target fn, both non-recursive and recursive, must be a portable ASCII C identifier. When fn is non-recursive, the symbol names of its parameters are used as the names of the formal parameters of the corresponding C function, in the same order, except for the formal parameters that represent external objects, as defined below. Therefore, the formal parameters of each fn must have all distinct symbol names; even if they are in different packages, they must have distinct symbol names (the package names are ignored). When fn is recursive, the symbol names of its parameters are used as names of C variables, as explained below.

When fn is recursive, it must have at least one formal parameter. When fn is non-recursive, it may have any number of formal parameters, including none.

If fn is recursive, it must be singly (not mutually) recursive, its well-founded relation must be o<, and its measure must yield a natural number. The latter requirement is checked via an applicability condition, as described in the `Applicability Conditions' section.

The guard of each fn must include, for every formal parameter x, exactly one conjunct of one of the following forms, possibly wrapped with mbt, which determines the C type of the corresponding parameter of the C function, or designates the formal parameter as representing an access to an object defined by defobject:

  • (scharp x), representing signed char.
  • (ucharp x), representing unsigned char.
  • (sshortp x), representing signed short.
  • (ushortp x), representing unsigned short.
  • (sintp x), representing signed int.
  • (uintp x), representing unsigned int.
  • (slongp x), representing signed long.
  • (ulongp x), representing unsigned long.
  • (sllongp x), representing signed long long.
  • (ullongp x), representing unsigned long long.
  • (star (scharp x)), representing signed char *.
  • (star (ucharp x)), representing unsigned char *.
  • (star (sshortp x)), representing signed short *.
  • (star (ushortp x)), representing unsigned short *.
  • (star (sintp x)), representing signed int *.
  • (star (uintp x)), representing unsigned int *.
  • (star (slongp x)), representing signed long *.
  • (star (ulongp x)), representing unsigned long *.
  • (star (sllongp x)), representing signed long long *.
  • (star (ullongp x)), representing unsigned long long *.
  • (schar-arrayp x), representing signed char [].
  • (uchar-arrayp x), representing unsigned char [].
  • (sshort-arrayp x), representing signed short [].
  • (ushort-arrayp x), representing unsigned short [].
  • (sint-arrayp x), representing signed int [].
  • (uint-arrayp x), representing unsigned int [].
  • (slong-arrayp x), representing signed long [].
  • (ulong-arrayp x), representing unsigned long [].
  • (sllong-arrayp x), representing signed long long [].
  • (ullong-arrayp x), representing unsigned long long [].
  • (struct-<tag>-p x), where <tag> is one of the defstruct targets ti, representing the corresponding C structure type, struct <tag>. The structure type must not have a flexible array member. The <tag> target must precede fn in the list of targets (t1 ... tp).
  • (star (struct-<tag>-p x)), where <tag> is one of the defstruct targets ti, representing a pointer type to the corresponding C structure type, struct <tag> *. The <tag> target must precede fn in the list of targets (t1 ... tp).
  • (object-<name>-p x), where <name> is one of the defobject targets ti, representing an access to the external object, which must be an explicit formal parameter in functional ACL2, while the C function accesses it directly. The <name> target must precede fn in the list of targets (t1 ... tp).

The conjuncts may be at any level of nesting, but must be extractable by flattening the and structure of the guard term: only conjuncts extractable this way count for the purpose of determining the C types of the formal parameters and which formal parameters represent accesses to external objects. The rest of the guard (i.e. other than the conjuncts above) is not explicitly represented in the C code.

Note the distinction between pointer types <integer type> * and array types <integer type> [] in the list above. Even though these types are equivalent for function parameters, and in fact array types are adjusted to pointer types for function parameters according to [C17], pointed-to integers and integer arrays are different in the ACL2 representation of C. (More in general, even in handwritten C code, using array types instead of pointer types for function parameters is useful to convey the intention that something is a pointer to (the first or some) element of an integer array as opposed to a pointer to a single integer.)

The aforementioned option of wrapping the conjuncts with mbt is useful for cases in which the function have stronger guards that imply those conjuncts. It would be inelegant to add redundant conjuncts to the guard (presumably via external transformations) for the sole purpose of communicating function parameter types to ATC. By allowing mbt, the redundancy can be explicated and proved (as part of guard verification), while making it easy for ATC to obtain the types.

The return type of the C function corresponding to each non-recursive target function fn is automatically determined from the body, as explained below.

The unnormalized body of each fn must be as follows:

  • If fn is non-recursive, the unnormalized body must be a statement term for fn with loop flag nil returning type T and affecting variables vars (as defined below), where each variable in vars is a formal parameter of fn with pointer or array type and where T is not void if vars is nil. (In general, vars is a subset of the formal parameters of fn with pointer or array type; that is, not all such formal parameters are necessarily affected.) The return type of the C function represented by fn is T.
  • If fn is recursive, the unnormalized body must be a loop term for fn affecting variables vars (as defined below), where each variable in vars is a formal parameter of fn.

The above-mentioned notions of (i) statement term for fn with loop flag L returning T and affecting vars and (ii) loop term for fn affecting vars are defined below, along with the notions of (iii) expression term for fn returning T and affecting vars, (iv) pure expression term for fn returning T, (v) expression term for fn returning boolean, (vi) C type of a variable, and (vii) assignable variable.

A statement term for fn with loop flag L returning T and affecting vars, where fn is a target function, L is either t or nil, T is a C type (possibly void), and vars is a list of distinct symbols, is inductively defined as one of the following:

  • An expression term for fn returning T and affecting vars, when L is nil, T is a non-void non-pointer non-array C type, and vars is nil. That is, an expression term returning a C value is also a statement term returning that C value. This represents a C return statement whose expression is represented by the same term, viewed as an expression term.
  • A term (mv ret var1 ... varn), when ret is an expression term for fn returning T and affecting no variables, L is nil, T is a non-void non-pointer non-array type, and vars is the list (var1 ... varn) with n ≥ 1. This represents a C return statement whose expression is represented by ret; the mv and the variables in vars represent no actual C code: they just represent variables that may have been modified by preceding code.
  • A term var, when L is nil, T is void, and vars is the singleton list (var). This represents no actual C code: it just serves to conclude preceding code that may modify var.
  • A term (mv var1 ... varn), when L is nil, T is void, and vars is the list (var1 ... varn) with n > 1. This represents no actual C code: it just serves to conclude preceding code that may modify var1, ..., varn.
  • A call of fn on variables identical to its formal parameters, when the C types of the variables are the same as the C types of the formal parameters, L is t, T is void, and fn is recursive. This represents no actual C code: it just serves to conclude the computation in the body of the loop represented by fn.
  • A call of if on (i) a test of the form (mbt ...), (ii) a `then' branch that is a statement term for fn with loop flag L returning T and affecting vars, and (iii) an `else' branch that may be any ACL2 term. This represents the same C code represented by the `then' branch. Both the test and the `else' branch are ignored, because ATC generates C code under guard assumptions.
  • A call of if on (i) a test that is an expression term for fn returning boolean and (ii) branches that are statement terms for fn with loop flag L returning T and affecting vars. This represents a C if conditional statement whose test expression is represented by the test term and whose branch blocks are represented by the branch terms.
  • A term (let ((var (declar term))) body), when the symbol name of var is a portable ASCII C identifier, the symbol name of var is distinct from the symbol names of all the other ACL2 variables in scope, term is an expression term for fn returning a non-void non-pointer non-array C type and affecting no variables, and body is a statement term for fn with loop flag L returning T and affecting vars. This represents, as indicated by the wrapper declar, a declaration of a C local variable represented by var, initialized with the C expression represented by term, followed by the C code represented by body. The C type of the variable is determined from the initializer.
  • A term (let ((var (assign term))) body), when var is assignable, term is an expression term for fn returning the same non-void non-pointer non-array C type as the C type of var and affecting no variables, and body is a statement term for fn with loop flag L returning T and affecting vars. This represents, as indicated by the wrapper assign, an assignment to the C local variable or function parameter represented by var, with the C expression represented by term as right-hand side, followed by the C code represented by body.
  • A term (let ((var (<type>-write term))) body), when <type> is among
    • schar
    • uchar
    • sshort
    • ushort
    • sint
    • uint
    • slong
    • ulong
    • sllong
    • ullong
    var is in scope, var has a pointer type whose referenced type is the C integer type corresponding to <type>, var is one of the symbols in vars, term is a pure expression term for fn returning the C integer type corresponding to <type>, body is a statement term for fn with loop flag L returning T and affecting vars. This represents a C assignment to the pointer variable represented by var with the value of the expression represented by term; the wrapper <type>-write signifies the writing to an integer by pointer.
  • A term (let ((var (<type>-array-write var term1 term2))) body), when <type> is among
    • schar
    • uchar
    • sshort
    • ushort
    • sint
    • uint
    • slong
    • ulong
    • sllong
    • ullong
    var is in scope, var has an array type whose element type is the C integer type corresponding to <type>, var is one of the symbols in vars, term1 is a pure expression term for fn returning a C integer type, term2 is a pure expression term for fn returning the C integer type corresponding to <type>, body is a statement term for fn with loop flag L returning T and affecting vars. This represents a C assignment to an element of the array represented by var with the subscript expression represented by term1 with the new element expression represented by term2, followed by the C code represented by body.
  • A term (let ((var (struct-<tag>-write-<member> term var))) body), when <tag> is a defstruct name, <member> is the name of one of the members of that defstruct, <member> has an integer type in the defstruct, var is assignable, var has the C structure type represented by <tag>, term is a pure expression term for fn returning the C integer type of <member>, body is a statement term for fn with loop flag L returning T and affecting vars. This represents a C assignment to a member of the structure represented by var by value (i.e. using .) with the new value expression represented by term, followed by the C code represented by body.
  • A term (let ((var (struct-<tag>-write-<member> term var))) body), when <tag> is a defstruct name, <member> is the name of one of the members of that defstruct, <member> has an integer type in the defstruct, var is in scope, var has a pointer type whose referenced type is the C structure type represented by <tag>, var is one of the symbols in vars, term is a pure expression term for fn returning the C integer type of <member>, body is a statement term for fn with loop flag L returning T and affecting vars. This represents a C assignment to a member of the structure represented by var by pointer (i.e. using ->) with the new value expression represented by term, followed by the C code represented by body.
  • A term (let ((var (struct-<tag>-write-<member>-element term1 term2 var))) body), when <tag> is a defstruct name, <member> is the name of one of the members of that defstruct, <member> has an integer array type in the defstruct, var is assignable, var has the C structure type represented by <tag> or the pointer type to that C structure type, term1 is a pure expression term for fn returning a C integer type, term2 is a pure expression term for fn returning the C type corresponding to <type>, body is a statement term for fn with loop flag L returning T and affecting vars. This represents a C assignment to an element of a member of the structure represented by var by value (i.e. using .) if var has structure type or by pointer (i.e. using -> if var has pointer type, using term1 as the index with the new value expression represented by term2, followed by the C code represented by body.
  • A term (let ((var term)) body), when var is assignable, var is among vars if it is a formal parameter of fn that has pointer or array type if fn is non-recursive, term is a statement term for fn with loop flag nil returning void and affecting var that is either a call of a target function that precedes fn in the list of targets (t1 ... tp) whose body term returns void and affects var or an if whose test is an expression term returning boolean (not a test (mbt ...)), and body is a statement term for fn with loop flag L returning T and affecting vars. This represents the C code represented by term, which may modify the variable represented by var, followed by the C code represented by body.
  • A term (mv-let (var var1 ... varn) (declarn term) body), when the symbol name of var is a portable ASCII C identifier, the symbol name of var is distinct from the symbol names of all the other ACL2 variables in scope, each vari is assignable, each vari is among vars if it is a formal parameter of fn that has pointer or array type if fn is non-recursive, term is an expression term for fn returning a non-void non-pointer non-array C type and affecting the variables (var1 ... varn), and body is a statement term for fn with loop flag L returning T and affecting vars. This represents, as indicated by the wrapper declarn, a declaration of a C local variable represented by var, initialized with the C expression represented by term, followed by the C code represented by body. Note that declarn stands for declar1, declar2, etc., based on whether n is 1, 2, etc.; this n is the number of side-effected variables var1, ..., varn, which is one less than the variables bound by the mv-let. The C type of the variable is determined from the initializer.
  • A term (mv-let (var var1 ... varn) (assignn term) body), when var is assignable, each vari is assignable, each vari is among vars if it is a formal parameter of fn that has pointer or array type if fn is non-recursive, term is an expression term for fn returning the same non-void non-pointer non-array C type as the C type of var and affecting the variables (var1 ... varn), and body is a statement term for fn with loop flag L returning T and affecting vars. This represents, as indicated by the wrapper assignn, an assignment to the C local variable or function parameter represented by var, with the C expression represented by term as right-hand side, followed by the C code represented by body. Note that assignn stands for assign1, assign2, etc., based on whether n is 1, 2, etc.; this n is the number of side-effected variables var1, ..., varn, which is one less than the variables bound by the mv-let.
  • A term (mv-let (var1 ... varn) term body), when n > 1, each vari is assignable, each vari is among vars if it is a formal parameter of fn that has pointer or array type if fn is non-recursive, term is a statement term for fn with loop flag nil returning void and affecting (var1 ... varn) that is either a call of a target function that precedes fn in the list of targets (t1 ... tp) whose body term returns void and affects (var1 ... varn) or an if whose test is an expression term returning boolean (not a test (mbt ...), and body is a statement term for fn with loop flag L returning T and affecting vars. This represents the C code represented by term, which may modify the variables represented by var1, ..., varn, followed by the C code represented by body.
  • A call of a recursive target function fn0 that precedes fn in the list of targets (t1 ... tp), on variables identical to its formal parameters, when the C types of the variables are the same as the C types of the formal parameters of fn0, L is nil, T is void, vars is not nil, and the body of fn0 is a loop term for fn0 affecting vars. This represents the C while statement represented by the body of fn0, as explained below.
  • A call of a non-recursive target function fn0 that precedes fn in the list of targets (t1 ... tp), on pure expression terms for fn returning non-void C types, when the C types of the terms are the same as the C types of the formal parameters, each term of pointer or array type is a variable identical to the corresponding formal parameter of fn0, L is nil, T is void, and the body of fn0 is a statement term for fn0 returning void and affecting vars. This represents an expression statement whose expression is a call of the C function corresponding to fn0.

A loop term for fn affecting vars, where fn is a target function and vars is a non-empty list of distinct symbols, is inductively defined as one of the following:

  • A call of if on (i) a test of the form (mbt ...), (ii) a `then' branch that is a loop term for fn affecting vars, and (iii) an `else' branch that may be any ACL2 term. This represents the same C code represented by the `then' branch. Both the test and the `else' branch are ignored, because ATC generates C code under guard assumptions.
  • A call of if on (i) a test that is an expression term for fn returning boolean, (ii) a `then' branch that is a statement term for fn with loop flag t returning void and affecting vars, and (iii) an `else' branch that is either the variable var when vars is the singleton (var), or the term (mv var1 ... varn) when vars is the list (var1 ... varn) with n > 1. This represents the C while statement whose controlling expression is represented by the test and whose body is represented by the `then' branch; the `else' branch represents no actual C code, because it just serves to complete the if.

An expression term for fn returning T and affecting vars, where fn is a target function, T is a non-void C type, and vars is a list of distinct symbols, is inductively defined as one of the following:

  • A pure expression term for fn returning T, when vars is nil.
  • A call of a non-recursive target function fn0 that precedes fn in the list of targets (t1 ... tp), on pure expression terms for fn returning C types, when the types of the terms are equal to the C types of the formal parameters of fn0, each term of pointer or array type is a variable identical to the corresponding formal parameter of fn0, and the body of fn0 is a statement term for fn0 returning T and affecting vars. This represents a call of the corresponding C function.

A pure expression term for fn returning T, where fn is a target function and T is a non-void C type, is inductively defined as one of the following:

  • A formal parameter of fn, when T is the type of the formal parameter. This represents either the corresponding C formal parameter (as an expression), or the corresponding C external object (as an expression), as explained earlier.
  • A variable introduced by let with declar (as described above), when T is the type of the variable. This represents the corresponding C local variable (as an expression).
  • A call of a function <type>-<base>-const on a quoted integer, when <type> is among
    • sint
    • uint
    • slong
    • ulong
    • sllong
    • ullong
    <base> is among
    • dec
    • oct
    • hex
    T is the C type corresponding to <type> and the quoted integer is non-negative and in the range of T. This represents a C integer constant of the C type indicated by the name of the function, expressed in decimal, octal, or hexadecimal base.
  • A call of a function <op>-<type> on a pure expression term for fn returning U, when <op> is among
    • plus
    • minus
    • bitnot
    • lognot
    <type> is among
    • schar
    • uchar
    • sshort
    • ushort
    • sint
    • uint
    • slong
    • ulong
    • sllong
    • ullong
    T is the C type corresponding to the return type of <op>-<type> and U is the C type corresponding to <type>. This represents the C operator indicated by the name of the function applied to a value of the type indicated by the name of the function. The guard verification requirement ensures that the operator yields a well-defined result. These functions covers all the C unary operators (using the nomenclature in [C17]).
  • A call of a function <op>-<type1>-<type2> on pure expression terms for fn returning U and V, when <op> is among
    • add
    • sub
    • mul
    • div
    • rem
    • shl
    • shr
    • lt
    • gt
    • le
    • ge
    • eq
    • ne
    • bitand
    • bitxor
    • bitior
    <type1> and <type2> are among
    • schar
    • uchar
    • sshort
    • ushort
    • sint
    • uint
    • slong
    • ulong
    • sllong
    • ullong
    T is the C type corresponding to the return type of <op>-<type1>-<type2>, U is the C type corresponding to <type1>, and V is the C type corresponding to <type2>. This represents the C operator indicated by the name of the function applied to values of the types indicated by the name of the function. The guard verification requirement ensures that the operator yields a well-defined result. These functions covers all the C strict pure binary operators; the non-strict operators && and ||, and the non-pure operators =, +=, etc., are represented differently.
  • A call of a function <type1>-from-<type2> on a pure expression term for fn returning U, when <type1> and <type2> are among
    • schar
    • uchar
    • sshort
    • ushort
    • sint
    • uint
    • slong
    • ulong
    • sllong
    • ullong
    and also differ from each other, T is the C type corresponding to <type1> and U is the C type corresponding to <type2>. This represents a cast to the type indicated by the first part of the function name. The guard verification requirement ensures that the conversion yields a well-defined result. Even though conversions happen automatically in certain circumstances in C, these functions always represent explicit casts; implict conversions are represented implicitly, e.g. via the function for a unary operator that promotes the operand.
  • A call of <type>-read on a pure expression term for fn returning U, when <type> is among
    • schar
    • uchar
    • sshort
    • ushort
    • sint
    • uint
    • slong
    • ulong
    • sllong
    • ullong
    T is the C type corresponding to <type>, and U is the pointer type to T. This represents the application of the indirection operator * to the expression represented by the argument of <type>-read.
  • A call of <type>-array-read on pure expression terms for fn returning U and V, when <type> is among
    • schar
    • uchar
    • sshort
    • ushort
    • sint
    • uint
    • slong
    • ulong
    • sllong
    • ullong
    T is the C type correponding to <type>, U is the array type of element type T, and V is a C integer type. This represents an array subscripting expression. The guard verification requirement ensures that the array access is well-defined.
  • A call of struct-<tag>-read-<member> on a pure expression term for fn returning U when <tag> is a defstruct name, <member> is the name of one of the members of that defstruct, <member> has an integer type in the defstruct, T is the C integer type of <member>, and U is the C structure type represented by <tag> or the pointer type to that C structure type. This represents an access to a structure member, by value if U is the C structure type (i.e. using .) or by pointer if U is the pointer type to the C structure type (i.e. using ->).
  • A call of struct-<tag>-read-<member>-element on pure expression terms for fn returning U and V when <tag> is a defstruct name, <member> is the name of one of the members of that defstruct, <member> has an integer array type in the defstruct, T is the C element type of the array type of <member>, U is a C integer type, and V is the C structure type represented by <tag> or the pointer type to that C structure type. This represents an access to an element of a structure member, by value if V is the C structure type (i.e. using .) or by pointer if V is the pointer type to the C structure type (i.e. using ->).
  • A call of sint-from-boolean on an expression term for fn returning boolean that is a call of and or or (not a call of any boolean-from-<type>), when T is int. This converts an expression term returning boolean to a pure expression term returning int. The restriction of the argument term of sint-from-boolean to be a call of and and or is so that such a call always represents the C int 0 or 1, which is what sint-from-boolean returns; the conversion from boolean to int is only in ACL2, not present in the C code, which just has the represented expression.
  • A call of condexpr on a call of if on (i) a test that is an expression term for fn returning boolean and (ii) branches that are pure expression terms for fn returning T, where T is not (signed or unsigned) char or short. This represents a C ?: conditional expression whose test expression is represented by the test term and whose branch expressions are represented by the branch terms. The restriction that T is the same for both branches and that it is not (signed or unsigned) char or short is so that the C ?: operator does not perform conversions on the branches, which would have to be represented explicitly in the ACL2 code that represents the C code.

An expression term for fn returning boolean, where fn is a target function, is inductively defined as one of the following:

  • A call of a function boolean-from-<type> on a pure expression term for fn returning U, when <type> is among
    • schar
    • uchar
    • sshort
    • ushort
    • sint
    • uint
    • slong
    • ulong
    • sllong
    • ullong
    U is the C type corresponding to <type>. This converts a pure expression term returning a C type to an expression term returning boolean.
  • A call of one of the following macros on expression terms for fn returning booleans:
    • and
    • or
    This represents the corresponding C logical operator conjunction && or disjunction ||); conjunctions and disjunctions are represented non-strictly.

The C type of a variable var is defined as follows:

  • If var is a formal parameter that does not represent an access to an external object, the C type is the one derived from the guard as explained earlier.
  • If var is a formal parameter that represents an access to an external object, the C type is derived from the definition of the recognizer object-<name>-p generated by defobject for that object. This may be an integer type or an integer array type; see defobject.
  • If var is not a formal parameter, it must be introduced by a let with declar or an mv-let with declarn, and its C type is the one of the term argument of declar or declarn.

The C type of a variable is never void.

A variable var bound in a let or mv-let in a target function fn is assignable when it is in scope, i.e. it is identical to a function parameter or to a variable bound in an enclosing let or mv-let, and additionally any of the conditions given below holds. The conditions make reference to the C scopes represented by the ACL2 terms that the let or mv-let is part of: there is a C scope for the whole file, then each function body is a nested C scope, and then each if branch whose test is an expression term returning a boolean (i.e. whose test is not mbt is a further nested C scope. The conditions are the following:

  • The function parameter or outer variable is in the same C scope where var occurs.
  • The variable var is among vars, i.e. it is being affected.
  • No variables are being affected, i.e. vars is nil.

Any of these conditions ensures that, in the ACL2 code, the old value of the variable cannot be accessed after the new binding: (i) if the first condition holds, the new binding hides the old variable; (ii) if the second condition holds, it means that the outer binding will receive the value at the end of the changes to the variables; and (iii) if the third condition holds, there is no subsequent code because there is no change to the variables. These justify destructively updating the variable in the C code.

Statement terms represent C statements, while expression terms represent C expressions. The expression terms returning booleans return ACL2 boolean values, while the statement terms, including expression terms returning C values, return ACL2 values that represent C values: the distinction between boolean terms and other kinds of terms stems from the need to represent C's non-strictness in ACL2: C's non-strict constructs are if statements, ?: expressions, && expressions, and || expressions; ACL2's only non-strict construct is if (which the macros and and or expand to). Pure expression terms returning C values represent C expressions without side effects; C function calls may be side-effect-free, but in general we do not consider them pure, so they are represented by expression terms returning C values that are not pure expression terms returning C values. Expression terms returning booleans are always pure; so they do not need the explicit designation `pure' because they are the only expression terms returning booleans handled by ATC. Recursive ACL2 functions represent C loops, where those recursive functions are called. The loop flag L serves to ensure that the body of a loop function ends with a recursive call on all paths (i.e. at all the leaves of its if structure, and that recursive calls of the loop function occur nowhere else.

The body of the C function represented by each non-recursive fn is the compound statement consisting of the block items (i.e. statements and declarations) represented by the ACL2 function's body (which is a statement term).

The restriction that each function fn may only call a function that precedes it in the list of targets (t1 ... tp), means that no (direct or indirect) recursion is allowed in the C code and the target functions must be specified in a topological order of their call graph.

The guard verification requirement ensures that the constraints about C types described above holds, for code reachable under guards. Code unreachable under guards is rare but possible. In order to generate C code that is always statically well-formed, ATC independently checks the constraints about C types.

Translated Terms

The description of the representation of C code in ACL2 above talks about untranslated terms, but ATC operates on translated terms, since it looks at unnormalized bodies of ACL2 functions. This section describes how the untranslated terms mentioned above appear as translated terms: these are the patterns that ATC looks for.

An untranslated term (and a b) is translated to

(if a b 'nil)

An untranslated term (or a b) it translated to

(if a a b)

An untranslated term (mbt x) is translated to

(return-last 'acl2::mbe1-raw 't x)

Since mbt$ expands to mbt, the former can be used too.

An untranslated term (mv var1 ... varn) is translated to

(cons var1 (cons ... (cons varn ' nil)...))

An untranslated term (let ((var (declar term))) body) is translated to

((lambda (var) body) (declar term))

An untranslated term (let ((var (assign term))) body) is translated to

((lambda (var) body) (assign term))

An untranslated term (let ((var term)) body) is translated to

((lambda (var) body) term)

An untranslated term (mv-let (var var1 ... varn) (declarn term) body) is translated to

((lambda (mv)
         ((lambda (var var1 ... varn) body)
          (mv-nth '0 mv)
          (mv-nth '1 mv)
          ...
          (mv-nth 'n mv)))
 ((lambda (mv)
          ((lambda (*0 *1 *2 ... *n)
                   (cons (declar *0) (cons *1 ... (cons *n 'nil))))
           (mv-nth '0 mv)
           (mv-nth '1 mv)
           ...
           (mv-nth 'n mv)))
  term))

An untranslated term (mv-let (var var1 ... varn) (assignn term) body) is translated to

((lambda (mv)
         ((lambda (var var1 ... varn) body)
          (mv-nth '0 mv)
          (mv-nth '1 mv)
          ...
          (mv-nth 'n mv)))
 ((lambda (mv)
          ((lambda (*0 *1 *2 ... *n)
                   (cons (assign *0) (cons *1 ... (cons *n 'nil))))
           (mv-nth '0 mv)
           (mv-nth '1 mv)
           ...
           (mv-nth 'n mv)))
  term))

An untranslated term (mv-let (var1 var2 ... varn) term body) is translated to

((lambda (mv)
         ((lambda (var1 ... varn) body)
          (mv-nth '0 mv)
          (mv-nth '1 mv)
          ...
          (mv-nth 'n-1 mv)))
 term)

Since ATC operates on translated terms, there is no direct restriction on the untranslated bodies of the functions, which in particular may use any macro or any named constant, so long as their translated form satisfies all the requirements stated in this ATC documentation; the restrictions on the translated bodies thus impose indirect restrictions on the untranslated bodies. Note also that ATC treats, for instance, a translated term of the form (if a b 'nil) as if it were the translation of (and a b) in an untranslated function body, even though that untranslated function body may include (if a b 'nil) directly, or some other macro that expands to that: this does not cause any problem, because if two untranslated terms become the same translated term, then they are equivalent for ATC's purposes.

Applicability Conditions

In addition to the requirements on the inputs stated above, the following applicability conditions must be proved. The proofs are attempted when ATC is called. No hints can be supplied to ATC for these applicability conditions; (possibly local) lemmas must be provided before calling ATC that suffice for these applicability conditions to be proved without hints.

:natp-of-measure-of-fn

The measure of the recursive target function fn must yield a natural number:

(natp <fn-measure>)

where <fn-measure> is the measure of fn.

There is one such applicability condition for each recursive target function fn.

These applicability conditions do not need to be proved if :proofs is nil.

Generated Events

Constant

ATC generates an event

(defconst *program* ...)

where ... is the abstract syntax tree of the generated C files, which ATC also pretty-prints and writes to the path(s) specified by the :output-dir and :file-name inputs.

If the :proofs input is nil, this constant is not generated.

Theorems

ATC generates an event

(defthm *program*-well-formed ...)

where ... is an assertion about *program* stating that the generated (abstract syntax tree of the) files is statically well-formed, i.e. it compiles according to [C17].

If the :proofs input is nil, this theorem is not generated.

For each target function fn, ATC generates an event

(defthm *program*-fn-correct ...)

where ... is an assertion about fn and *program* stating that, under the guard of fn, executing the C dynamic semantics on the C function generated from fn yields the same result as the function fn. That is, the C function is functionally equivalent to the ACL2 function.

If the ACL2 function takes arrays or pointers as inputs, the generated correctness theorem includes hypotheses saying that the pointed objects are all at different addresses. The formal model of C that the proofs rely on assumes that objects do not overlap. Thus, the guarantees provided by the generated theorems about the C code hold only if pointers to distinct, non-overlapping objects are passed to the generated C functions.

If the :proofs input is nil, this theorem is not generated.

Generated C Code

ATC generates a single source file, optionally accompanied by a header, as explained in Section `Representation of C Code in ACL2'.

The guard verification requirement stated earlier for each function implies that the generated C operations never exhibit undefined behavior, provided that they are called with values whose ACL2 counterparts satisfy the guard of the function.

Compiling and Running the C Code

The generated C code can be compiled and run as any other C code. For instance, one can use gcc on macOS or Linux.

As mention in the description of the :header input above, if a header is generated, external code should include the header, and the generated source file should be compiled and linked with the external code to obtain a full application. To test the generated code, one can write a separate source file with a main function, and compile all files together. By default, an executable a.out is generated (if using gcc on macOS or Linux).

The community books directory kestrel/c/atc/tests contains some examples of C code generation and handwritten C code to test the generated code.

Redundancy

A call of ATC is redundant if and only if it is identical to a previous successful call of ATC.

Subtopics

Atc-implementation
Implementation of atc.
Atc-tutorial
ATC tutorial.