AleoVM circuits in PFCS form.
See circuits for background.
We formalize and verify circuits for the operations (and sub-operations) in snarkVM. Each circuit is a PFCS predicate definition, whose parameters represent inputs and outputs. The distinction between inputs and outputs is not captured in PFCSes, which are just constraint systems; but it is captured in the specifications for the circuits.
Currently snarkVM does not generate circuits for all operations:
for efficiency, some operations are handled on the fly.
An example is boolean negation:
snarkVM represents a boolean as a linear combinations
(result-linear-combination) (1) = (result-variable)For example, we use
Certain circuits just consist of one instance, e.g. a circuit for field multiplication: there is exactly one. We formalize these as nullary ACL2 functions that return the abstract syntax of a PFCS definition for the circuit; we specify them in concrete syntax, and use our PFCS parser to turn them into abstract syntax (the parser is not formally verified, but it is simple, and the abstract syntax can be inspected if desired; nonetheless, at some point we should also verify the parser). The definition has a name that is unique among all the snarkVM circuits that we define. The variables also have chosen names, whose significance is only local to the definition, because PFCSes support variable instantiation (when a predicate is called on argument expressions).
We use our PFCS lifter to lift the deeply embedded circuit into a shallowly embedded circuit, i.e. an ACL2 predicate; the lifter also generates a theorem connecting the deeply and shallowly embedded forms. The shallowly embedded form is more convenient for verification, because it does away with the indirection between syntax and semantics, i.e. it captures the semantics directly. We prove a correctness theorem for the lifted circuit, in the form of a rewrite rule that rewrites the predicate to some specification, possibly under certain precondition assumptions. The equivalence embodied by the rewrite rule includes soundness (i.e. every solution of the circuit's constraints satisfies the specification) and completeness (i.e. everything satisfying the specification induces a solution to the circuit's constraints). Recall that any free variables in the PFCS definition are existentially quantified internal (i.e. auxiliary) variables; the theorem only concerns the parameters (i.e. the external variables). The specifics of the proof of this theorem depend on the circuit; the proof may range from very simple to very complicated. That theorem is then composed with the lifting theorem (i.e. the one generated by the PFCS lifter), to obtain a correctness theorem about the deeply embedded circuit. This composed theorem could be mechanically generated in the future, e.g. by (a companion tool of) the lifter.
Other circuits consist of families of instances, e.g. a circuit to add two integers of the same size: there is an instance for each possible bit size (perhaps within a range). In this case, we define ACL2 functions that take parameters (e.g. for the bit size) and that generate PFCS abstract syntax that depends on the parameter. That is, we define parameterized PFCS definitions. Then we manually lift them to parameterized ACL2 predicates that represent the same circuit family in shallowly embedded form; our PFCS lifter currently does not handle parameterized circuits, but we are planning to extend it to do so; meanwhile, we do the lifting by hand. Then, analogously to the case of one circuit instance discussed above, we prove a correctness theorem, as a rewrite rule, about the parameterized shallowly embedded circuit, and we compose it with the lifting theorem. Aside from the parameterization, everything is as discussed above for single-instance circuits.
We are also investigating the extension of the PFCS form to include parameterization of circuits, presumably with array variables and with constraints defined via recurrence equations over indices varying over certain ranges. If PFCSes are extended this way, then we could use the same approach for parameterized circuits as the one used for non-parameterized ones: that is, defining an ACL2 nullary function for the parameterized circuit, parsing the parameterized circuit from concrete syntax, lift it to an ACL2 predicate, and so on.
When circuits are hierarchically composed, the correctness theorems of sub-circuits are used to prove the correctness theorems of super-circuits. This keeps proofs modular and scalable: even a circuit consisting (when flattened) of a lot of constraints should have a relatively compact PFCS definition, which calls predicates for smaller circuits for which we have correctness theorems, so that the proof effort for the larger circuit is manageable. The fact that the sub-circuit correctness theorems are rewrite rules is important to facilitate the proofs for super-circuits: the calls of the sub-circuits are rewritten to their specifications, which can be used to show that the super-circuit satisfies its own specification.
An important aspect of PFCSes that enables scalability is their ability to ``hide'' auxiliary variables inside definitions. While these variables need to be taken into account when verifying the circuit of course, they can be ignored wherever the circuit is used in larger circuits. As already noted, the correctness theorem does not involve, in its formulation, the auxiliary variables. In contrast, in R1CS form, everything is visible.
Another key to scalability is the fact that the PFCS circuit definitions have concrete names and variable names. Attempting to formalize and verify R1CSes hierarchically requires the names of the variables to be parameters of the definitions (because R1CSes have no instantiation mechanisms, unlike PFCSes). These include not only the names of the input and output variables, but also of the internal variables. When hierarchically composing R1CSes thus defined, variable names must be chosen for all the sub-circuits, leading to a growing set of parameters, which does not scale well. This problem is described in our papers mentioned in circuits.
The R1CS constraints generated by snarkVM use linear combinations
whose coefficients are always non-negative integers below the prime,
i.e. they are exactly in the field (not modulo reduction).
In particular,
When we define our PFCS circuits, whether parameterized or not, we prove that they are within the structured R1CS subset, i.e. that they consist of constraints that are all in R1CS form, but that are hierarchically organized via predicates. That is, we only take advantage of the hierarchical organization of PFCSes over R1CSes, but we do not (need to) take advantage of the ability to have more general equality constraints between expressions. (The latter generality might turn out to be useful when performing transformations of PFCSes.)
Some of the constraints generated by snarkVM involve a public variable, with index 0, that must be assumed to be 1 in order for things to be correct. There may be some technical of historical reason for this, but this public variable is not really necessary to our purposes: it can be always replaced by 1. For simplicity, our PFCS circuit constructions avoid this variable. When we flatten PFCSes for comparison with the R1CSes generated by snarkVM we will also take into account the fact that the latter may have occurrences of this variable while the former do not (using directly 1 instead): it is easy to prove that the constraints are equivalent under the assumption that that public variable is 1; we will generate such proofs as part of the overall snarkVM circuit verification.
All of this is still work in progress, so things may evolve. Nonetheless, the circuits currently formalized and verified in PFCS form follow the patterns and approach described here.