Mmp-encode-n/c
Definition of n [YP:(193)] and c [YP:(194)].
mmp-encode-n corresponds to n,
while mmp-encode-c corresponds to c.
mmp-encode-u corresponds to the ``local function'' u
used in the third case of the definition of c:
since there are always exactly 16 calls to this function,
we could have avoided introducing mmp-encode-u
and just included 16 calls of mmp-encode-n
in the definition of mmp-encode-c,
but instead we just include 1 call of mmp-encode-u,
which recursively performs the 16 iterations itself.
Introducing mmp-encode-u seems clearer and more elegant
than including 16 calls of mmp-encode-n.
Here \mathfrak{I} is the argument map of all these functions,
i is the argument i of all these functions,
and j in the third case of the definition of c
is the argument nibble-counter of mmp-encode-u.
In [YP], n and c just return byte arrays,
which are the ``roots'' of the MMP trees.
However, their computation relies on the database:
every call of Keccak-256 implicitly adds a pair to the database:
the pair associates the result of the call of Keccak-256 (key)
to the argument of the call of Keccak-256 (value).
This is implicit in [YP],
which mentions `prescience of the byte array' just before [YP:(193)].
In our formalization, we explicate the database,
by having
mmp-encode-n, mmp-encode-c, and mmp-encode-u
all return a database, which they construct.
This returned database contains all and only the pairs
necessary to encode map.
Since we cannot ignore the mathematical possibility of hash collisions,
mmp-encode-n, mmp-encode-c, and mmp-encode-u
all return an error flag that, when equal to :collision,
indicates that a collision in the constructed database took place.
In practice, we do not expect to ever see this flag be :collision,
but the mathematical possibility remains.
Databases constructed by sub-computations are merged,
when no collisions occur (i.e. when the two databases are compatible maps).
If a collision occurs,
nil is returned as both root and database;
these values are irrelevant.
Note that a collision happens if not only a key is already in the database,
but also the associated value in the database differs
from the one that is to be added to the database;
this check is performed in mmp-encode-n,
the only function that directly extends the database
(the other functions merge databases from sub-computations).
If the key is already in the database but the value does not differ,
then there is no collision, and no change to the database;
if we view hashes as implementation-independent pointers
(which is a way to view MMP trees),
this situation just corresponding to two aliased pointers.
Since c involves RLP encoding,
and since RLP encoding may fail (see rlp-encode-tree),
the error flag returned by our functions
is :rlp when an RLP encoding fails.
In this case, nil is returned as both root and database;
these values are irrelevant.
If no hash collisions occur and all RLP encodings succeed,
the error flag returned by our functions is nil,
indicating no error.
The definition of mmp-encode-n follows
the definition of n [YP:(193)] rather closely,
with the additional propagation of errors
from the recursive call of mmp-encode-c,
the additional propagation of the database from mmp-encode-c,
and the additional extension of the database mentioned above.
The definition of mmp-encode-c follows
the definition of c [YP:(194)] also rather closely.
However, since all three cases of the definition of c
apply RLP encoding at the end,
we factor that out of the three cases.
First, mmp-encode-c calculates
an error flag, an RLP tree, and a database according to the three cases.
Then, mmp-encode-c RLP encodes the tree.
If an error occurs prior to RLP encoding,
the ``inner'' computation of mmp-encode-c returns
an RLP tree consisting of an empty byte array and an empty database,
but these values are irrelevant.
Because of this factoring,
and because some tests relevant to termination (see section below)
appear in this factored code,
we need to extend the ACL2::rulers of mmp-encode-c
with :lambdas.
The first case of the definition of c
includes an existential quantifier for I
whose scope does not actually extend
to the use of I in \mathtt{RLP}(\ldots).
In our definition we do not use an existential quantifier:
we test whether the size of the map is 1
and then we extract its only key and value if the test succeeds.
We use mmp-encode-c-max to calculate the variable j
(which is the variable j in mmp-encode-c)
in the second case of the definition of c.
As discussed in mmp-encode-c-max,
the map must be non-empty for this j to be well-defined.
Thus, we add this non-emptiness condition
to the guard of mmp-encode-c.
Note that the scope of the universal quantifier
in the second case of the definition of c
does not actually extend
to the use of I in \mathtt{RLP}(\ldots):
in this expression, I may be any pair in the map,
because by the definition of j we know that
all the keys in the map have the same nibbles
at indices i through j-1.
In our definition, we pick the first key in the map.
Note that the j in the third case of the definition of c
has no relationship to the j
in the second case of the definition of c.
As stated above, the j in the third case
is the argument nibble-counter of mmp-encode-u;
this function corresponds to
the ``local function'' u
in the third case of the definition of c.
The nibble counter argument is initialized to 0
by mmp-encode-c, when it calls mmp-encode-u.
The latter function returns,
essentially (but see below for more precision), u(0),\ldots,u(15),
along with an error flag
and with the database resulting from all the 16 computations.
mmp-encode-u loops through nibble 0 to 15,
extracting submaps and recursively calling mmp-encode-n on them.
Errors are propagated and databases are merged, checking for collisions.
mmp-encode-u does not quite return
the list of roots u(0),\ldots,u(15):
more precisely, it returns a list of RLP leaf trees
containing those byte arrays;
the byte arrays and the RLP leaf trees are isomorphic,
but the calling mmp-encode-c can more easily assemble the trees
into an RLP (branching) tree.
The variable v in the third case of the definition of c
is the value associated to the key of length i in the map,
if it exists, in which case it is unique,
because it is an invariant that all the keys
have the same first i nibbles (see below).
We pick the first key in the map (but any other would do),
we take its first i nibbles,
and we see if the resulting key has an associated value in the map.
Note that we do not need
the existential quantifier in the third case of the definition of c,
whose scope does not extend to
the expression I_1 that defines v anyways.
It is not clear whether ()
in the second case of the definition of v
in the third case of the definition of c
is the empty byte array in \mathbb{B} [YP:(178)]
or the empty tuple in \mathbb{L} [YP:(177)].
The empty byte array is consistent with the fact that
the I_1 in the first case of the definition of v
is also a byte array.
However, in this case maps \mathfrak{I} some of whose keys
have the empty byte array associated as value
cannot be unambiguously encoded as MMP trees,
because we would be unable to distinguish
between the two cases in the definition of v.
This ambiguity does not exist
if instead the () denotes the empty tuple;
note that the first paragraph of [YP:D] mentions
maps between arbitrary-length byte arrays
(suggesting that zero lengths are allowed),
and that in the third case of the definition of c,
the argument tuple of \mathtt{RLP},
whose last component is v,
is well-formed whether v is a byte array or a tuple.
So for now we interpret the () as the empty tuple.
Termination
The termination of
mmp-encode-n, mmp-encode-c, and mmp-encode-u
is proved via a lexicographic measure
that consists of three components.
This termination proof relies on
some invariants satisfied by the arguments of these functions.
These invariants are added as guards,
and tested with mbt at the beginning of the functions.
An invariant that applies to all three functions is that
all the keys in the map have the same first i nibbles:
this is expressed by
mmp-encode-c-max.elementp.
An invariant that applies to mmp-encode-c and mmp-encode-u
is that the map is not empty.
An invariant that applies to mmp-encode-u is that
the map is not a singleton.
mmp-encode-n calls mmp-encode-c
with the same arguments map and i,
and therefore the measure must involve the fact that
mmp-encode-c is ``smaller'' than mmp-encode-c.
Similarly, mmp-encode-c calls mmp-encode-u
with the same arguments map and i,
and therefore the measure must involve the fact that
mmp-encode-u is ``smaller'' than mmp-encode-u.
Thus, we order these functions by assigning 0, 1, and 2 to them:
this is the second component of the lexicographic measure.
Since mmp-encode-u calls mmp-encode-n
and mmp-encode-c calls mmp-encode-n,
the relative ordering of these functions cannot be
the first component of the lexicographic measure.
Instead, the first component involves map and i.
In the calls just mentioned,
the recursion makes progress because i strictly increases:
it becomes either (1+ i) or j.
Before calling mmp-encode-n,
mmp-encode-c checks that j is not i.
The reason why it j is larger than i in this case
is the aforementioned invariant about i and the definition of j:
all the keys in the map have the same i nibbles,
and j is, by definition, the maximum length of the common prefix
of the keys in the map:
thus, j is greater than or equal to i,
and since it is different, it is larger.
The increase of i in the recursive calls is bounded by
nibblelist-bytelist-map-sup-len-key:
the first component of the lexicographic measure is
the difference between that and i.
When mmp-encode-c calls mmp-encode-n,
nibblelist-bytelist-map-sup-len-key remains the same;
since it is greater than or equal to j,
and i is strictly less than j,
the first component of the measure strictly decreases.
When mmp-encode-u calls mmp-encode-n,
nibblelist-bytelist-map-sup-len-key
either stays the same or becomes smaller,
as proved in mmp-encode-u-map,
because we are calling mmp-encode-n on a submap of the map;
to show that the first component of the measure strictly decreases
as i becomes (1+ i),
we use the fact that i is strictly less than
nibblelist-bytelist-map-sup-len-key,
as proved in mmp-encode-c-max,
given that the map has at least two elements,
as asserted by the invariant of mmp-encode-u.
Guard Verification
The guard verification of
mmp-encode-n, mmp-encode-c, and mmp-encode-u
involves proving the preservation, in the recursive calls,
of the invariants used for the termination proof, mentioned above.
This is easy for most recursive calls,
where map and i do not change.
In the call of mmp-encode-n from mmp-encode-c,
map stays the same, while i changes to the maximum j;
since in this case the maximum exists,
it is in the set.
In the call of mmp-encode-n from mmp-encode-u,
the map becomes the submap and i becomes (1+ i).
We start with the fact that
all the keys in the map have the same first i nibbles,
By construction,
the submap has keys with the same nibble at position i.
Therefore, all the keys in the submap has the first (1+ i) nibbles,
and thus (1+ i) is in the set.
Definitions and Theorems
Function: mmp-encode-n
(defun mmp-encode-n (map i)
(declare (xargs :guard (and (nibblelist-bytelist-mapp map)
(natp i))))
(declare (xargs :guard (mmp-encode-c-max.elementp map i)))
(b* (((unless (mbt (mmp-encode-c-max.elementp map i)))
(mv nil nil nil))
((when (or (not (mbt (nibblelist-bytelist-mapp map)))
(omap::emptyp map)))
(mv nil nil nil))
((mv c-error? c-root c-database)
(mmp-encode-c map i))
((when c-error?) (mv c-error? nil nil))
((when (< (len c-root) 32))
(mv nil c-root c-database))
(hash (keccak-256-bytes c-root))
(pair? (omap::assoc hash c-database))
(collisionp (and pair? (not (equal (cdr pair?) c-root))))
((when collisionp)
(mv :collision nil nil))
(database (omap::update hash c-root c-database)))
(mv nil hash database)))
Function: mmp-encode-c
(defun mmp-encode-c (map i)
(declare (xargs :guard (and (nibblelist-bytelist-mapp map)
(natp i))))
(declare (xargs :guard (and (mmp-encode-c-max.elementp map i)
(not (omap::emptyp map)))))
(b*
(((unless (mbt (and (mmp-encode-c-max.elementp map i)
(nibblelist-bytelist-mapp map)
(not (omap::emptyp map)))))
(mv nil nil nil))
((mv error? rlp-tree database)
(b* (((when (= (omap::size map) 1))
(b* (((mv key val) (omap::head map))
(key-rest (nthcdr i key)))
(mv nil
(rlp-tree-branch
(list (rlp-tree-leaf (hp-encode key-rest t))
(rlp-tree-leaf val)))
nil)))
(j (mmp-encode-c-max map))
((when (/= (nfix i) j))
(b* (((mv n-error? n-root n-database)
(mmp-encode-n map j))
((when n-error?)
(mv n-error? (rlp-tree-leaf nil) nil))
((mv any-key &) (omap::head map))
(key-part (nthcdr i (take j any-key))))
(mv nil
(rlp-tree-branch
(list (rlp-tree-leaf (hp-encode key-part nil))
(rlp-tree-leaf n-root)))
n-database)))
((mv u-error? u-trees u-database)
(mmp-encode-u map i 0))
((when u-error?)
(mv u-error? (rlp-tree-leaf nil) nil))
((mv any-key &) (omap::head map))
(key-prefix (take i any-key))
(pair (omap::assoc key-prefix map))
(v (and pair (cdr pair)))
(v-tree (if v (rlp-tree-leaf v)
(rlp-tree-branch nil))))
(mv nil
(rlp-tree-branch (rcons v-tree u-trees))
u-database)))
((when error?) (mv error? nil nil))
((mv rlp-error? rlp-encoding)
(rlp-encode-tree rlp-tree))
((when rlp-error?) (mv :rlp nil nil)))
(mv nil rlp-encoding database)))
Function: mmp-encode-u
(defun mmp-encode-u (map i nibble-counter)
(declare (xargs :guard (and (nibblelist-bytelist-mapp map)
(natp i)
(natp nibble-counter))))
(declare (xargs :guard (and (mmp-encode-c-max.elementp map i)
(not (omap::emptyp map))
(not (equal (omap::size map) 1)))))
(b* (((unless (mbt (and (mmp-encode-c-max.elementp map i)
(nibblelist-bytelist-mapp map)
(not (omap::emptyp map))
(not (equal (omap::size map) 1)))))
(mv nil nil nil))
((when (> (nfix nibble-counter) 15))
(mv nil nil nil))
(submap (mmp-encode-u-map map i nibble-counter))
((mv n-error? n-root n-database)
(mmp-encode-n submap (1+ (nfix i))))
((when n-error?) (mv n-error? nil nil))
((mv u-error? u-trees u-database)
(mmp-encode-u map i (1+ (nfix nibble-counter))))
((when u-error?) (mv u-error? nil nil))
((unless (omap::compatiblep n-database u-database))
(mv :collision nil nil))
(trees (cons (rlp-tree-leaf n-root) u-trees))
(database (omap::update* n-database u-database)))
(mv nil trees database)))
Theorem: return-type-of-mmp-encode-n.error?
(defthm return-type-of-mmp-encode-n.error?
(b* (((mv ?error? ?root ?database)
(mmp-encode-n map i)))
(member-eq error? '(nil :collision :rlp)))
:rule-classes :rewrite)
Theorem: return-type-of-mmp-encode-n.root
(defthm return-type-of-mmp-encode-n.root
(b* (((mv ?error? ?root ?database)
(mmp-encode-n map i)))
(byte-listp root))
:rule-classes :rewrite)
Theorem: return-type-of-mmp-encode-n.database
(defthm return-type-of-mmp-encode-n.database
(b* (((mv ?error? ?root ?database)
(mmp-encode-n map i)))
(databasep database))
:rule-classes :rewrite)
Theorem: return-type-of-mmp-encode-c.error?
(defthm return-type-of-mmp-encode-c.error?
(b* (((mv ?error? ?root ?database)
(mmp-encode-c map i)))
(member-eq error? '(nil :collision :rlp)))
:rule-classes :rewrite)
Theorem: return-type-of-mmp-encode-c.root
(defthm return-type-of-mmp-encode-c.root
(b* (((mv ?error? ?root ?database)
(mmp-encode-c map i)))
(byte-listp root))
:rule-classes :rewrite)
Theorem: return-type-of-mmp-encode-c.database
(defthm return-type-of-mmp-encode-c.database
(b* (((mv ?error? ?root ?database)
(mmp-encode-c map i)))
(databasep database))
:rule-classes :rewrite)
Theorem: return-type-of-mmp-encode-u.error?
(defthm return-type-of-mmp-encode-u.error?
(b* (((mv ?error? ?trees ?database)
(mmp-encode-u map i nibble-counter)))
(member-eq error? '(nil :collision :rlp)))
:rule-classes :rewrite)
Theorem: return-type-of-mmp-encode-u.trees
(defthm return-type-of-mmp-encode-u.trees
(b* (((mv ?error? ?trees ?database)
(mmp-encode-u map i nibble-counter)))
(rlp-tree-listp trees))
:rule-classes :rewrite)
Theorem: return-type-of-mmp-encode-u.database
(defthm return-type-of-mmp-encode-u.database
(b* (((mv ?error? ?trees ?database)
(mmp-encode-u map i nibble-counter)))
(databasep database))
:rule-classes :rewrite)
Theorem: mmp-encode-n-of-nibblelist-bytelist-mfix-map
(defthm mmp-encode-n-of-nibblelist-bytelist-mfix-map
(equal (mmp-encode-n (nibblelist-bytelist-mfix map)
i)
(mmp-encode-n map i)))
Theorem: mmp-encode-n-of-nfix-i
(defthm mmp-encode-n-of-nfix-i
(equal (mmp-encode-n map (nfix i))
(mmp-encode-n map i)))
Theorem: mmp-encode-c-of-nibblelist-bytelist-mfix-map
(defthm mmp-encode-c-of-nibblelist-bytelist-mfix-map
(equal (mmp-encode-c (nibblelist-bytelist-mfix map)
i)
(mmp-encode-c map i)))
Theorem: mmp-encode-c-of-nfix-i
(defthm mmp-encode-c-of-nfix-i
(equal (mmp-encode-c map (nfix i))
(mmp-encode-c map i)))
Theorem: mmp-encode-u-of-nibblelist-bytelist-mfix-map
(defthm mmp-encode-u-of-nibblelist-bytelist-mfix-map
(equal (mmp-encode-u (nibblelist-bytelist-mfix map)
i nibble-counter)
(mmp-encode-u map i nibble-counter)))
Theorem: mmp-encode-u-of-nfix-i
(defthm mmp-encode-u-of-nfix-i
(equal (mmp-encode-u map (nfix i)
nibble-counter)
(mmp-encode-u map i nibble-counter)))
Theorem: mmp-encode-u-of-nfix-nibble-counter
(defthm mmp-encode-u-of-nfix-nibble-counter
(equal (mmp-encode-u map i (nfix nibble-counter))
(mmp-encode-u map i nibble-counter)))
Theorem: mmp-encode-n-nibblelist-bytelist-mequiv-congruence-on-map
(defthm mmp-encode-n-nibblelist-bytelist-mequiv-congruence-on-map
(implies (nibblelist-bytelist-mequiv map map-equiv)
(equal (mmp-encode-n map i)
(mmp-encode-n map-equiv i)))
:rule-classes :congruence)
Theorem: mmp-encode-n-nat-equiv-congruence-on-i
(defthm mmp-encode-n-nat-equiv-congruence-on-i
(implies (acl2::nat-equiv i i-equiv)
(equal (mmp-encode-n map i)
(mmp-encode-n map i-equiv)))
:rule-classes :congruence)
Theorem: mmp-encode-c-nibblelist-bytelist-mequiv-congruence-on-map
(defthm mmp-encode-c-nibblelist-bytelist-mequiv-congruence-on-map
(implies (nibblelist-bytelist-mequiv map map-equiv)
(equal (mmp-encode-c map i)
(mmp-encode-c map-equiv i)))
:rule-classes :congruence)
Theorem: mmp-encode-c-nat-equiv-congruence-on-i
(defthm mmp-encode-c-nat-equiv-congruence-on-i
(implies (acl2::nat-equiv i i-equiv)
(equal (mmp-encode-c map i)
(mmp-encode-c map i-equiv)))
:rule-classes :congruence)
Theorem: mmp-encode-u-nibblelist-bytelist-mequiv-congruence-on-map
(defthm mmp-encode-u-nibblelist-bytelist-mequiv-congruence-on-map
(implies (nibblelist-bytelist-mequiv map map-equiv)
(equal (mmp-encode-u map i nibble-counter)
(mmp-encode-u map-equiv i nibble-counter)))
:rule-classes :congruence)
Theorem: mmp-encode-u-nat-equiv-congruence-on-i
(defthm mmp-encode-u-nat-equiv-congruence-on-i
(implies (acl2::nat-equiv i i-equiv)
(equal (mmp-encode-u map i nibble-counter)
(mmp-encode-u map i-equiv nibble-counter)))
:rule-classes :congruence)
Theorem: mmp-encode-u-nat-equiv-congruence-on-nibble-counter
(defthm mmp-encode-u-nat-equiv-congruence-on-nibble-counter
(implies (acl2::nat-equiv nibble-counter nibble-counter-equiv)
(equal (mmp-encode-u map i nibble-counter)
(mmp-encode-u map i nibble-counter-equiv)))
:rule-classes :congruence)
Subtopics
- Mmp-encode-c
- Mmp-encode-u
- Mmp-encode-n