• Top
    • Documentation
    • Books
    • Boolean-reasoning
    • Projects
    • Debugging
    • Std
    • Proof-automation
    • Macro-libraries
    • ACL2
    • Interfacing-tools
    • Hardware-verification
    • Software-verification
      • Kestrel-books
        • Crypto-hdwallet
        • Apt
        • Error-checking
        • Fty-extensions
        • Isar
        • Kestrel-utilities
        • Set
        • Soft
        • C
        • Bv
        • Imp-language
        • Event-macros
        • Java
        • Bitcoin
        • Ethereum
        • Yul
          • Transformations
          • Language
            • Abstract-syntax
            • Dynamic-semantics
            • Concrete-syntax
              • Lexer
              • Parser
              • Grammar-old
              • Grammar
              • Tokenizer
            • Static-soundness
            • Static-semantics
            • Errors
          • Yul-json
        • Zcash
        • ACL2-programming-language
        • Prime-fields
        • Json
        • Syntheto
        • File-io-light
        • Cryptography
        • Number-theory
        • Lists-light
        • Axe
        • Builtins
        • Solidity
        • Helpers
        • Htclient
        • Typed-lists-light
        • Arithmetic-light
      • X86isa
      • Axe
      • Execloader
    • Math
    • Testing-utilities
  • Language

Concrete-syntax

Concrete syntax of Yul.

The concrete syntax is defined by an ABNF grammar based on the grammar in [Yul]. We parse the ABNF grammar into an ACL2 representation.

A complete specification of the concrete syntax of Yul would need to complement the grammar with predicates that define and relate the two levels of lexical and syntactic sub-grammars, and also restrict certain grammar rules. We plan to do that in the future.

We also provide a parser of Yul, based on our ABNF grammar. The parser is currently not verified, but it follows the grammar closely. The primary API for parsing Yul is parse-yul and parse-yul-bytes. This parser can be regarded as providing a low-level version of the predicates over the grammar mentioned in the previous paragraph.

There are currently two published grammars of Yul: one is in [Yul: Specification of Yul]; the other is part of the Solidity grammar in [Solidity: Language Grammar]. The Yul team has told us that the former is older than the latter, and that the plan is to have a separate Yul grammar along the lines of the one that is part of the Solidity grammar. For now, we formalize both grammars in ABNF, and we parse both grammars into ACL2. (The reason is somewhat accidental: we initially formalized and parsed the older grammar, after which we were told that the other grammar is newer; but since we have the old one formalized and parsed already, we like to keep it for now, along with the new one.

The old and new grammar have the following differences:

  • The old grammar allows dots in identifiers, while the new grammar does not. However, the new grammar introduces a notion of path, which is a sequence of one or more dot-separated identifiers; paths can be used as expressions and can be assigned to.
  • The old grammar includes type names, defined as identifiers, while the new grammar does not have that notion. In particular, no optional types are allowed by the new grammar for literals, declared variables, and function inputs and outputs.
  • The old grammar allows any expression as statement, while the new grammar allows only function calls.
  • The old grammar allows any expression to initialize multiple variables or to assign to multiple variables, while the new grammar allows only function calls.
  • The old grammar allows leading zeros in decimal numbers, while the new grammar disallows them.
  • The old and new grammar have somewhat different definitions of string literals. In particular, the new grammar clarifies the underlying character set, which was implicit in the old grammar due to the use of a complement and a wildcard. The old grammar only allows surrounding double quotes, while the new grammar also allows surrounding single quotes.
  • The new grammar adds hex strings to the possible literals.

Since the new grammar is newer, the rest of our Yul formalization and tools is based on it. We keep the old grammar around just for historical reasons, but we use the unqualified `grammar' to denote the new grammar, except in a few cases when talking about both new and old grammar. We qualify the old grammar with `old'.

Subtopics

Lexer
An executable lexer of Yul.
Parser
An executable parser of Yul.
Grammar-old
Old ABNF grammar of Yul.
Grammar
ABNF grammar of Yul.
Tokenizer
An executable tokenizer of Yul.