• Top
    • Documentation
    • Books
    • Boolean-reasoning
    • Projects
    • Debugging
    • Std
    • Proof-automation
    • Macro-libraries
    • ACL2
    • Interfacing-tools
    • Hardware-verification
    • Software-verification
      • Kestrel-books
        • Crypto-hdwallet
        • Apt
        • Error-checking
        • Fty-extensions
        • Isar
        • Kestrel-utilities
        • Set
        • Soft
        • C
          • Syntax-for-tools
            • Disambiguator
              • Dimb-exprs/decls/stmts
              • Dimb-make/adjust-expr-binary
              • Dimb-transunit
              • Dimb-params-to-names
              • Dimb-cast/call-to-call
              • Dimb-make/adjust-expr-cast
              • Dimb-fundef
              • Dimb-make/adjust-expr-unary
              • Dimb-expr
              • Dimb-dirdeclor
              • Dimb-amb-declor/absdeclor
              • Dimb-cast/call-to-cast
              • Dimb-cast/addsub-to-cast
              • Dimb-cast/addsub-to-addsub
              • Dimb-transunit-ensemble
              • Dimb-add-ident
              • Dimb-kind
              • Dimb-declor
              • Dimb-amb-expr/tyname
              • Dimb-cast/mul-to-cast
              • Dimb-cast/and-to-cast
              • Dimb-cast/mul-to-mul
              • Dimb-cast/and-to-and
              • Dimb-kind-option
              • Dimb-type-spec
              • Dimb-extdecl-list
              • Dimb-extdecl
              • Dimb-lookup-ident
              • Dimb-param-declor
              • Dimb-decl-spec
              • Dimb-add-idents-objfun
              • Dimb-param-declon
              • Dimb-add-ident-objfun
              • Dimb-table
              • Dimb-amb-decl/stmt
              • Dimb-pop-scope
              • Dimb-push-scope
              • Dimb-initdeclor
              • Dimb-declor-option
              • Dimb-enumspec
              • Dimb-decl
              • Dimb-stmt
              • Dimb-scope
              • Dimb-structdeclor
              • Dimb-initdeclor-list
              • Dimb-decl-spec-list
              • Dimb-absdeclor
              • Dimb-init-table
              • Dimb-dirabsdeclor
              • Dimb-align-spec
              • Dimb-strunispec
              • Irr-dimb-table
              • Irr-dimb-kind
              • Dimb-spec/qual-list
              • Dimb-spec/qual
              • Dimb-param-declon-list
              • Dimb-label
              • Dimb-enumer-list
              • Dimb-enumer
              • Dimb-dirabsdeclor-option
              • Dimb-block-item
              • Dimb-structdeclor-list
              • Dimb-structdecl-list
              • Dimb-statassert
              • Dimb-desiniter-list
              • Dimb-desiniter
              • Dimb-decl-list
              • Dimb-const-expr-option
              • Dimb-absdeclor-option
              • Dimb-structdecl
              • Dimb-member-designor
              • Dimb-initer-option
              • Dimb-initer
              • Dimb-genassoc-list
              • Dimb-genassoc
              • Dimb-expr-option
              • Dimb-expr-list
              • Dimb-designor-list
              • Dimb-designor
              • Dimb-const-expr
              • Dimb-block-item-list
              • Dimb-tyname
            • Abstract-syntax
            • Parser
            • Validator
            • Printer
            • Formalized-subset
            • Mapping-to-language-definition
            • Input-files
            • Defpred
            • Output-files
            • Abstract-syntax-operations
            • Validation-information
            • Implementation-environments
            • Concrete-syntax
            • Unambiguity
            • Ascii-identifiers
            • Preprocessing
            • Abstraction-mapping
          • Atc
          • Language
          • Representation
          • Transformation-tools
          • Insertion-sort
          • Pack
        • Bv
        • Imp-language
        • Event-macros
        • Java
        • Bitcoin
        • Ethereum
        • Yul
        • Zcash
        • ACL2-programming-language
        • Prime-fields
        • Json
        • Syntheto
        • File-io-light
        • Cryptography
        • Number-theory
        • Lists-light
        • Axe
        • Builtins
        • Solidity
        • Helpers
        • Htclient
        • Typed-lists-light
        • Arithmetic-light
      • X86isa
      • Axe
      • Execloader
    • Math
    • Testing-utilities
  • Syntax-for-tools

Disambiguator

Disambiguator of the C abstract syntax for tools.

As discussed in the abstract syntax, the syntax of C is inherently syntactically ambiguous, in ways that can be disambiguated only via a (static) semantic analysis. Here we define such semantic analysis, which we call `disambiguator'. The disambiguator does not check the full static validity of the code, but it performs sufficient checks to disambiguate the abstract syntax trees generated by the parser. The disambiguator transforms the abstract syntax trees, generating unambiguous ones.

The abstract syntax trees generated by the parser contains the following ambiguities, which the disambiguator resolves as indicated:

  • Identifiers used as expressions are always classified as identifier expressions (i.e. the :ident case of expr) by the parser, but some of them may be enumeration constants. The disambiguator re-classifies the latter as required.
  • Some constructs include ambiguous expressions or type names, represented by the type amb-expr/tyname. The disambiguator turns those constructs into unambiguous ones, by choosing either an expression or a type name.
  • Some constructs include ambiguous declarators or abstract declarators, represented by the type amb-declor/absdeclor. The disambiguator turns those constructs into unambiguous ones, by choosing either a declarator or an abstract declarator.
  • Some constructs include ambiguous declarations or statements, represented by the type amb-decl/stmt. The disambiguator turns those constructs into unambiguous ones, by choosing either a declaration or a statement.
  • Function declarators whose parameters are all identifiers are always classified as parameter type lists by the parser, but some of them may be identifier lists. The disambiguator re-classifies the latter as required.
  • Some expressions may be cast expressions or binary expressions, represented by the :cast/...-ambig cases of expr. The disambiguator turns these into unambiguous case or binary expressions.
  • The initializing part of a for loop may be ambiguously a declaration or an expression followed by a semicolon, represented by the :for-ambig case of stmt. The disambiguator turns these ambiguous for loops into unambiguous ones.

The disambiguator does not perform a full (static) semantic analysis, but only a light-weight one, enough for disambiguation. If the code is statically valid, the disambiguation must work. If the code is not statically valid, the disambiguation may fail; in some cases, the disambiguator reports the cause of invalidity, but in other cases it may not have enough information.

The disambiguation is performed by scanning the code, transforming it as applicable (re-classifying or choosing constructs). This needs certain information about the identifiers in the code, which we build and use. For instance, when we encounter an enumeration specifier, we need to record the information that the identifiers of the enumerators are enumeration constants, so that, if we later encounter an expression with that identifier, we can re-classify from an identifier expression to an enumeration constant (this is the first of the ambiguities listed above). In essence, we need a symbol table of identifiers; not a full one that would be needed to check the full validity of the code, but one with sufficient information to disambiguate. We need to take into account the scoping rules of C of course, since the same identifier may have different meaning in different scopes. We call these symbol tables `disambiguation tables'.

We use error-value tuples to handle errors in the disambiguator.

The fixtype and functions in the implementation of the disambiguator are prefixed by dimb, which stands for `DIsaMBiguator'.

For now we leave the GCC extensions unchanged, i.e. we do not apply the disambiguation to them, even though they may contain expression. Some initial experiments reveal that we will need to treat these GCC extensions in a more dedicated way. For instance, the access attribute may include the arguments read_only, write_only, and read_write. Grammatically these are expressions, but it seems that they have just a special meaning, like keywords, in the context of the access attribute, but they are likely not keywords elsewhere. A naive treatment would attempt to resolve those arguments, which are not in the disambiguation table.

Subtopics

Dimb-exprs/decls/stmts
Disambiguate expressions, declarations, statements, and related artifacts.
Dimb-make/adjust-expr-binary
Build, and adjust if needed, a binary expression.
Dimb-transunit
Disambiguate a translation unit.
Dimb-params-to-names
Disambiguate a list of parameter declarations to a list of names, if appropriate.
Dimb-cast/call-to-call
Disambiguate an ambiguous cast or call expression to be a call expression.
Dimb-make/adjust-expr-cast
Build, and adjust if needed, a cast expression.
Dimb-fundef
Disambiguate a function definition.
Dimb-make/adjust-expr-unary
Build, and adjust if needed, a unary expression.
Dimb-expr
Disambiguate an expression.
Dimb-dirdeclor
Disambiguate a direct declarator.
Dimb-amb-declor/absdeclor
Disambiguate an ambiguous declarator or abstract declarator.
Dimb-cast/call-to-cast
Disambiguate an ambiguous cast or call expression to be a cast expression.
Dimb-cast/addsub-to-cast
Disambiguate an ambiguous cast or addition/subtraction expression to be a cast expression.
Dimb-cast/addsub-to-addsub
Disambiguate an ambiguous cast or addition/subtraction expression to be an addition or subtraction expression.
Dimb-transunit-ensemble
Disambiguate a translation unit ensembles.
Dimb-add-ident
Add an identifier and its kind to the disambiguation table.
Dimb-kind
Fixtype of kinds of identifiers in disambiguation tables.
Dimb-declor
Disambiguate a declarator.
Dimb-amb-expr/tyname
Disambiguate an ambiguous expression or type name.
Dimb-cast/mul-to-cast
Disambiguate an ambiguous cast or multiplication expression to be a cast expression.
Dimb-cast/and-to-cast
Disambiguate an ambiguous cast or conjunction expression to be a cast expression.
Dimb-cast/mul-to-mul
Disambiguate an ambiguous cast or multiplication expression to be a multiplication expression.
Dimb-cast/and-to-and
Disambiguate an ambiguous cast or conjunction expression to be a conjunction expression.
Dimb-kind-option
Fixtype of optional kinds of identifiers in disambiguation tables.
Dimb-type-spec
Disambiguate a type specifier.
Dimb-extdecl-list
Disambiguate a list of external declarations.
Dimb-extdecl
Disambiguate an external declaration.
Dimb-lookup-ident
Look up an identifier in the disambiguation table.
Dimb-param-declor
Disambiguate a parameter declarator.
Dimb-decl-spec
Disambiguate a declaration specifier.
Dimb-add-idents-objfun
Add all the identifiers in a list to the disambiguation table, with object or function kind.
Dimb-param-declon
Disambiguate a parameter declaration.
Dimb-add-ident-objfun
Add an identifier to the disambiguation table, with object or function kind.
Dimb-table
Fixtype of disambiguation tables.
Dimb-amb-decl/stmt
Disambiguate an ambiguous declaration or statement.
Dimb-pop-scope
Pop a scope from the disambiguation table.
Dimb-push-scope
Push a scope into the disambiguation table.
Dimb-initdeclor
Disambiguate an initializer declarator.
Dimb-declor-option
Disambiguate an optional declarator.
Dimb-enumspec
Disambiguate an enumeration specifier.
Dimb-decl
Disambiguate a declaration.
Dimb-stmt
Disambiguate a statement.
Dimb-scope
Fixtype of scopes in disambiguation tables.
Dimb-structdeclor
Disambiguate a structure declarator.
Dimb-initdeclor-list
Disambiguate a list of initializer declarators.
Dimb-decl-spec-list
Disambiguate a list of declaration specifiers.
Dimb-absdeclor
Disambiguate an abstract declarator.
Dimb-init-table
Initial disambiguation table.
Dimb-dirabsdeclor
Disambiguate a direct abstract declarator.
Dimb-align-spec
Disambiguate an alignment specifier.
Dimb-strunispec
Disambiguate a structure or union specifier.
Irr-dimb-table
An irrelevant disambiguation table.
Irr-dimb-kind
An irrelevant kind of identifiers in disambiguation tables.
Dimb-spec/qual-list
Disambiguate a list of specifiers and qualifiers.
Dimb-spec/qual
Disambiguate a specifier or qualifier.
Dimb-param-declon-list
Disambiguate a list of parameter declarations.
Dimb-label
Disambiguate a label.
Dimb-enumer-list
Disambiguate a list of enumerators.
Dimb-enumer
Disambiguate an enumerator.
Dimb-dirabsdeclor-option
Disambiguate an optional direct abstract declarator.
Dimb-block-item
Disambiguate a block item.
Dimb-structdeclor-list
Disambiguate a list of structure declarators.
Dimb-structdecl-list
Disambiguate a list of structure declarations.
Dimb-statassert
Disambiguate a static assertion declaration.
Dimb-desiniter-list
Disambiguate a list of initializers with optional designations.
Dimb-desiniter
Disambiguate an initializer with optional designations.
Dimb-decl-list
Disambiguate a list of declarations.
Dimb-const-expr-option
Disambiguate an optional constant expression.
Dimb-absdeclor-option
Disambiguate an optional abstract declarator.
Dimb-structdecl
Disambiguate a structure declaration.
Dimb-member-designor
Disambiguate a member designator.
Dimb-initer-option
Disambiguate an optional initializer.
Dimb-initer
Disambiguate an initializer.
Dimb-genassoc-list
Disambiguate a list of generic associations.
Dimb-genassoc
Disambiguate a generic association.
Dimb-expr-option
Disambiguate an optional expression.
Dimb-expr-list
Disambiguate a list of expressions.
Dimb-designor-list
Disambiguate a list of designators.
Dimb-designor
Disambiguate a designator.
Dimb-const-expr
Disambiguate a constant expression.
Dimb-block-item-list
Disambiguate a list of block items.
Dimb-tyname
Disambiguate a type name.