• Top
    • Documentation
      • Xdoc
        • Undocumented
          • Interp-flags
          • Simpcode
          • Npn4
          • Cutinfo
          • Glcp-config-p
          • Glmc-config-p
          • Defsvtv-args
          • Vl-renaming-alist-p
          • Vcd-pathmap-p
          • Vcd-idxhash-p
          • *atc-exec-binary-strict-pure-rules*
          • Ecut-wirename-alistp
          • Context
          • Incremental-extremize-config-p
          • Block-headerp
          • Ctrex-rule
          • Execution-environmentp
          • Glmc-fsm-p
          • Transactionp
          • Eqbylbp-config-p
          • Proof-obligation
          • Cgraph-edge
          • Wcp-instance-rule-p
          • Fgl-casesplit-config
          • Wcp-witness-rule-p
          • Obligation-hyp
          • Vl-parsestate
          • Ecutnames-p
          • Te-args
          • Fgl-satlink-monolithic-sat-config
          • Fn-info-elt-p
          • Wcp-template-p
          • Truth-idx
          • Polarity4
          • Phase-fsm-params
          • Propiso-info-p
          • Cutscore
          • Vl-rhs
          • Truth6
          • Constraint-tuple
          • Truth5
          • Truth4
          • Truth3
          • Svexl-node
          • Ringosc3
          • Prof-entry-p
          • Constraint-tuple-p
          • Fgl-rune
          • Stv-spec-p
          • Run-snippet-file-info
          • Svex-scc-consts
          • Fsm
          • Machine-statep
          • Xor-signature-p
          • Run-snippet-info
          • Svtv-precompose-data
          • Pipeline-setup
          • Maybe-svar-p
          • Svl-module
          • Maybe-fgl-generic-rule
          • Frames
          • Maybe-snippet-mismatch
          • Cpuid-info-p
          • Vl-warningtree
          • Maybe-proof-obligation
          • Svtv-override-check
          • Maybe-svtv-chase-evaldata
          • Fty-info
          • Prof-entry
          • Substatep
          • Vl-maybe-rhs
          • Vl-parsed-ports
          • Svex-context
          • Account-statep
          • Maybe-simpcode
          • Maybe-rational
          • Maybe-svex
          • Flatten-res
          • Svex-reduce-config
          • Fty-type
          • Rewrite
          • Snippet-info
          • Addnames-indices
          • Svtv*-phase
          • Segment-driver
          • Satlink-parser-state
          • Fgl-binder-rune
          • Congruence-rule
          • Snippet-mismatch
          • Vl-ctxexpr
          • Rsh-of-concat-table
          • Chase-position
          • Hyp-tuple-p
          • Glcp-obj-ctrex-p
          • Glcp-bit-ctrex-p
          • Block$-p
          • Boundrw-subst-p
          • Svtv-cyclephase
          • Svar-split
          • Scopetree
          • Flatnorm-res
          • Fgl-rule
          • Constraint-rule
          • Vl-user-paramsetting
          • Svtv-override-triple
          • Svtv-evaldata
          • Svtv-composedata
          • Width-of-svex-extn
          • Array-fieldinfo-p
          • Wcp-example-app-p
          • Svtv*-input
          • Svex-override-triple
          • Svar-override-triple
          • 3col4vecline
          • Constraint-rule-p
          • Vl-parsed-ports
          • Svtv-chase-evaldata
          • Svtv-assigns-override-config
          • Svex/index
          • Svl-env
          • Svl-aliasdb
          • Inverter
          • Fgl-binder-rule
          • Bvar-db-consistency-error
          • Wcp-lit-actions-p
          • Vcd-multivector-p
          • Svtv-fsm
          • Svexl
          • Constraint-instance
          • Vl-parsestate
          • Partsum-comp
          • Classname/params
          • Vcd-vector-p
          • Tmp-occ
          • Svl-occ
          • Svexllist
          • Svexl-alist
          • Integerp-of-svex-extn
          • Uninterpreted
          • Rw-pair
          • Vl-echar-raw
          • Vl-echar-raw
            • Vl-echar-p
              • Make-vl-echar-fast
              • Vl-echar->char
              • Vl-echar
              • Vl-echar->loc
              • Vl-echarpack-p
            • Vl-echar-fix
            • Vl-echar-raw-equiv
            • Make-vl-echar-raw
            • Vl-echar-raw->filename
            • Vl-echar-raw->pack
            • Change-vl-echar-raw
          • Use-set
          • Svtv-probe
          • Svex-phase-varname
          • Svex-cycle-varname
          • Range
          • Phase-fsm-config
          • Overridekey-syntaxcheck-data
          • Constraint
          • Alias
          • Sig
          • Sandwich
          • Cgraph-derivstate
          • Candidate-assign
          • Svex-alist-eval-equiv!
          • Svex-alist-eval-equiv
          • Scalar-fieldinfo-p
          • Fe-list-listp
          • Fgl-ev-congruence-rulelist-correct-p
          • N-outputs-dom-supergates-sweep-config
          • Svex-envlists-equivalent
          • Svex-alistlist-eval-equiv
          • Flatnorm-setup
          • G-map-tag
          • Obs-sdom-array
          • Dom-supergates-sweep-config
          • Truth4arr
          • Npn4arr
          • Syndef::acid4
          • Svex-envlists-similar
          • Svex-alist-compose-equiv
          • Sym-prod
          • Fgl-congruence-rune
          • Svex-envs-1mask-equiv
          • U32arr
          • Litarr
          • Aigtrans
          • Svexlist-eval-equiv
          • N-outputs-unreachability-config
          • Svex-eval-equiv
          • Unreachability-config
          • Keys-equiv
          • Vl-maybe-exprtype-list-p
          • Svex-alistlist
          • Svar-overridetype-p
          • Vcd-indexlist-p
          • Alternative-spec-listp
          • Variable-listp
          • Vcd-vectorlist-p
          • Vcd-multivectorlist-p
          • Field-spec-listp
          • Neteval-ordering
          • Prof-entrylist-p
          • Hyp-tuplelist-p
          • Glcp-obj-ctrexlist-p
          • Symbol-path-list-p
          • Pseudo-input-listp
          • Input-listp
          • Ecutname-list-p
          • Boundrw-substlist-p
          • Constraintlist
          • Fgl-object-bindings
          • Fgl-generic-rule
          • 4v-equiv
          • Sdm-instruction-table
          • Svexl-node-array
          • Sig-path
          • Func-alist
          • Fgl-generic-rune
          • Fgl-ev-iff-equiv
          • Pseudo-term-subst
          • Nth-lit-equiv
          • Frames-equiv
          • Eval-formula-equiv
          • Svex-s4env
          • Svex-env-keys-equiv
          • Svex-alist-keys-equiv
          • Pseudo-term-alist
          • Fgl-ev-equiv
          • Lits-equiv
          • Nth-nat-equiv
          • Faig-const-equiv
          • Bdd-equiv
          • Byte-list
          • Vl-user-paramsettings
          • Classname/params-unparam-map
          • Obligation-hyp-list
          • Svtv*-output-alist
          • Svtv*-input-alist
          • Svtv-probealist
          • Svtv-override-triplemap
          • Svtv-cyclephaselist
          • Svex/index-maybenat-alist
          • Svex-context-alist
          • Segment-driver-map
          • Rangemap
          • Fnsym-svexlistlist-alist
          • Tmp-occ-alist
          • Svl-module-alist
          • Svexl-node-alist
          • Occ-name-alist
          • Integerp-of-svex-extn-list
          • Alias-alist
          • Special-char-alist
          • Fty-info-alist
          • Bfr-updates
          • Term-equivs
          • Term-bvars
          • Sig-table
          • Fn-indices
          • Fgl-function-mode-alist
          • Ctrex-ruletable
          • Constraint-db
          • Congruence-rule-table
          • Cgraph-derivstates
          • Cgraph-alist
          • Cgraph
          • Casesplit-alist
          • Truthmap
          • Named-lit-list-map
          • Axi-map
          • Nth-equiv
          • Faig-fix-equiv
          • Vl-string/int-alist
          • Vl-reservedtable
          • Vl-echarlist
          • Vl-ctxexprlist
          • Vl-coredatatype-infolist
          • Vl-usertypes
          • Vl-coredatatype-infolist
          • Proof-obligation-list
          • Use-set-summaries
          • Svtv-rev-probealist
          • Svex/index-nat-alist
          • Svex/index-key-alist
          • Svex-key-alist
          • Svex-envlist
          • Svar-widths
          • Svar-width-map
          • Svar-splittab
          • Svar-proplist
          • Svar-key-alist
          • Rsh-of-concat-alist
          • Path-alist
          • Name-alist
          • Address-alist
          • Width-of-svex-extn-list
          • Svl-occ-alist
          • Svex-to-natp-alist
          • Symbol-string-alist
          • Symbol-integer-alist
          • Sym-nat-alist
          • String-string-alist
          • Fty-types
          • Fty-field-alist
          • Any-table
          • Obj-alist
          • Nat-nat-alist
          • Constraint-instancelist
          • Congruence-rulelist
          • Calist
          • Bvar-db-consistency-errorlist
          • Var-counts-alist
          • Pseudo-var-list
          • Equiv-contextslist
          • Nat-val-alistp
          • Id-neg-alist
          • String-keyed-alist-p
          • Snippet-table
          • Snippet-mismatch-list
          • Vl-reportcardkeylist
          • Partsumlist
          • Partsum-elt
          • Classname/paramslist
          • Vl-reportcardkeylist
          • Vl-locationlist
          • Vl-echarlist
          • Perm4-list
          • Svtv*-phaselist
          • Svtv-override-triplemaplist
          • Svtv-override-triplelist
          • Svtv-override-checklist
          • Svtv-name-lhs-map-list
          • Svex/indexlist
          • Svexlistlist
          • Svex-override-triplelist
          • Svex-contextlist
          • Svarlist-list
          • Svar-widthslist
          • Svar-overridetypelist
          • Svar-override-triplelist
          • Segment-driverlist
          • Rangelist
          • Chase-stack
          • Addresslist
          • 4veclistlist
          • 3col4vecs
          • Occ-name-list
          • Alias-lst
          • Word-list
          • Sig-path-list
          • Function-option-name-lst
          • Any-trace
          • Bfr-varnamelist
          • Aig-varlist
          • Scratch-nontagidxlist
          • Prof-entrylist
          • Interp-st-field-p
          • Fgl-runelist
          • Fgl-rulelist
          • Fgl-object-bindingslist
          • Fgl-congruence-runelist
          • Fgl-binder-runelist
          • Fgl-binder-rulelist
          • Ctrex-rulelist
          • Constraint-tuplelist
          • Cgraph-edgelist
          • Candidate-assigns
          • Rw-pairlist
          • Rewritelist
          • Equiv-contexts
          • Bindinglist
          • Pos-list
          • Obs-sdom-info-list
          • Cutinfolist
          • Bit-list
          • Axi-termlist
          • Axi-litlist
          • Symbol-pseudoterm-alist
          • Symbol-pseudoeventform-alist
          • Inst-list
          • Vl-user-paramsettings-mode-p
          • Svtv-data$c-field-p
          • Ctrex-ruletype-p
          • String-stringlist-alist
          • Ipasir-status-p
          • Fgl-toplevel-sat-check-mode-p
          • Axi-op-p
          • Vl-opacity-p
          • St-hyp-method-p
          • Scratchobj-kind-p
          • Logicman-field-p
          • Bvar-db$c-field-p
          • Env$-field-p
          • Axi
          • *smt-architecture*
          • *vl-directions-kwds*
          • *vl-directions-kwd-alist*
          • Rlp-trees
        • Save
        • Defsection
        • Markup
        • Preprocessor
        • Terminal
        • Emacs-links
        • Defxdoc
        • Katex-integration
        • Constructors
        • Entities
        • Defxdoc+
        • Save-rendered
        • Add-resource-directory
        • Testing
        • Order-subtopics
        • Save-rendered-event
        • Archive-matching-topics
        • Archive-xdoc
        • Xdoc-extend
        • Set-default-parents
        • Missing-parents
        • Defpointer
        • Defxdoc-raw
        • Xdoc-tests
        • Xdoc-prepend
        • Defsection-progn
        • Gen-xdoc-for-file
      • ACL2-doc
      • Recursion-and-induction
      • Loop$-primer
      • Operational-semantics
      • Pointers
      • Doc
      • Documentation-copyright
      • Course-materials
      • Args
      • ACL2-doc-summary
      • Finding-documentation
      • Broken-link
      • Doc-terminal-test-2
      • Doc-terminal-test-1
    • Books
    • Boolean-reasoning
    • Projects
    • Debugging
    • Std
    • Proof-automation
    • Macro-libraries
    • ACL2
    • Interfacing-tools
    • Hardware-verification
    • Software-verification
    • Math
    • Testing-utilities
  • Extended-characters
  • Vl-echar-raw

Vl-echar-p

Representation of a single extended character.

Historically, a vl-echar-p was an ordinary aggregate with a character and a location. This was nice and simple, but required a lot of memory. Here is a back-of-the-napkin analysis, where the underlying cons-tree representation of each echar is understood as:

vl-echar ::=  (char . (:vl-location . (filename . (line . col)))))

Assume we need no extra overhead to represent the filename, line, or column. This is fair: typically whole giant sets of echars all have their filename pointing to the same string, so we don't need extra memory for the file name. Furthermore, the line and column number are always fixnums in practice, i.e., they are immediates that don't take any extra space. Then, the memory required for each echar is:

4 conses = 128 bits * 4 = 512 bits = 64 bytes

But since echars generally go in a list, we usually also need 1 extra cons per character to join it to the rest of the list. So, the total overhead just for echars is really more like 80 bytes. In short, to read a file with N bytes in it we need 80N bytes of memory, so if we want to process a 100 MB Verilog file we need 8 GB of space! (It's actually worse than this, because that's just the cost of reading the characters in the first place. After that we have to preprocess them, which is basically an echarlist-to-echarlist transformation. Preprocessing doesn't need to deeply copy the echars themselves, but it is still going to deeply copy the list, which means an extra 16 bytes of overhead per character. So we're up to 9.6 GB for a 100 MB file before reaching a good place where we can garbage collect.

To reduce this overhead, we now use a more efficient encoding scheme.

Encoding Scheme

We will use a simple encoding that allows us to represent almost any practical echar as a single cons of an immediate onto a filename pointer. We will assume we can represent any unsigned 60-bit number as a fixnum, which is true in 64-bit CCL. This seems like plenty of space. We divide it up, rather arbitrarily, as follows:

  • 8-bit character code, so we can represent any character code
  • 30-bit line number, so our maximum line number is ~1 billion
  • 22-bit column number, so our maximum column number is ~4 million

It is hard to imagine hitting these limits in practice, but as a fallback we will simply allow any characters from locations outside this range to be represented as cons structures with line, column, and character code components. This is no worse than our former representation, and means that the interface for constructing echars can be kept simple and bounds-free.

Function: vl-echar-p

(defun vl-echar-p (x)
  (declare (xargs :guard t))
  (let ((__function__ 'vl-echar-p))
    (declare (ignorable __function__))
    (and (std::prod-consp x)
         (b* ((filename (std::prod-car x))
              (pack (std::prod-cdr x)))
           (and (stringp filename)
                (vl-echarpack-p pack))))))

Subtopics

Make-vl-echar-fast
Fast creation of extended characters that bypasses constructing vl-location objects.
Vl-echar->char
High-level accessor: get the character from an vl-echar-p.
Vl-echar
High-level constructor for an vl-echar-p.
Vl-echar->loc
High-level accessor: get the location from an vl-echar-p.
Vl-echarpack-p
Packed up LINE, COL, and CHAR-CODE in a single fixnum.