CS 394P: Pattern Matcher Documentation

The pattern matcher in the files patmatch.lsp and patm.lsp transforms an input expression repeatedly using patterns until a fixed-point is reached (i.e., no further transforms can be applied).

Transformation is accomplished by calling the function trans:

   (trans input 'kind)
where input is the input expression and kind is the kind of patterns to be used. Having several kinds of patterns allows patterns to be used for different purposes, e.g. code expansion, optimization, and language translation. The result of trans is the transformed expression.

Patterns are defined by calling (redefpatterns kind transforms):

   (redefpatterns 'opt
     '( ((+ 0 ?x)      ?x)        ...      ))

Each transform is a list of an input pattern and an output pattern, with optional extra elements. Variables are denoted by symbols that begin with a question mark, e.g. ?x .

Dot notation may be used to match the remainder of a list, e.g. the transform:

   ((progn nil . ?s)     (progn . ?s))
will get rid of a nil at the beginning of a progn form; the variable ?s will be bound to the remainder of the forms following the nil, i.e. a list of zero or more elements.

Partial evaluation is performed on many subexpressions whose operands are constant. For example, the transform:

   ((+ ?n (+ ?m ?x))     (+ (+ ?n ?m) ?x))
where ?n and ?m are numeric constants, rearranges its input to group the two constants together, which will cause them to be folded into a single constant.

In some cases, it is necessary to make some tests before deciding whether to apply a transformation. The optional third element of a pattern is a test predicate; the pattern is only applied if the test is satisfied:

   ((+ ?n (+ ?m ?x))   (+ (+ ?n ?m) ?x))   (and (numberp ?n) (numberp ?m)) )

Some transformations require computations to be done on input variables. This is done by defining new variables (not used in the input pattern) as computed values derived from input variables. These computations are specified in a list as the fourth element of a transform:

   ( (?newvar value) ...)
where ?newvar is a new variable that receives the computed value and value is a computation in terms of input variables.

For example, to combine nested progn's containing arbitrary numbers of statements, the following pattern could be used:

   ((progn (progn . ?u) . ?v)   (progn . ?w)   t
      ((?w (append ?u ?v))))
In this case, the new variable ?w is created by appending the statements in the input variables ?u and ?v. The t in the pattern is a default test as a placeholder.

A second example shows the use of substitutions in a computation:

   ((intersection (subset (function (lambda (?x) ?p)) ?s)
                  (subset (function (lambda (?y) ?q)) ?s))
    (subset (function (lambda (?x) (and ?p ?qq))) ?s)    t
    ((?qq (subst ?x ?y ?q))))
     ; where ?qq is ?q with ?x substituted for ?y

Sometimes there is a need to introduce new variables in a pattern. This can be done by using (gentemp) to generate a new symbol and bind it to a new variable.

It is legitimate to introduce new "function names" that serve only to trigger patterns and that eventually are removed from the output by being transformed into something else. The "function" fn-args is used in this way in the lisptoc patterns.

There are several conventions used in the lisptoc patterns:

  1. Some patterns begin with an empty string, "". This prevents the output of these patterns from matching any other pattern.
  2. The form #\Return causes the printing program cpr to insert a newline and then to tab over to the current tab stop.
  3. The form #\Tab causes the tab stop to be increased by 2 columns; this will be in effect for subsequent occurrences of #\Return within the list in which the #\Tab appears.

Files:

patmatch.lsp    Simple matcher, as in lecture slides
patm.lsp        More powerful matcher features
pats.lsp        Starter set of patterns
patex.lsp       Examples for use in optimization exercise