AST Internal Representations

This section presents the low-level, programmatic capabilities AHEAD offers for creating and manipulating abstract-syntax trees (ASTs) of Java and extended-Java programs. While the discussion centers on Java dialects, the basic technology applies to all AHEAD-produced languages.

AHEAD relies on a hierarchy of different kinds of AST nodes. The kernel hierarchy is shown in the figure below, where AstNode is the root of the AST class hierarchy; all tokens are instances of AstToken.

An AstNode has an array (arg) of zero of more children (of type AstNode), a left and right to connect a node with its siblings, and an up pointer to its parent. Not depicted is an array (tok) of zero or more tokens. The up pointer is null if an AST node has no parent; left is null if there is no left sibling, and similarly for right. A terminal node of an AST has (obviously) zero children; the arg array is null.

In general, programmers need not know about these pointers, although it may be useful information. The code below sketches the definition of the AstNode class (which is defined in the kernel layer).

Bali generates a class called AClass that is instantiated whenever the pattern associated with AClass is recognized. The AST that is produced is shown to the right: an instance of AClass roots an AST that has two children: an instance of NonTerm2 and an instance of NonTerm3 (both objects of which are siblings of each other). The contents of the arg[] array and tok[] array is:

How an AST is printed can be answered in two levels of detail. In the most detailed, we'll see the information that is stored within an AST node for printing. In the more typical interpretation, click here to see the standard idioms to use when programming in AHEAD.

In order to print the AST rooted at the AClass node (or any node for that matter), we need to be able to print each subtree and each of the nodes tokens in the order prescribed by the AClass named rule. This is accomplished by a boolean array order[] which specifies the interweaving of token and subtree printing. In the AClass example above:

That is, order[0] means print the first token (tok[0]), order[1] means print the first subtree (arg[0]), order[2] means print the second token (tok[1]), and so on. The order array is generated by Bali and is otherwise unseen by users.

Note also that tokens store both the text of the token plus the white space (comments, new lines, etc.) that preceded the token during a parse. In this way, it is possible for AHEAD-produced precompilers to parse an input file and unparse it to produce an output file that is Unix-diff equivalent to the input file. (Stated another way, saving the white space allows one to transform programs while retaining comments. This is essential for tools that automatically modify programs which are subsequently hand-edited by programmers).

Below are methods that are useful for tokens. These methods are invoked by the pattern tok[i].method() -- meaning apply method() to token #i:

In general, AHEAD allows no sharing of AST subtrees - i.e., multiple parent nodes point to the same child AST subtree. Now, sharing subtrees is too easy to do in AHEAD metaprogramming:

The above code is not correct if it is to be subsequently manipulated. (If it is merely to be printed, then tree sharing is OK). One fix to the above code would be to clone each AST before linking it in.

Another and simpler is to normalize the tree. The normalize method walks a tree and locates shared subtrees; it clones each shared subtree so that the resulting AST has no tree sharing.

Occasionally it is useful to detach a subtree from an AST. (That is, all parent and sibling linkages are removed). A typical application is that the detached tree is to be serialized to disk for some later retrieval and manipulation. The following idiom illustrates the usage of Detach.

Method	Semantics
String toString()	convert tree rooted at AST node to its string representation
Object clone()	clone tree rooted at AST node (Note: you must cast the result to the appropriate type before use)
AstNode hasAncestor(String classname)	progress up the AST looking for a parent node that is an instance of classname (or an instance of some derived class)
void PrettyDump()	dump the contents of an AST (useful for debugging)
void dumpnode()	useful for debugging small ASTs
AstNode Replace(AstNode withnode)	Replace(k) swaps "this" with node k. Replace returns k as a result.
void print()	unparses the contents of the AST rooted at "this" node
void print(AstProperties props)	unparses "this" tree and outputs the result to file specified in props
reduce2java(AstProperties props)	prints the Java code that is defined by the Jakarta AST that is rooted at "this" node
boolean Equ( AstNode x )	return true if tree rooted by x is equal to "this"
void Delete()	delete the current node; node must be an element of some list structure
void add( AstList tree )	adds tree after the current node. (Note: "this" must be an AstList).
void add( AstList tree, String separator)	just like add (above), but the separator is also specified. Useful for comma-separated lists.
void AddAfter(AstList tree)	add tree after "this" node; "this" node must be an element of some list structure
void AddBefore(AstList tree)	add tree before "this" node; "this" node must be an element of some list structure
AstNode addComment(String comment)	is method appends the comment given by the parameter to the first AstToken
AstNode addComment(String comment, boolean replace)	optionally add or replace comment in front of first token
void appendComment(String x)	add comment x to the specified tree after the first white space; tree must have a token in it
void prependComment(String x)	add comment x to the specified tree before the first white space; tree must have a token in it
AstNode normalizeTree( )	This method is used to repair ASTs that share subtrees which might have been created during sloppy metaprogramming. The idea is to walk the tree looking for nodes that don't point to their parent. In such cases, the subtree is cloned and the result is that normalized trees don't have shared subtrees
void Detach()	Detach isolates a subtree from its left, up, and right node relations. Useful for disconnecting a tree from an existing tree
void writeTree( String fileName )	serializes an AST to a file (filename.ser); make sure AST is detached before calling writeTree()

void setTokenName(String x)	set the token string to x
String getTokenName()	returns string of token #i
int linenum()	returns line number on which the token appeared
void print()	prints the token and whitespace
void reduce2java(AstProperties p)	reduces to Java (prints token)
void reduce2ast(AstProperties p)	generate code that will create this token

Internal Representation of AHEAD Abstract Syntax Trees/Parse Trees

AST Node Hierarchy

Structure of an AST Node

Relationship of AST Nodes to Bali Productions

How is an AST printed?

Print Information Stored in AST Nodes

Normalizing Trees

Detaching Trees

Methods on AST Nodes