CpS 450 Language Translation Systems

ANTLR Parse Tree Processing

Parse Tree Structure

ANTLR parse trees consist of nodes that implement the ParseTree interface and subinterfaces:

ParseTree
  |-- RuleNode
  |-- TerminalNode
  • A RuleNode represents a nonterminal
  • A TerminalNode represents a terminal

Terminal Nodes

A TerminalNode represents a token.

  • Use its getSymbol() method to obtain the ANTLR Token, which gives you the line number and the type of the token.
  • Use its getText() method to get the text of the token.
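
The two calls above can be sketched as follows. So that it runs without the ANTLR runtime or a generated parser, this sketch uses two small stand-in types, StubToken and StubTerminalNode (both hypothetical), which mimic only the getSymbol(), getText(), getLine(), and getType() methods. In a real project you would work with org.antlr.v4.runtime.Token and org.antlr.v4.runtime.tree.TerminalNode instead.

```java
// Stand-in types for illustration only; they are NOT the ANTLR runtime classes.
class TerminalNodeDemo {
    /** Mimics the few Token methods used in this sketch. */
    static final class StubToken {
        private final int type;
        private final String text;
        private final int line;

        StubToken(int type, String text, int line) {
            this.type = type;
            this.text = text;
            this.line = line;
        }

        int getType() { return type; }
        String getText() { return text; }
        int getLine() { return line; }
    }

    /** Mimics a terminal node: a leaf that wraps exactly one token. */
    static final class StubTerminalNode {
        private final StubToken symbol;

        StubTerminalNode(StubToken symbol) { this.symbol = symbol; }

        StubToken getSymbol() { return symbol; }
        String getText() { return symbol.getText(); }
    }

    /** Builds a message of the kind a semantic checker might print for a token. */
    static String describe(StubTerminalNode node) {
        StubToken tok = node.getSymbol();   // the token behind the leaf
        return "line " + tok.getLine() + ": '" + node.getText()
                + "' (type " + tok.getType() + ")";
    }

    public static void main(String[] args) {
        // Token type 7 and line 3 are arbitrary values chosen for the demo.
        StubTerminalNode id = new StubTerminalNode(new StubToken(7, "count", 3));
        System.out.println(describe(id));
    }
}
```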

Nonterminal Nodes

For each nonterminal in the grammar, ANTLR generates a class named nonterminal-nameContext. The class contains methods to retrieve the children of that nonterminal.

Example: Consider TinyParser.Assign_stmtContext

Enhancing the ANTLR Grammar

You can add annotations to an ANTLR grammar to make the generated nodes more convenient to work with.

Naming Alternatives

Consider the following fragment:

term
   : ID            
   | integer       
   | '(' expr ')'  
   ;

Consider adding names to the alternatives:

term
   : ID            # IdTerm
   | integer       # IntTerm
   | '(' expr ')'  # ParTerm
   ;

ANTLR will generate classes named IdTermContext, IntTermContext, and ParTermContext, one for each alternative. Each is a subclass of TermContext, which makes it easier to determine which alternative was used at a given point in the parse tree.
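
One way to use the per-alternative classes is an instanceof test on the node. The sketch below is a minimal illustration under that assumption; the context classes here are empty stand-ins written by hand so the example runs on its own, and the method name whichAlternative is hypothetical. The real classes would be generated by ANTLR inside the parser class.

```java
// Stand-in classes for illustration; ANTLR would generate the real ones.
class AlternativeDispatchDemo {
    static class TermContext { }
    static final class IdTermContext extends TermContext { }    // ID            # IdTerm
    static final class IntTermContext extends TermContext { }   // integer       # IntTerm
    static final class ParTermContext extends TermContext { }   // '(' expr ')'  # ParTerm

    /** Reports which labeled alternative produced the given term node. */
    static String whichAlternative(TermContext ctx) {
        if (ctx instanceof IdTermContext) {
            return "identifier";
        }
        if (ctx instanceof IntTermContext) {
            return "integer";
        }
        if (ctx instanceof ParTermContext) {
            return "parenthesized expression";
        }
        throw new IllegalArgumentException("unknown alternative");
    }

    public static void main(String[] args) {
        System.out.println(whichAlternative(new IntTermContext()));
    }
}
```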

Naming Symbols

expr
   : expr mul_op expr  
   ...

The fragment above contains two expr nonterminals on the right-hand side. You can access them using the generated methods:

List<ExprContext> expr()
ExprContext expr(int i)

Or, you can name them:

expr
   : e1=expr mul_op e2=expr  # MulExpr
   ...

Now you can access them through the fields e1 and e2 of the generated MulExprContext class.
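
The two access styles can be compared side by side. In this sketch the context classes are hand-written stand-ins (not ANTLR output) that expose the same shape: the indexed expr() methods plus the labeled fields e1 and e2. The text field and the operands helper are hypothetical additions for the demo.

```java
import java.util.Arrays;
import java.util.List;

// Stand-in classes for illustration; ANTLR would generate the real ones.
class NamedSymbolDemo {
    static class ExprContext {
        final String text;   // hypothetical: source text of the subexpression
        ExprContext(String text) { this.text = text; }
    }

    static final class MulExprContext extends ExprContext {
        public ExprContext e1;   // from the label e1=expr
        public ExprContext e2;   // from the label e2=expr

        MulExprContext(ExprContext left, ExprContext right) {
            super(left.text + "*" + right.text);
            this.e1 = left;
            this.e2 = right;
        }

        // Index-based access, as available without labels.
        List<ExprContext> expr() { return Arrays.asList(e1, e2); }
        ExprContext expr(int i) { return expr().get(i); }
    }

    /** Uses the labels, which read more clearly than expr(0) and expr(1). */
    static String operands(MulExprContext ctx) {
        return ctx.e1.text + " and " + ctx.e2.text;
    }

    public static void main(String[] args) {
        MulExprContext m = new MulExprContext(new ExprContext("a"), new ExprContext("b"));
        System.out.println(operands(m));
        System.out.println(m.expr(0) == m.e1);   // both styles reach the same node
    }
}
```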

Generating Lists

Consider the following rule:

id_list
   : ID (',' ID)*
   ;

You can make ANTLR generate a single List of ID nodes by enhancing the rule as follows:

id_list
   : ids+=ID (',' ids+=ID)*
   ;

ANTLR will add an instance variable to the generated context class that makes it convenient to iterate over all of the ID tokens:

List<Token> ids;
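
A short sketch of iterating over that list follows. To keep it runnable without a generated parser, Id_listContext and StubToken are hand-written stand-ins, and the helper name declaredNames is hypothetical. With the real generated class, the loop body would be the same but the element type would be the ANTLR Token.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in classes for illustration; ANTLR would generate the real context class.
class IdListDemo {
    /** Mimics only the getText() method of a token. */
    static final class StubToken {
        private final String text;
        StubToken(String text) { this.text = text; }
        String getText() { return text; }
    }

    /** Shaped like the context for id_list after adding ids+=ID. */
    static final class Id_listContext {
        public List<StubToken> ids = new ArrayList<>();
    }

    /** Collects the text of every ID token, in source order. */
    static List<String> declaredNames(Id_listContext ctx) {
        List<String> names = new ArrayList<>();
        for (StubToken id : ctx.ids) {
            names.add(id.getText());
        }
        return names;
    }

    public static void main(String[] args) {
        Id_listContext ctx = new Id_listContext();
        ctx.ids.add(new StubToken("a"));
        ctx.ids.add(new StubToken("b"));
        ctx.ids.add(new StubToken("c"));
        System.out.println(declaredNames(ctx));
    }
}
```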

Decorating Parse Trees

As a compiler traverses a parse tree during semantic checking, it often must decorate the tree by adding information to its nodes. You can get ANTLR to add instance variables to parse tree nodes by adding a returns clause to nonterminal definitions in the grammar:

expr returns [Double foo, String bar]
   : e1=expr mul_op e2=expr  # MulExpr
   | e1=expr add_op e2=expr  # AddExpr
   | term                    # TermExpr
   ;

The returns clause specifies attributes to be added to parse tree nodes. For example, the above causes ANTLR to generate an ExprContext class as follows:

   public static class ExprContext extends ParserRuleContext {
      public Double foo;
      public String bar;
      ...

Traversing Parse Trees

Depth First Traversal

  • A depth-first traversal algorithm is appropriate for both semantic analysis and code generation.
  • See examples/interpreter/…/TinyInterpreter.java
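
As a rough illustration of a depth-first traversal that also decorates nodes, the sketch below evaluates a small expression tree recursively and stores each result in the node's foo attribute. It is not taken from TinyInterpreter.java. The context classes are hand-written stand-ins shaped like the labeled expr alternatives, the op and text fields are hypothetical simplifications, and every term is assumed to be a numeric literal.

```java
// Stand-in classes for illustration; ANTLR would generate the real ones.
class DepthFirstDemo {
    static abstract class ExprContext {
        public Double foo;   // attribute as added by "returns [Double foo, ...]"
    }

    static final class TermExprContext extends ExprContext {
        final String text;   // assumption: the term is a numeric literal
        TermExprContext(String text) { this.text = text; }
    }

    static final class MulExprContext extends ExprContext {
        final ExprContext e1, e2;
        final String op;     // "*" or "/"
        MulExprContext(ExprContext e1, String op, ExprContext e2) {
            this.e1 = e1; this.op = op; this.e2 = e2;
        }
    }

    static final class AddExprContext extends ExprContext {
        final ExprContext e1, e2;
        final String op;     // "+" or "-"
        AddExprContext(ExprContext e1, String op, ExprContext e2) {
            this.e1 = e1; this.op = op; this.e2 = e2;
        }
    }

    /** Visits children first, then computes and records this node's value. */
    static double evaluate(ExprContext ctx) {
        double result;
        if (ctx instanceof TermExprContext) {
            result = Double.parseDouble(((TermExprContext) ctx).text);
        } else if (ctx instanceof MulExprContext) {
            MulExprContext m = (MulExprContext) ctx;
            double left = evaluate(m.e1);
            double right = evaluate(m.e2);
            result = m.op.equals("*") ? left * right : left / right;
        } else if (ctx instanceof AddExprContext) {
            AddExprContext a = (AddExprContext) ctx;
            double left = evaluate(a.e1);
            double right = evaluate(a.e2);
            result = a.op.equals("+") ? left + right : left - right;
        } else {
            throw new IllegalArgumentException("unknown node type");
        }
        ctx.foo = result;    // decorate the node for later passes
        return result;
    }

    public static void main(String[] args) {
        // Tree for 2 + 3 * 4, with the multiplication nested as the right operand.
        ExprContext product = new MulExprContext(
                new TermExprContext("3"), "*", new TermExprContext("4"));
        ExprContext sum = new AddExprContext(new TermExprContext("2"), "+", product);
        System.out.println(evaluate(sum));
        System.out.println(product.foo);
    }
}
```

Because the recursion returns to a node only after both children are finished, the foo value on every child is already set when the parent computes its own, which is the property semantic checks and code generators typically rely on.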