Recovering line and column numbers in your Antlr AST

Today I needed to recover the line and column numbers of the original source file from an Antlr AST node. It turns out that the default AST node class is antlr.CommonAST, which extends BaseAST. The BaseAST implementation is less than helpful:

public int getLine() {
    return 0;
}

public int getColumn() {
    return 0;
}

If you poke around on the newsgroups you can find some breadcrumbs leading you towards the solution, but the complete changes are never quite all together in one place, so here they are.

First, declare a new subclass of CommonAST and swipe the line numbers off the original token:

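import antlr.CommonAST;
import antlr.Token;

public class CommonASTWithLines extends CommonAST {
    private int line = 0;
    private int column = 0;

    // Called by the ASTFactory when a node is created from a token;
    // copy the token's position while it is still available.
    public void initialize(Token tok) {
        super.initialize(tok);
        line = tok.getLine();
        column = tok.getColumn();
    }

    // Override the BaseAST versions, which always return 0
    public int getLine() { return line; }
    public int getColumn() { return column; }
}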

Second, you’ll want to modify your tree grammar to actually use the line information in some way. To do this, you just need to call getLine() on the AST node (it’s part of the standard interface, it just doesn’t work as you’d expect by default). If you’re inside a tree grammar rule, you can get the AST node for the token you’re matching by labeling it, with something like: n:STRING. You can then drop into an action block and access the line number with n.getLine().

Finally, you need to tell your parser to use a customized ASTFactory that will emit your nodes instead of the default nodes:

// Imports needed at the top of the file
import antlr.ASTFactory;
import antlr.collections.AST;

MyLexer lexer = new MyLexer(reader);
MyParser parser = new MyParser(lexer);

// This is the new part - construct a factory and tell it
// which class to use, then inject it into the parser
ASTFactory factory = new ASTFactory();
factory.setASTNodeClass(CommonASTWithLines.class);
parser.setASTFactory(factory);

parser.myrule();
AST ast = parser.getAST();
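Once the tree is built, a quick way to check that the positions actually made it through is to walk the tree and print each node with its line and column. The following is only a rough sketch, and it is not ANTLR code: so that it runs without ANTLR on the classpath, it declares a small stand-in interface, PositionedNode, that mirrors just the few methods it needs from antlr.collections.AST (getText, getLine, getColumn, getFirstChild, getNextSibling). The names PositionedNode, SimpleNode and TreeDump are made up for this illustration. In a real project you would drop the stand-ins and write the same loop against the AST returned by parser.getAST().

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in (hypothetical) for the subset of antlr.collections.AST used below.
interface PositionedNode {
    String getText();
    int getLine();
    int getColumn();
    PositionedNode getFirstChild();
    PositionedNode getNextSibling();
}

// Tiny in-memory node so the sketch can run without ANTLR.
class SimpleNode implements PositionedNode {
    private final String text;
    private final int line;
    private final int column;
    private SimpleNode firstChild;
    private SimpleNode nextSibling;

    SimpleNode(String text, int line, int column) {
        this.text = text;
        this.line = line;
        this.column = column;
    }

    // Appends a child at the end of the sibling chain; returns this for chaining.
    SimpleNode add(SimpleNode child) {
        if (firstChild == null) {
            firstChild = child;
        } else {
            SimpleNode last = firstChild;
            while (last.nextSibling != null) {
                last = last.nextSibling;
            }
            last.nextSibling = child;
        }
        return this;
    }

    public String getText() { return text; }
    public int getLine() { return line; }
    public int getColumn() { return column; }
    public PositionedNode getFirstChild() { return firstChild; }
    public PositionedNode getNextSibling() { return nextSibling; }
}

class TreeDump {
    // Depth-first walk that records "text @line:column" for every node,
    // indented two spaces per level of depth.
    static List<String> dump(PositionedNode root) {
        List<String> out = new ArrayList<>();
        walk(root, 0, out);
        return out;
    }

    private static void walk(PositionedNode node, int depth, List<String> out) {
        // Visit the node and each of its following siblings in order.
        for (PositionedNode n = node; n != null; n = n.getNextSibling()) {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < depth; i++) {
                sb.append("  ");
            }
            sb.append(n.getText())
              .append(" @").append(n.getLine())
              .append(':').append(n.getColumn());
            out.add(sb.toString());
            walk(n.getFirstChild(), depth + 1, out);
        }
    }
}
```

For a "+" node at line 1, column 3 with children "a" at 1:1 and "b" at 1:5, dump returns the three strings "+ @1:3", "  a @1:1" and "  b @1:5". If every node in your real tree reports 0:0, the parser is most likely still creating the default node class.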

Those are the basics. There are plenty of more complicated things to know (which I don’t), but this may be enough to get you started. Another good example you might want to look at is the ANTLR Adder Tutorial, which focuses on error reporting in an Antlr grammar.

Pure Danger Tech
