Wednesday, September 5, 2007

ANTLR

ANother Tool for Language Recognition (ANTLR) is a parser generator that uses LL(k) parsing; version 3 extends this to LL(*), where the lookahead depth is not fixed in advance. ANTLR is the successor to the Purdue Compiler Construction Tool Set (PCCTS), first developed in 1989, and is under active development. Its maintainer is Professor Terence Parr of the University of San Francisco.

In my view it is the best modern tool for parsing. It can handle a very wide range of grammars by generating arbitrarily deep LL lookahead and by memoizing partial parse results for future use.
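To make the idea of remembering parse results concrete, here is a minimal hand-written sketch in Python. It is not ANTLR's generated code or its algorithm; the toy grammar, the function names (make_recognizer, stat, expr, term) and the call counter are all assumptions made for illustration. Both alternatives of stat begin with expr, so when the first alternative fails, the cached result for expr at that position is reused instead of being parsed again.

```python
from functools import lru_cache


def make_recognizer(tokens):
    """Build a recognizer for a toy grammar over `tokens` (a tuple of strings).

    Hypothetical grammar, for illustration only:
        stat : expr ';' | expr '!' ;
        expr : term ('+' term)* ;
        term : NUMBER | '(' expr ')' ;

    Each rule takes a start position and returns the position just past
    what it matched, or None if it does not match there.
    """
    calls = {"expr": 0}  # counts how often the body of expr really runs

    def peek(pos):
        return tokens[pos] if pos < len(tokens) else None

    @lru_cache(maxsize=None)  # remember the result for each start position
    def term(pos):
        tok = peek(pos)
        if tok is not None and tok.isdigit():
            return pos + 1
        if tok == "(":
            end = expr(pos + 1)
            if end is not None and peek(end) == ")":
                return end + 1
        return None

    @lru_cache(maxsize=None)
    def expr(pos):
        calls["expr"] += 1
        end = term(pos)
        if end is None:
            return None
        while peek(end) == "+":
            nxt = term(end + 1)
            if nxt is None:
                return None
            end = nxt
        return end

    def stat(pos=0):
        # Try each alternative in order; the second attempt at expr(pos)
        # is answered from the cache rather than re-parsed.
        for terminator in (";", "!"):
            end = expr(pos)
            if end is not None and peek(end) == terminator:
                return end + 1
        return None

    return stat, calls
```

Calling make_recognizer(("1", "+", "2", "!")) and then stat() accepts the input through the second alternative, while calls["expr"] shows that the body of expr ran only once.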

There is an IDE for ANTLR called ANTLRWorks and a text generation library called StringTemplate.

ANTLR and ANTLRWorks are written in Java, but code generation is abstracted with the StringTemplate library, so lexers and parsers can be generated in a large number of target languages.
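The value of keeping code generation behind templates can be sketched in a few lines of Python. This does not use StringTemplate or any ANTLR API; it relies on the standard library's string.Template, and the TARGETS table and generate function are hypothetical names chosen for the example. The point is only that the generator logic stays the same while the per-language output lives in swappable templates.

```python
from string import Template

# One template per target language (illustrative, not ANTLR's templates).
TARGETS = {
    "python": Template(
        "def match_$rule(tok):\n"
        "    return tok == $literal\n"
    ),
    "java": Template(
        "boolean match_$rule(String tok) {\n"
        "    return tok.equals($literal);\n"
        "}\n"
    ),
}


def generate(target, rule, literal):
    """Emit source code in `target` that matches one literal token."""
    # Double-quoted string literals are valid in both toy targets.
    escaped = literal.replace("\\", "\\\\").replace('"', '\\"')
    quoted = '"' + escaped + '"'
    return TARGETS[target].substitute(rule=rule, literal=quoted)
```

For example, generate("python", "semi", ";") and generate("java", "semi", ";") produce a matcher for the same token in two languages, and adding a third language means adding one more entry to TARGETS.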

Why is ANTLR useful?

Consider a fairly complex mainframe output format, with decades of accreted special-case quirks and codes. It has locked a client in to a supplier who charges tens of thousands in annual license fees, confident that no one will breach the barrier to entry of parsing the mainframe data.

With ANTLR and a unit-testing framework, you can develop a compiler for the mainframe output in a short time frame. Unit-test both the front-end analysis and the back-end generation from the bottom up, snap them together as you go, and rely on the lower-level tests to catch any mistakes.
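The bottom-up testing approach might look roughly like this. The sketch uses Python's unittest with a hand-written front end rather than an ANTLR-generated parser, and the fixed-width record layout (account, amount in cents, CR/DR flag) is invented for the example; a real mainframe format would differ.

```python
import unittest


def parse_record(line):
    """Front end: split one hypothetical fixed-width record into fields."""
    account = line[0:8].strip()
    amount_cents = int(line[9:17])
    flag = line[18:20].strip()
    if flag not in ("CR", "DR"):
        raise ValueError(f"unknown flag: {flag!r}")
    sign = 1 if flag == "CR" else -1
    return {"account": account, "cents": sign * amount_cents}


def emit_csv(record):
    """Back end: render one parsed record as a CSV line."""
    return f"{record['account']},{record['cents'] / 100:.2f}"


def translate(line):
    """Front end and back end snapped together."""
    return emit_csv(parse_record(line))


class FrontEndTest(unittest.TestCase):
    def test_credit(self):
        self.assertEqual(
            parse_record("ACCT0001 00012345 CR"),
            {"account": "ACCT0001", "cents": 12345},
        )

    def test_unknown_flag_is_rejected(self):
        with self.assertRaises(ValueError):
            parse_record("ACCT0001 00012345 XX")


class BackEndTest(unittest.TestCase):
    def test_debit_is_negative(self):
        self.assertEqual(
            emit_csv({"account": "ACCT0001", "cents": -12345}),
            "ACCT0001,-123.45",
        )


class EndToEndTest(unittest.TestCase):
    def test_translate(self):
        self.assertEqual(translate("ACCT0001 00012345 CR"), "ACCT0001,123.45")
```

Saved to a file, these tests can be run with python -m unittest. The front-end and back-end tests pin down each half separately, so when the end-to-end test fails, the lower-level tests show which half is responsible.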

You have liberated the client!
