AdaGOOP Home Page

The content of this page in no way reflects the opinions, standards, or policy of the United States Air Force Academy or the United States government.
Introduction

Welcome to the AdaGOOP home page. AdaGOOP is the Ada Generator of Object-Oriented Parsers. Using AdaGOOP, you can easily generate a parser which builds an object-oriented parse tree. Read here the Ada Letters paper (June 2000) describing AdaGOOP.

New Release

This new release (posted 15 January 2008) includes an Ada 2005 sample.

Getting Started

To use AdaGOOP, you will need the following tools:

scaflex and scayacc. From the SCATC Toolkit by Richard Conn. My copy is here.
GNAT compiler (Click here for public release download site). Other Ada 95 compilers may be used for the generated code, however the AdaGOOP sources utilize the 'Unrestricted_Access GNAT-specific attribute. It is possible to modify generate.adb to avoid this.
sed (built-in UNIX command, also available on Windows). This is used in the batch files in the samples to delete the first line of an automatically generated file. It is possible to simply delete the first line of this file by hand, so this tool isn't strictly necessary.

AdaGOOP sources

AdaGOOP 2005 is an Ada 2005 automatic parser generator built on top of the SCATC versions of aflex and ayacc. As SableCC does, AdaGOOP automatically generates code to build a parse tree, and a traversal of the parse tree using the visitor pattern. It is currently available at: http://adagoop.martincarlisle.com

Using AdaGOOP:

AdaGOOP is distributed as Ada 2005 source code. To use the tool, simply download and compile the source (adagoop.adb contains the main procedure). AdaGOOP outputs will be run through SCAFLEX and SCAYACC. Source code for these tools is distributed in subfolders of the same names (scaflex.adb and scayacc.adb are the main procedures).

AdaGOOP is executed from a command prompt as follows:

adagoop input_file prefix (e.g. adagoop ada05.g ada05)

Prefix should be a simple name for the project. This name will be used several places in the output files.

The input file to AdaGOOP consists of three parts: token_macros, tokens, and the grammar. Token macros are regular expressions that form parts of tokens. For example,

token_macros
DIGIT [0-9]
INTEGER ({DIGIT}+)

defines two macros, DIGIT and INTEGER, that can be used to create other tokens. Tokens are formed in the same fashion; however, their regular expressions can only contain token macros, not other tokens. The regular expression operators supported are:

[0-9A-FZ]    : any 1 character ‘0’-‘9’,‘A’-‘F’,or ‘Z’
[^xyz]       : any character except x, y or z
\n           : new line
.            : any single character
\.           : the character ‘.’ (similar for \”, \[, …)
x*           : 0 or more occurrences of x
x+           : 1 or more occurrences of x
x?           : 0 or 1 occurrences of x
x|y              : either x or y
“x*”         : literally “x*” (ignore operators)
{MACRO}      : replace with definition of MACRO
()           : parentheses, as (zy)*

Comments should be defined with the token “ignore”, e.g.

ignore "--"[^\n]*

Both tokens and token macros should be specified 1 per line. Comments may be interspersed with --. This is why the ignore token above must have the -- in quotation marks.

The final section, “grammar” gives an LALR(1) grammar. It should begin with a %start, indicating which non-terminal symbol is the starting symbol of the grammar, e.g.

grammar
%start A
A : A token1 B | token2 | ;
B : token2;

Each grammar rule consists of a non-terminal name, followed by a colon, followed by 1 or more sequences of terminals and non-terminals, separated by ‘|’. Note that a null production is specified by simply having one of these sequences be empty. The rule is terminated with a semicolon. White space is irrelevant. A complete sample file is given in Appendix A.

AdaGOOP output:

Throughout this section, “prefix” refers to the command-line argument passed to AdaGOOP. AdaGOOP generates prefix.l, an input file for scaflex. This file contains the specifications of tokens. The output file prefix.y is an input file for scayacc. It contains the grammar, plus code to build an object-oriented parse tree.

Once AdaGOOP has been run, you can create the lexer and parser by running (for our example):

     scaflex prefix.l
     gnatchop -w prefix.a
     gnatchop -w prefix_io.a
     gnatchop -w prefix_dfa.a
     scayacc prefix.y
     gnatchop -w prefix.a
     gnatchop -w prefix_goto.a
     sed -e 1d prefix_tokens.a > prefix_modified_tokens.a
     gnatchop –w prefix_modified_tokens.a
     gnatchop -w prefix_shift_reduce.a

Note gnatchop is needed only if your compilation environment (such as GNAT) requires package specifications and bodies to appear in separate files. sed is used only to delete the first line of prefix_tokens.a. We needed to add with context clauses to this file, but they appeared after the package declaration. To resolve this, we added the context clauses, another package declaration, and within a batch script used sed to delete the first package declaration.

For the object-oriented parse tree, AdaGOOP creates an abstract object, Parseable, that will serve as the root of our hierarchy (it has no fields):

type Parseable is abstract tagged null record;

We also create a single object type, Parseable_Token, to handle all terminal symbols:

   type Parseable_Token is new Parseable with record
      Line         : Natural;
      Column       : Natural;
      Token_String : String_Ptr;
      Token_Type   : Token; -- enumeration of all tokens
   end record;

Both Parseable and Parseable_Token are found in the package Prefix_Model. A production A => B C, where B and C are non-terminals, corresponds to a node in the parse tree containing pointers to the subtrees derived through B and C. We therefore create the following class declaration (which will appear in the package specification a_model.ads):

     type A_nonterminal is new Parseable with record
     B_part : access B_nonterminal’Class;
     C_part : access C_nonterminal’Class;
end record;

Since we have a single class for all terminal symbols, if we have A => B terminal1 C, we simply add a pointer to a Parseable_Token as the second field of the record. The parse tree then contains all of the information to completely recreate the original source text, with the exception of comment tokens (which are ignored).

To address the issue of multiple productions for a single non-terminal, we insert an abstract class for the non-terminal, and have concrete classes corresponding to each possible production. For example, if we have A => B terminal1 C | D, then AdGOOP outputs the following record declarations in the package A_Model:

     type A_nonterminal is abstract new Parseable
      with null record;
type A_nonterminal_Ptr is access all A_nonterminal’Class;
type A_nonterminal1 is new A_nonterminal with record
     B_part : access B_nonterminal’Class
     Terminal1_Part : Parseable_Token_Ptr;
     C_part : access C_nonterminal’Class;
end record;
type A_nonterminal2 is new A_nonterminal with record
     D_part : access D_nonterminal’Class;
end record;

The tool also will automatically number the fields in the case where a non-terminal or terminal appears more than once on the right hand side of a production. So, if we had A => B terminal1 B instead, the record would look like:

type A_nonterminal1 is new A_nonterminal with record
     B_part1 : access B_nonterminal’Class
     Terminal1_Part : Parseable_Token_Ptr;
     B_part2 : access B_nonterminal’Class;
end record;

AdaGOOP also generates code for traversing the parse tree using the Visitor pattern. Each class in the parse tree hierarchy has an acceptor method, which is used to dispatch to the appropriate visitor method:

   procedure Acceptor(This : access Parseable;
      I : access prefix_Visitor_Interface.
      Visit_prefix_Interface'Class) is abstract;

This automatically generated interface specifies a visitor method that has no additional parameters and does not return a value. It is output to the file prefix_visitor_interface.ads.

As can be seen from the following sample body (automatically generated), Acceptor merely calls the appropriate visitor method found in the interface:

   procedure Acceptor(This : access A_nonterminal2;
      I : access Visit_prefix_Interface'Class) is
   begin
      I.Visit_A_nonterminal2(This);
   end Acceptor;

Visit_prefix_Interface contains one visitor method for each leaf of the hierarchy, plus supports both pre-order and post-order depth first traversals by providing “before” and “after” methods that can be overloaded:

limited with A_Model;
limited with prefix_Model;
package prefix_Visitor_Interface is
type Visit_prefix_Interface is interface;

   procedure Visit_Parseable_Token(
      I : access Visit_prefix_Interface;
      T : access prefix_Model.Parseable_Token'Class) is null;

   procedure Before_A_nonterminal1(
      I : access Visit_prefix_Interface;
      N : access A_Model.A_nonterminal1'Class) is null;
   procedure Visit_A_nonterminal1(
      I : access Visit_prefix_Interface;
      N : access A_Model.A_nonterminal1'Class) is abstract;
   procedure After_A_nonterminal1(
      I : access Visit_prefix_Interface;
      N : access A_Model.A_nonterminal1'Class) is null;

   procedure Before_A_nonterminal2(
      I : access Visit_prefix_Interface;
      N : access A_Model.A_nonterminal2'Class) is null;
   procedure Visit_A_nonterminal2(
      I : access Visit_prefix_Interface;
      N : access A_Model.A_nonterminal2'Class) is abstract;
   procedure After_A_nonterminal2(
      I : access Visit_prefix_Interface;
      N : access A_Model.A_nonterminal2'Class) is null;
…
end prefix_Visitor_Interface;

A depth first traversal class, DFS, is generated into the package prefix_DFS. DFS implements the visitor interface, and its visit methods perform a depth first traversal of the parse tree. For example, following is the visitor method corresponding to the grammar rule A ® A PROCEDURE B:

   procedure Visit_A_nonterminal1(
         I : access DFS;
         N : access A_Model.A_nonterminal1'Class) is
      I_Classwide : access DFS'Class := I;
   begin
      I_Classwide.Before_A_nonterminal1(N);
      N.A_part.Acceptor(I);
      I_Classwide.Visit_Parseable_Token(N.PROCEDURE_part);
      N.B_part.Acceptor(I);
      I_Classwide.After_A_nonterminal1(N);
   end Visit_A_nonterminal1;

This visitor method calls the “before” method (for pre-order traversals), then calls Acceptor for each of the children to dispatch to the appropriate visitor, and finally calls the “after” method (for post-order traversals). Tokens don’t require dispatching, so the visit method can be called directly. It is interesting to note that Ada, unlike Java, does not perform dynamic dispatching by default. Therefore, it is necessary to convert the parameter I to a classwide type before calling Acceptor on each child so that dispatching occurs.

To create your own visitor, you simply create a child class for DFS. In this child class, you override the “before” and “after” methods for those parse tree classes that you wish to process:

limited with A_Model;

with Prefix_DFS;

package DFS_Example is

type DFS is new Prefix_DFS.DFS with null record;

   overriding procedure After_A_nonterminal1(
      I : access DFS;
      N : access A_Model.A_nonterminal1'Class);
   overriding procedure Before_A_nonterminal2(
      I : access DFS;
     N : access A_Model.A_nonterminal2'Class);

…
end DFS_Example;

Creating a main program for a tool generated with AdaGOOP is simple. You need simply call the “run” procedure in the prefix_parser package with a filename, then use get_parse_tree to get the root of resulting parse tree. The depth first traversal is started by calling the method Acceptor on the root of the tree.

with Ada.Command_Line;
use Ada.Command_Line;
with ada.text_io;
use ada.text_io;
with prefix_parser;
with prefix_model;
with DFS_Print;
with a_model;

procedure tool is
   type DFS_Access is access all DFS_Print.DFS'Class;
   Parse_Tree : prefix_model.parseable_ptr;
   Visitor : DFS_Access := new DFS_Print.DFS;

begin
   if argument_count /= 1 then
      put_line("usage: tool filename");
      return;
   end if;

   prefix_parser.run(argument(1));
   parse_tree := prefix_parser.get_parse_tree;
   parse_tree.Acceptor(Visitor);
end tool;

Feedback

Complete sources are available. If you have suggestions for improvement, or (better yet), contributions, please send them to Martin Carlisle.

Copying

AdaGOOP is distributed freely by the Department of Computer Science, United States Air Force Academy. There are no restrictions on its use, although we do request that you keep the author information intact, and clearly indicate if you have modified the sources.

AdaGOOP is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.