LBURG(1)                   Unix Programmer's Manual                   LBURG(1)


NAME
     lburg - lcc's code-generator generator

SYNOPSIS
     lburg [ option ]...  [ [ input ] output ]

DESCRIPTION

     lburg reads an lcc-style BURG  specification  from  input  and  writes  a
     pattern-matching  code  generator  to  output.   If  input  is  `-' or is
     omitted, lburg reads the standard input; If output is `-' or is  omitted,
     lburg writes to the standard output.

     lburg accepts specifications that conform to the following EBNF  grammar.
     Terminals  are  enclosed  in single quotes or are given in uppercase, all
     other symbols are nonterminals or English phrases, {X}  denotes  zero  or
     more instances of X, and [X] denotes an optional X.

          spec:     `%{' configuration `%}' { dcl } `%%' { rule }
                         [ `%%' C code ]

          dcl:      `%start' nonterm
                    `%term' { ID `=' INT }

          rule:     nonterm `:' tree template [ C expression ]

          tree:     term `(' tree `,' tree `)'
                    term `(' tree `)'
                    term
                    nonterm

          nonterm:  ID

          template: `"' { any character except double quote } `"'

     Specifications are structurally similar to yacc's.  Text between `%{' and
     `%}'  is  called  the  configuration  section;  there may be several such
     segments.  All are concatenated and copied verbatim into the head of  the
     output.  Text after the second `%%', if any, is also copied verbatim into
     the output, at the end.

     Specifications consist of declarations,  a  `%%'  separator,  and  rules.
     Input  is  line-oriented;  each  declaration  and  rule  must appear on a
     separate line, and declarations must begin  in  column  1.   Declarations
     declare  terminals  --  the operators in subject trees -- and associate a
     unique, positive external symbol number with each one.  Nonterminals  are
     declared  by  their  presence  on  the  left  side  of rules.  The %start
     declaration optionally declares a nonterminal as the  start  symbol.   In
     the grammar above, term and nonterm denote identifiers that are terminals
     and nonterminals.

     Rules define tree patterns in a fully parenthesized  prefix  form.  Every
     nonterminal  denotes  a  tree.  Each operator has a fixed arity, which is
     inferred from the rules in which it is used.  A  chain  rule  is  a  rule
     whose  pattern  is  another nonterminal.  If no start symbol is declared,
     the nonterminal defined by the first rule is used.


                              local \- 11/30/94                              1



LBURG(1)                   Unix Programmer's Manual                   LBURG(1)


     Each rule ends with an expression that computes the cost of matching that
     rule;  omitted  costs  default  to  zero.  Costs  of  chain rules must be
     constants.

     The configuration section configures  the  output  for  the  trees  being
     parsed  and the client's environment.  As shown, this section must define
     NODEPTR_TYPE to be a visible typedef symbol for a pointer to  a  node  in
     the  subject tree.  The labeller invokes OP_LABEL(p), LEFT\_CHILD(p), and
     RIGHT\_CHILD(p) to read the operator and children from the  node  pointed
     to  by  p.   If  the  configuration  section  defines these operations as
     macros, they are implemented in-line; otherwise, they must be implemented
     as functions.

     The matcher computes and stores a single integral state in each  node  of
     the  subject  tree.   The  configuration  section  must  define  a  macro
     STATE_LABEL(p) to access the state field of the node pointed to by p.  It
     must  be  large enough to hold a pointer, and a macro is required because
     it is used as an lvalue.

OPTIONS

     -p\fBprefix
     -pprefix
          Use prefix as  the  disambiquating  prefix  for  visible  names  and
          fields.  The default is `_'.

     -T   Arrange for

              void _trace(NODEPTR_TYPE p, int eruleno,
                              int cost, int bestcost);

          to be called at each successful match.  p identifies  the  node  and
          eruleno  identifies  the  matching  rule;  the  rules  are  numbered
          beginning at 1 in the order they appear in the input.  cost  is  the
          cost  of  the  match  and  bestcost is the cost of the best previous
          match. The current match wins only if cost is  less  than  bestcost.
          32767  represents  the  infinite  cost of no previous match.  _trace
          must be declared in the configuration section.

SEE ALSO
     lcc(1)

     C.        W.        Fraser        and        D.        R.         Hanson,
     ARetargetableCCompiler:DesignandImplementation,        Benjamin/Cummings,
     Redwood City, CA, 1995, ISBN 0-8053-1670-1. Chapter 14.

     C. W. Fraser, D. R. Hanson and T. A. Proebsting, `Engineering  a  simple,
     efficient code generator generator,' ACM Letters on Programming Languages
     and Systems 1, 3 (Sep. 1992), 213-226.

BUGS
     Mail bug reports along with the  shortest  input  that  exposes  them  to
     drh@cs.princeton.edu.





                              local \- 11/30/94                              2
