Grammar in Bison
Previous: <Language and Grammar=>Languagean> * Next: <Semantic Values=>SemanticVa> * Up: <Concepts=>Concepts>

#Wrap on
{fH3}From Formal Rules to Bison Input{f}

A formal grammar is a mathematical construct.  To define the language
for Bison, you must write a file expressing the grammar in Bison syntax:
a {fUnderline}Bison grammar{f} file.  \*Note <Grammar File=>GrammarFil>: Bison Grammar Files.

A nonterminal symbol in the formal grammar is represented in Bison input
as an identifier, like an identifier in C.  By convention, it should be
in lower case, such as {fCode}expr{f}, {fCode}stmt{f} or {fCode}declaration{f}.

The Bison representation for a terminal symbol is also called a {fUnderline}token
type{f}.  Like nonterminals, token types can be represented as C-like identifiers.  By
convention, these identifiers should be upper case to distinguish them from
nonterminals: for example, {fCode}INTEGER{f}, {fCode}IDENTIFIER{f}, {fCode}IF{f} or
{fCode}RETURN{f}.  A terminal symbol that stands for a particular keyword in
the language should be named after that keyword converted to upper case.
The terminal symbol {fCode}error{f} is reserved for error recovery.
\*Note <Symbols=>Symbols>.
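
As a minimal sketch, token types named by identifiers are introduced in
the declarations part of the grammar file with a declaration such as
{fCode}%token{f}.  The names below are simply the examples from the paragraph
above, not a required set.

#Wrap off
#fCode
/* Token types named by identifiers; upper case by convention.  */
%token INTEGER IDENTIFIER
%token IF RETURN
#f
#Wrap on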

A terminal symbol can also be represented as a character literal, just like
a C character constant.  You should do this whenever a token is just a
single character (parenthesis, plus-sign, etc.): use that same character in
a literal as the terminal symbol for that token.
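
For illustration, the following sketch uses character literals for the
plus sign and the parentheses.  The nonterminal {fCode}term{f} and the exact
shape of the rules are assumptions made for this example only.
Single-character tokens written this way need no separate declaration.

#Wrap off
#fCode
/* '+', '(' and ')' are character-literal tokens;
   INTEGER is a token type named by an identifier.
   The vertical bar separates alternative rules.  */
expr:   term
        | expr '+' term
        ;

term:   INTEGER
        | '(' expr ')'
        ;
#f
#Wrap on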

Grammar rules are likewise expressed in Bison syntax.  For example,
here is the Bison rule for a C {fCode}return{f} statement.  The semicolon in
quotes is a literal character token, representing part of the C syntax for
the statement; the unquoted semicolon and the colon are Bison punctuation
used in every rule.

#Wrap off
#fCode
stmt:   RETURN expr ';'
        ;
#f
#Wrap on
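
To show where such a rule sits, here is a sketch of a very small grammar
file.  The declarations come first, a line containing only {fCode}%%{f}
separates them from the rules, and the rules follow.  The single rule given
for {fCode}expr{f} is a placeholder assumed for this example; a real grammar
would define expressions more fully, and a working parser also needs a
lexical analyzer and an error-reporting function, which are omitted here.

#Wrap off
#fCode
/* Declarations: token types named by identifiers.  */
%token RETURN INTEGER

%%

/* Rules: nonterminals in lower case, tokens in upper case,
   single-character tokens as character literals.  */
stmt:   RETURN expr ';'
        ;

expr:   INTEGER         /* placeholder definition */
        ;
#f
#Wrap on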

\*Note <Rules=>Rules>: Syntax of Grammar Rules.

