What is BNF notation?
BNF is an acronym for "Backus Naur Form". John
Backus and Peter Naur introduced for the first time a formal notation to
describe the syntax of a given language (This was for the description of the ALGOL 60 programming
language, see [Naur
60]). To be precise, most of BNF was introduced by Backus in a report
presented at an earlier UNESCO conference on ALGOL 58. Few read the report, but
when Peter Naur read it he was surprised at some of the differences he found
between his and Backus's interpretation of ALGOL 58. He decided that for the
successor to ALGOL, all participants of the first design had come to recognize
some weaknesses, should be given in a similar form so that all participants
should be aware of what they were agreeing to. He made a few modificiations that
are almost universally used and drew up on his own the BNF for ALGOL 60 at the
meeting where it was designed. Depending on how you attribute presenting it to
the world, it was either by Backus in 59 or Naur in 60. (For more details on
this period of programming languages history, see the introduction to Backus's
Turing award article in Communications of the ACM, Vol. 21, No. 8, august 1978.
This note was suggested by William B.
Clodius from Los Alamos Natl. Lab).
Since then, almost every author of books on new programming languages used it
to specify the syntax rules of the language. See [Jensen
74] and [Wirth
82] for examples.
The following is taken from [Marcotty
86]:
The meta-symbols of BNF are:
- ::=
- meaning "is defined as"
- |
- meaning "or"
- < >
- angle brackets used to surround category names.
The angle brackets
distinguish syntax rules names (also called non-terminal symbols) from terminal
symbols which are written exactly as they are to be represented. A BNF rule
defining a nonterminal has the form: nonterminal ::= sequence_of_alternatives consisting of strings of
terminals or nonterminals separated by the meta-symbol |
For example, the BNF production for a mini-language is: <program> ::= program
<declaration_sequence>
begin
<statements_sequence>
end ;
This shows that a mini-language program consists of the keyword "program"
followed by the declaration sequence, then the keyword "begin" and the
statements sequence, finally the keyword "end" and a semicolon.
(end of quotation)
In fact, many authors have introduced some slight extensions of BNF for the ease
of use:
- optional items are enclosed in meta symbols [ and ],
example:
<if_statement> ::= if <boolean_expression> then
<statement_sequence>
[ else
<statement_sequence> ]
end if ;
- repetitive items (zero or more times) are enclosed in meta symbols
{ and }, example:
<identifier> ::= <letter> { <letter> | <digit> }
this rule is equivalent to the recursive rule: <identifier> ::= <letter> |
<identifier> [ <letter> | <digit> ]
- terminals of only one character are surrounded by quotes (") to
distinguish them from meta-symbols, example:
<statement_sequence> ::= <statement> { ";" <statement> }
- in recent text books, terminal and non-terminal symbols are distingued by
using bold faces for terminals and suppressing < and > around
non-terminals. This improves greatly the readability. The example then
becomes:
if_statement ::= if boolean_expression then
statement_sequence
[ else
statement_sequence ]
end if ";"
Now as a last example (maybe not the easiest to read !), here is
the definition of BNF expressed in BNF: syntax ::= { rule }
rule ::= identifier "::=" expression
expression ::= term { "|" term }
term ::= factor { factor }
factor ::= identifier |
quoted_symbol |
"(" expression ")" |
"[" expression "]" |
"{" expression "}"
identifier ::= letter { letter | digit }
quoted_symbol ::= """ { any_character } """
BNF is not ony important to describe syntax rules in books, but it is very
commonly used (with variants) by syntactic tools. See for example any book on LEX
and YACC, the standard UNIX parser generators. If you have access to any
Unix machine, you will probably find a chapter of the documentation on these
tools.
Some references:
- [Naur 60]
- NAUR, Peter (ed.), "Revised Report on the Algorithmic Language ALGOL 60.",
Communications of the ACM, Vol. 3 No.5, pp. 299-314, May 1960.
- [Jensen 74]
- JENSEN, Kathleen, WIRTH, Niklaus, "PASCAL user manual and report",
Lecture notes in computer science ; vol. 18., Berlin [etc.] :
Springer, 1974., 1974.
- [Johnson 75]
- S.C. Johnson, "Yacc: Yet Another Compiler Compiler", Computer Science
Technical Report #32, Bell Laboratories, Murray Hill, NJ, 1975.
- [Wirth 82]
- WIRTH, Niklaus., Programming in Modula-2, Berlin, Heidelberg:
Springer, 1982.
- [Marcotty 86]
- M. Marcotty & H. Ledgard, The World of Programming Languages,
Springer-Verlag, Berlin 1986., pages 41 and following.
Th. Estier, CUI - University
of Geneva