In this section we informally describe the Earley deduction method introduced
in [Pereira
and Warren1983]. Earley deduction is a proof procedure for definite clauses
and is named after Earley's context-free parsing algorithm. As in Earley's
original algorithm, in Earley deduction the processing of definite clauses is
split into two basic deduction steps or rules of inference, namely
prediction and completion
, dealing
respectively with top-down predictions of new clauses and bottom-up combination
of existing clauses. The results of prediction and completion (the derived
clauses) are lemmas, i.e., logical consequences of the program.
Earley deduction operates on two sets of definite clauses, called the
program and the state. In our case the program just represents
the grammar and lexicon and remains fixed. The state set, on the other hand,
will be continually augmented with new lemmas. Whenever a new non-unit lemma is
added, one of its negative literals is selected (for example, when using the
basic Prolog strategy, the selected element would always be the leftmost literal
in the body of a lemma).
The prediction rule operates on non-unit lemmas, which we also call
active lemmas. This rule selects an active lemma B from the
current state and searches for a program clause C whose positive literal
(i.e., the head of C) unifies with the selected literal of B. If
this is possible the thereby derived clause
is added to the
state set, where
is the most general
unifier of the two literals. The selected element of B is said to
instantiate the program clause C to
, or in other
words, the selected element has been used to predict an instantiation of
C. Clearly, prediction thus described realizes the top-down step of
Earley deduction.
The completion rule operates on unit lemmas, which we also call passive
lemmas. An active lemma C is chosen from the current state set,
whose selected element unifies with the passive lemma, which (if possible)
yields a new resolvent
, where C'
is C minus its selected element. Since the completor actually reduces the
number of the negative literals of an active lemma, repeated application of the
completion rule eventually creates passive lemmas. Note that completion against
the selected literal is sufficient, because if a completion with some other
literal of the body is required, any selection function will sooner or later
come to this literal, because the selection function will have to select from
fewer and fewer literals (see also [Pereira
and Shieber1987], page 199). Hence, the completor is also said to
reduce C to
or in
other words, the completion step (partially) completes derived clauses, in a
bottom-up fashion.
In [Pereira and Shieber1987] it is shown that the more specific constraints of context-free parsing allow a simplification that we cannot take advantage of here. In Earley's parsing algorithm, derived clause creation proceeds strictly from left to right. This means that any passive lemma needed to resolve against some active lemma is guaranteed to be constructed after the active lemma is created. Therefore, to perform all the pertinent resolutions, the algorithm need only look for active lemmas at the time when a passive lemma is created. However, a general Earley deduction proof procedure cannot guarantee this, so it is necessary to apply the completor also when active lemmas are created. As we will show, this is specifically the case for generation, because the structure of the input is not a sequence but a tree-like structure. In the implementation described in the next sections, we will separate these two cases into two inference rules, called passive-completion and active-completion, respectively.
Guenter Neumann