Earley's Algorithm
- Left-to-right, "predictive", top-down;
- General (works for any context-free grammar);
- O(n³) worst-case time.
- The input X1, ..., Xn is scanned left to right,
looking ahead a fixed number k of symbols.
- As each Xi is read, a set of states Si
is constructed, representing the condition of the recognition process at this
stage. Each state in the set represents:
- a rule of the grammar;
- a point in the rule indicating how much of its rhs has been recognized
(a dot);
- an indication of the point at which recognition of the lhs began (the
state number or "origin state");
- (lookahead: a k-symbol string which is syntactically possible
following the rule).
| State | Dotted Rule  | Origin | Lookahead |
| 0     | S -> . NP VP | 0      |           |
| 2     | VP -> V . NP | 1      |           |
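Such a state can be sketched as a small data structure (Python; the class name
and fields are illustrative, and the optional lookahead is omitted):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Item:
    """One Earley state: a dotted rule plus its origin.

    lhs/rhs give the grammar rule, dot is the position of the dot
    within rhs, and origin is the index of the state set where
    recognition of lhs began.
    """
    lhs: str
    rhs: tuple
    dot: int
    origin: int

    def next_symbol(self):
        """Symbol immediately after the dot, or None if the dot is at the end."""
        return self.rhs[self.dot] if self.dot < len(self.rhs) else None

# The first row of the table above: S -> . NP VP, origin 0
item = Item("S", ("NP", "VP"), 0, 0)
```

Which of the three operations applies to a state is determined by
`next_symbol()`: a non-terminal triggers the Predictor, a terminal the
Scanner, and `None` (dot at the end) the Completer.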
States are constructed via three operations, each applied repeatedly:
- Predictor
- In a state where there is a non-terminal to the right of the dot: add a
new state to Si for each alternative for that non-terminal.
For example, in a state with VP after the dot, Predict adds:
- VP -> . V
- VP -> . V NP
- VP -> . V NP PP
- etc.
(Repeat until no further states can be added; cf.
the closure operation in LR(k) parsing.)
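A minimal sketch of the Predictor (Python; the function name, the
tuple-based item representation, and the toy grammar are illustrative):

```python
def predict(item, state_set, grammar, i):
    """Predictor: if a non-terminal B follows the dot, add an item
    B -> . beta (origin i) to the current state set S_i for every
    alternative beta of B. Items are (lhs, rhs, dot, origin) tuples."""
    B = item[1][item[2]]               # the symbol just after the dot
    for beta in grammar.get(B, []):
        state_set.add((B, beta, 0, i))

# Hypothetical toy grammar: VP has three alternatives, as in the example.
grammar = {"VP": [("V",), ("V", "NP"), ("V", "NP", "PP")]}
s1 = set()
predict(("S", ("NP", "VP"), 1, 0), s1, grammar, 1)
# s1 now holds VP -> . V, VP -> . V NP, VP -> . V NP PP, all with origin 1
```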
- Scanner
- In a state set Si where there is a terminal to the right of
the dot: compare this to the next input symbol; if they match, add to
Si+1 this configuration with the dot moved over the
terminal. For example, if S2 contains
- V -> . likes
and input X3 is likes, then add to S3:
- V -> likes .
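The Scanner can be sketched in the same style (Python; names are
illustrative):

```python
def scan(item, next_set, word):
    """Scanner: if the terminal after the dot matches the next input
    word, copy the item into S_{i+1} with the dot advanced."""
    lhs, rhs, dot, origin = item
    if dot < len(rhs) and rhs[dot] == word:
        next_set.add((lhs, rhs, dot + 1, origin))

s3 = set()
scan(("V", ("likes",), 0, 2), s3, "likes")  # match: dot moves over 'likes'
scan(("V", ("saw",), 0, 2), s3, "likes")    # no match: nothing added
```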
- Completer
- In a state set Si where the dot is at the end of a rule
whose lhs is A: the lookahead string (if any) is compared with the next
k input symbols; if they match, the algorithm copies from the "origin
state" any dotted rules with A after the dot, and adds them to
Si with the dot moved over A. For example, suppose
V -> likes . appears in S2, and its origin state is
S1:
we copy
from S1 all the rules that have V after the dot:
- VP -> . V
- VP -> . V NP
- VP -> . V NP PP
and add versions of these to S2, with the dot
moved:
- VP -> V .
- VP -> V . NP
- VP -> V . NP PP
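Putting the three operations together, a toy recognizer might look as
follows (Python; a sketch with k = 0 lookahead, an illustrative grammar,
and no epsilon rules, not the notes' exact formulation):

```python
def earley_recognize(words, grammar, start="S"):
    """Tiny Earley recognizer (k = 0 lookahead). grammar maps each
    non-terminal to a list of rhs tuples; any symbol not in grammar is a
    terminal. Items are (lhs, rhs, dot, origin) tuples."""
    chart = [set() for _ in range(len(words) + 1)]
    for rhs in grammar[start]:
        chart[0].add((start, rhs, 0, 0))
    for i in range(len(words) + 1):
        agenda = list(chart[i])
        while agenda:
            lhs, rhs, dot, origin = agenda.pop()
            if dot < len(rhs):
                sym = rhs[dot]
                if sym in grammar:                        # Predictor
                    for beta in grammar[sym]:
                        new = (sym, beta, 0, i)
                        if new not in chart[i]:
                            chart[i].add(new)
                            agenda.append(new)
                elif i < len(words) and sym == words[i]:  # Scanner
                    chart[i + 1].add((lhs, rhs, dot + 1, origin))
            else:                                         # Completer
                for (l2, r2, d2, o2) in list(chart[origin]):
                    if d2 < len(r2) and r2[d2] == lhs:
                        new = (l2, r2, d2 + 1, o2)
                        if new not in chart[i]:
                            chart[i].add(new)
                            agenda.append(new)
    # Accept if some start rule is fully recognized spanning the input.
    return any(it == (start, rhs, len(rhs), 0)
               for it in chart[-1] for rhs in grammar[start])

# Toy grammar (illustrative, not from the notes):
g = {"S":  [("NP", "VP")],
     "NP": [("kim",), ("sandy",)],
     "VP": [("V",), ("V", "NP")],
     "V":  [("likes",), ("saw",)]}
```

For instance, `earley_recognize(["kim", "likes", "sandy"], g)` accepts,
while `earley_recognize(["likes", "kim"], g)` does not.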
doug@essex.ac.uk