The following review will assume a basic familiarity with RST rhetorical relations. However, a brief comparison of the SIL-like ``communication relations'' ([Larson, 1984] and [Beekman et al., 1981]) with RST relations is needed so that the realizations may be understood properly. Refer to section 3.4.1 for more information on the SIL theory.
[Larson, 1984] - SIL Semantics
As in RST, SIL relations connect proposition-sized and larger chunks of semantic text. Each relation specifies a head (nucleus) and support (satellite). Unfortunately, the relations this paper is focusing on, primarily from the ``cause cluster'' in RST, do not match up one-to-one with the relations in SIL. RST divides causal relations along the lines of volition and the nuclearity of the causing event:
                        DIFFERENCES IN NUCLEARITY
                 cause is satellite     | cause is nuclear
DIFFERENCES      volitional cause       | volitional result
IN               ----------------------------------------------
VOLITION         non-volitional cause   | non-volitional result
Volition is defined as involving ``the action of the agent... It is performed because the agent prefers the outcome or possibly the action itself.'' Purpose, in RST, is ``definitionally neutral, including both volitional and non-volitional cases.'' The main difference between Purpose and the other ``cause cluster'' relations is that the satellite situation in a Purpose relation is unrealized. The satellite is to be realized through the activity in the nucleus.
On the other hand, the SIL cause-related relations are differentiated in terms of the relationship between the causal and caused nodes and the intentionality of the agent. In the following, the head (nuclear) element is in capital letters:
Furthermore, the purpose-MEANS and means-RESULT can be differentiated by the fact that the purpose may or may not have been fulfilled, but the RESULT necessarily was.
Thus, the following is a mapping between RST and SIL relations:
RST                      SIL
volitional result        purpose-MEANS
non-volitional result    ???
volitional cause         means-RESULT
non-volitional cause     reason-RESULT
purpose                  purpose-MEANS
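The mapping above can be held as a simple lookup table. The following is only a sketch: the names RST_TO_SIL, sil_relation and rst_candidates are illustrative and do not come from Larson or from the DDTG, and the entry marked ``???'' is stored as None because no SIL counterpart is identified for it here.

```python
from typing import Dict, List, Optional

# RST relation -> SIL relation, as listed in the mapping above.
# None stands for the entry the mapping leaves open ("???").
RST_TO_SIL: Dict[str, Optional[str]] = {
    "volitional result": "purpose-MEANS",
    "non-volitional result": None,
    "volitional cause": "means-RESULT",
    "non-volitional cause": "reason-RESULT",
    "purpose": "purpose-MEANS",
}


def sil_relation(rst_relation: str) -> Optional[str]:
    """Return the SIL counterpart of an RST relation, or None if unmapped."""
    return RST_TO_SIL[rst_relation.lower()]


def rst_candidates(sil: str) -> List[str]:
    """Return every RST relation that maps onto the given SIL relation."""
    return [rst for rst, target in RST_TO_SIL.items() if target == sil]
```

Because two RST relations map onto purpose-MEANS, the reverse lookup returns a list rather than a single relation; this is the sense in which the two inventories do not match one-to-one.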
Larson gives a number of examples of each relation, but fails to specify why and when each is used. For each of the following, the supporting role will be in italics with the textual connector used to signal the relation (if any) in bold:
In summary, for each of the possible relations there exists a small set of textual connectors that may be used. The satellite may be placed before or after the nucleus. No information regarding how to choose between the various alternatives is given.
[Vander Linden et al., 1992]
This paper, ``The Expression of Local Rhetorical Relations in Instructional Text'', was the most helpful in sorting out the different choices available to a text generator and why one would be made over another. It is the backbone of the approach used in this project. Vander Linden's goal was very similar to that of this paper: ``to discover the principles which allow speakers/writers to navigate the lexicogrammatical resources of their language and produce utterances/texts which satisfy simultaneous goals at a variety of levels, both cognitive and social in nature.'' They employed the Penman text generation system ``because the systemic formalism provides a convenient and consistent architecture for the description and implementation of the links between communicative goals and linguistic forms.''
Vander Linden studied instructional text, which, unfortunately, is of a different nature than the expository text being translated by the DDTG. In applying some of the findings of this paper to expository text (as described further below), some awkwardness resulted. However, for the most part, the choices and decision procedures presented here transferred nicely. The instructional texts studied by Vander Linden were of the type found in manuals accompanying retail items such as cameras. They narrowed their study to texts with the following characteristics:
The relations studied were procedural sequences, purposes, preconditions and results. As this paper is concerned with causal relations, the review will be limited to the purpose and result relations.
PURPOSE
There are two ``systems'' that are aimed at producing Purpose relations. The Purpose-Slot system addresses the ordering of the purpose clause before or after the actions it motivates. The Purpose-Form system chooses from among the following grammatical forms for the purpose:
The decision procedure used by the Purpose-Slot system follows:
default: purpose placed after action
exceptions (purpose placed before action):
1) scope of the purpose is multiple (number of actions it is
   motivating is greater than one) (e.g. ``To end a call, hold
   down FLASH for about two seconds, then release it.'')
2) ``optional'' or ``contrastive'' purpose (e.g. ``For more
   information ...'', or ``In order to prevent the memory
   condition from occurring ...'' in the context of describing
   the memory condition problem)
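The Purpose-Slot procedure is small enough to sketch directly. This is a minimal illustration, not Vander Linden's Penman implementation: the Purpose record and its field names are hypothetical, and the sketch assumes that each exception moves the purpose in front of the action, as the quoted examples suggest.

```python
from dataclasses import dataclass


@dataclass
class Purpose:
    """Hypothetical input record; the field names are illustrative only."""
    num_actions: int           # number of actions the purpose motivates
    optional: bool = False     # a purpose the reader may or may not have
    contrastive: bool = False  # a purpose set against the preceding context


def purpose_slot(p: Purpose) -> str:
    """Return 'before' or 'after': the slot of the purpose relative to its action(s)."""
    # Exception 1: the purpose scopes over more than one action.
    if p.num_actions > 1:
        return "before"
    # Exception 2: the purpose is ``optional'' or ``contrastive''.
    if p.optional or p.contrastive:
        return "before"
    # Default: the purpose follows the action.
    return "after"
```

For the FLASH example, the purpose ``to end a call'' motivates two actions (hold down, then release), so purpose_slot(Purpose(num_actions=2)) selects the fronted position.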
The Purpose-Form system is a little more complicated:
1) scope of action > 1 -> to + inf
2) else default = prep phrase with nominalization (since purpose
   clauses are not prescribed reader actions, and thus should be
   demoted whenever possible)
   exceptions:
   a) if nominalization is complex (more than one argument or
      a complex argument) or doesn't exist then
      default: to + inf
      exception 1: the infinitive requires unwanted arguments
         (e.g. ``Light comes on when the battery is weak. The
         handset must be returned to the base for recharging'',
         not ``to recharge the battery'')
      exception 2: action in the main clause not sufficient to
         achieve the purpose (e.g. ``OFF is used primarily for
         recharging the battery'', instead of ``to recharge the
         battery'', which might imply that OFF was the only thing
         needed to recharge the battery)
   b) goal metonymy: object of verb more important than verb
      itself -> ellipsis of verb (e.g. ``for frequently busy
      numbers, you'll want to use redial'', instead of ``for
      dealing with frequently busy numbers ...'')
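The Purpose-Form procedure can be sketched as a chain of tests. Several points here are assumptions rather than statements from the source: the PurposeContext record and its field names are hypothetical, the label ``for + gerund'' for the outcome of exceptions 1 and 2 is read off the quoted examples (``for recharging''), and the order in which (a) and (b) are tested simply follows the order of the listing, since no precedence between them is stated.

```python
from dataclasses import dataclass

TO_INF = "to + infinitive"
NOMINALIZATION = "prepositional phrase with nominalization"
FOR_GERUND = "for + gerund"                 # assumed label, from the examples
VERB_ELLIPSIS = "for + object (verb elided)"


@dataclass
class PurposeContext:
    """Hypothetical feature bundle; every field name is illustrative."""
    num_actions: int = 1
    has_nominalization: bool = True
    nominalization_complex: bool = False
    infinitive_needs_unwanted_args: bool = False
    action_sufficient: bool = True
    goal_metonymy: bool = False


def purpose_form(c: PurposeContext) -> str:
    """Choose a grammatical form for a purpose expression."""
    # 1) A purpose scoping over several actions is a to-infinitive.
    if c.num_actions > 1:
        return TO_INF
    # 2a) No usable nominalization: default to the to-infinitive,
    #     unless exception 1 or exception 2 rules it out.
    if not c.has_nominalization or c.nominalization_complex:
        if c.infinitive_needs_unwanted_args or not c.action_sufficient:
            return FOR_GERUND
        return TO_INF
    # 2b) Goal metonymy: keep the object and drop the verb.
    if c.goal_metonymy:
        return VERB_ELLIPSIS
    # 2) Default: demote the purpose to a nominalized prepositional phrase.
    return NOMINALIZATION
```

With all defaults, purpose_form(PurposeContext()) yields the nominalized prepositional phrase, which matches the stated preference for demoting purposes whenever possible.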
Vander Linden identifies a few problems with their analysis. The first is the use of forms such as ``in order to'' instead of a simple ``to''. They were not able to determine the situations in which one would be used over the other. In DDTG, formality and simplicity were used as the deciding criteria, with ``in order to'' being deemed more formal and less simple. Second, and perhaps more interesting, is their identification of the effects of symmetry on realization. For instance, the phrase:
``to speak to a caller or to place a call (do this)''
was judged better than their system's output:
``to speak to a caller or for call placement (do this)''
Since the first disjunction, ``to speak to a caller'', was constrained to a to-plus-infinitive realization, the preferred rendering of the second disjunction is the one symmetrical to the first. This kind of inter-dependency is exactly what the DDTG was created for. The inability of Penman to handle such dependencies is its main drawback.
RESULT
The analysis of Results was not as helpful, because the influence of the instructional text domain was especially heavy. In particular, the Results being described here are meant to be obtained by the user. Thus, specifying which results are mandatory and which are obtained by someone other than the reader was of paramount importance. In the sample texts used in the DDTG, the results were all obtained by someone other than the reader.
Vander Linden found that the Result was always placed after the action to which it pertains. Although this is probably true in most cases, the simple example given by Larson, ``John washed the car because it was dirty'', shows that it is not always the case. More interesting is the analysis of possible forms the result clause can take:
Vander Linden also mentioned the problem of choosing appropriate sentence boundaries, giving their simple heuristic as:
1.1.3 [Delin et al., 1994] and [Delin et al., 1993]
Although Delin did not have any specific contributions to the rules used in DDTG, they did make an important point concerning the nature of rhetorical relations which, I believe, will have a large impact on future research. Their finding was that different rhetorical structures can encode the same information. That is, someone with a specific communicative goal can often realize that goal using a variety of rhetorical structures. This can occur within a single language, such as the following two instructions found in the same manual referring to the same actions:
Number 1 is simply represented as a sequence of actions, whereas in number 2 a purpose relation is used. The effect is much more pronounced across languages:
In this example, the English rendition can best be described as a Means relation, the French as a Purpose, and the German as a Circumstance. Clearly, different rhetorical relations have been used to communicate the same information.
Delin proposes ``procedural relations'', which are higher level formulations of knowledge than rhetorical relations. In particular, they discuss the procedural relations of Generation and Enablement in [Delin et al., 1994]. Since these are abstractions of the basic units used in DDTG, they will not be discussed further.
1.1.4 [Scott and Souza, 1990]
Scott and Souza present a number of hypotheses and heuristics that are interesting:
In general, Scott and Souza's heuristics would make for a very conservative text. They attempt to take all of the uncertainty out of a text. Unfortunately, this would, in most instances, lead to a very boring style. Writers, and especially speakers, take much for granted. They leave a great deal of information to be inferred and seldom spell things out completely. In conversations, the speaker uses a minimum of specificity, and automatically detects and corrects misconceptions that may arise as a result. Since such correction is currently beyond the state of the art in text processing, a safer approach such as the one suggested here may have some merits.
One other point that this paper brings up is very important. They point out that a number of textual markers are so ambiguous as to almost be meaningless. ``And'', for instance, can be used for so many functions that, by itself, it hardly signals anything at all, except that something else is coming. It is important, therefore, to not only identify which textual markers can be used to realize a given relation, but to understand how each textual marker can be used to signal different relations, and how to avoid introducing unnecessary ambiguity.
Finally, Scott and Souza discuss the sentence boundary problem. ``The question of how to distribute the propositions of a message as sentences in the text is one which Hovy poses as one of the unresolved issues in paragraph planning.'' They conclude that a combination of factors influences sentence boundaries. These factors include the number of words, number of relations, number of propositions, syntactic constraints and textual balance.
1.1.5 [Paris and Scott, 1994]
Paris and Scott, although not directly applicable to the DDTG, seek decision procedures for their high level intentions in the same manner as this paper. They concentrate on the intentions of Ordering and Advising and come up with the following rules:
Ordering
The writer employs strongly directive speech acts such as ``order'' or ``prohibit''. Explanation rarely given.
Advising
The writer wants the reader to desire to perform or avoid some action. Writer will usually express the reason for performing an action. This gives rise to the relations ``background'', ``motivation'', ``result'', ``elaboration'' and ``condition''. Uses weakly directive speech acts such as ``recommend'' or ``instruct''.
Paris and Scott also point out that a text most often has a global structure. In instructional texts, a frequent global structure is as follows:
1.1.6 [Leech and Svartvik, 1975] and [Dixon, 1991]
These two books represent a novel approach to describing a language; specifically, they seek to explain how to communicate different semantic information using the lexicogrammatical resources of the language. Thus, these books are extremely relevant to the current topic. Unfortunately, because the books are aimed at a human audience (i.e. second language learners), the topic organization and presentation are very different from those seen in the papers reviewed above. For instance, information about rhetorical relations is sprinkled throughout and is not given much emphasis. On the other hand, given time to wade through these books, a great deal of information could be gained that would be directly applicable to text generation. Most of the information on the causal relations under consideration here was also found in the other papers, and thus will not be expounded upon further.
Steve Beale