The Underlying Theory: Rhetorical Structure Theory

[Taken from Ali Knott's PhD, chapter 2.]

Rhetorical structure theory (RST), developed mainly by William Mann and Sandra Thompson, is presented in a number of papers. This description draws for the most part on the account in Mann and Thompson [], which is the most comprehensive.

The central constructs in RST are rhetorical relations. Text coherence is attributed principally to the presence of these relations; unlike Grosz and Sidner, Mann and Thompson do not envisage an important role for other constructs such as focus. The claim is that the relations in RST suffice to analyse `the vast majority' of English texts; exceptions are only made for very unusual texts like poems and legal documents.

Rhetorical relations are defined functionally, in terms of the effect the writer intends to achieve by presenting two text spans side by side. In this respect, they resemble Grosz and Sidner's relations. However, there are also several differences between the two types of relation.

Firstly, RST relations do in fact make some reference to the propositional content of spans, as well as to the intentions of the writer in putting them forward. For instance, the MOTIVATION relation specifies that one of the spans `presents' an action to be performed by the reader; the SEQUENCE relation specifies that a succession relationship must exist between the related spans. RST relations are in fact defined using five fields--only one of these explicitly represents the effect of the relation; the others represent the various different constraints that must be satisfied in order to achieve this effect, and these are specified using a mixture of propositional and intentional language.
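The five-field shape of a relation definition can be pictured as a small record. The sketch below is illustrative only: the text above does not enumerate the fields, so the field names used here (constraints on the nucleus, on the satellite, on their combination, the effect, and the locus of the effect) are an assumption based on the usual presentation of RST definitions, and the field contents are paraphrased from the MOTIVATION and SEQUENCE descriptions in this section. Fields the section says nothing about are left empty rather than guessed.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass(frozen=True)
class RelationDefinition:
    """A relation name plus five definition fields (field names are assumed)."""
    name: str
    constraints_on_nucleus: Optional[str] = None
    constraints_on_satellite: Optional[str] = None
    constraints_on_combination: Optional[str] = None
    effect: Optional[str] = None            # the one field that states the effect
    locus_of_effect: Optional[str] = None


def filled_fields(defn: RelationDefinition) -> List[str]:
    """Names of the definition fields that have been given a value."""
    return [
        f.name
        for f in fields(defn)
        if f.name != "name" and getattr(defn, f.name) is not None
    ]


# Contents paraphrased from this section; everything else is left unspecified.
MOTIVATION = RelationDefinition(
    name="MOTIVATION",
    constraints_on_nucleus="presents an action to be performed by the reader",
    effect="the reader's motivation to perform the action increases",
)

SEQUENCE = RelationDefinition(
    name="SEQUENCE",
    constraints_on_combination="a succession relationship holds between the spans",
    effect="the reader recognises that the spans present events in sequence",
)

print(filled_fields(MOTIVATION))
print(filled_fields(SEQUENCE))
```

The point of the record is only that the constraint fields mix propositional conditions (what a span says) with an intentional one (the effect on the reader).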

Secondly, Mann and Thompson go out of their way to rule out a connection between the set of relations and the linguistic devices used to signal them. This goes beyond the claim that the relations in a text need not be signalled--they further suggest that

some types of rhetorical relations have no corresponding conjunctive signals.
Mann and Thompson [], p45 (my italics)

In this, their theory differs from Grosz and Sidner's (and many others besides), in which at least an informal link is made between underlying relations and the linguistic devices for marking them.

A third novel feature of RST is its concept of nuclearity. As well as representing the relationship between two text spans, rhetorical relations also convey information about which span is more central to the writer's purposes. The nucleus is the more central span, and the satellite is the less central one. Mann and Thompson contend that the majority of text is structured using nucleus-satellite relations, although some relations, termed multinuclear, do not exhibit nuclearity. (There are two multinuclear relations: SEQUENCE and CONTRAST.)

The nucleus-satellite distinction is in some ways comparable to the PARATACTIC-HYPOTACTIC distinction of Grimes and others. But while these are expressed in semantic or even syntactic terms, RST's definition is functional, based on the idea that a writer has more important and less important goals when she sets out to create a text. Nucleus-satellite relations are in fact more reminiscent of Grosz and Sidner's class of DOMINANCE relations. But even here there is a difference: in Grosz and Sidner's model it is hard to talk about the purpose of the dominant span being `more central' to the writer than that of the subordinate span, because the former purpose actually includes the latter.

RST provides a set of around 23 rhetorical relations. The number varies slightly from paper to paper, but the central core of relations as presented in Mann and Thompson [] is given in Figure 5.

Figure 5: Mann and Thompson's Relations

The top-level distinction in this taxonomy is between SUBJECT-MATTER and PRESENTATIONAL relations. The effect of a SUBJECT-MATTER relation is that the reader recognises the relation in question, while the effect of a PRESENTATIONAL relation is to increase some inclination in the reader. Thus SEQUENCE is a SUBJECT-MATTER relation (its effect is that the reader recognise that the two related spans present events occurring in sequence) and MOTIVATION is PRESENTATIONAL (its effect is to increase the reader's motivation to perform the action presented in the nucleus span). To some extent, this distinction mirrors Halliday and Hasan's distinction between INTERNAL and EXTERNAL relations. But again, the similarity is far from complete.

Like the other computational theories of relations, RST has a strong structural account of text. It begins with an independent definition of `text span'--for Mann and Thompson, the size of the atomic units of text analysis is arbitrary, but they should have independent functional integrity. The clause is selected as the minimal unit of organisation; thus text spans are clauses, or larger units composed of clauses. Unlike in Grosz and Sidner's theory, relations in RST must hold between non-overlapping text spans. (An exception to this rule is made for non-restrictive relative clauses: relations are permitted to hold between a matrix clause and a subordinate clause.)

In RST, relations are not mapped directly onto texts; they are fitted onto structures called schema applications, and these in turn are fitted to text. Schema applications are derived from simpler structures called schemas (see Figure 6).

Figure 6: The Types of Schema in RST

In this diagram, horizontal lines depict text spans, the labelled lines depict relations between spans, nuclei are picked out by the vertical lines (they are diagonal for multinuclear relations), and all other spans are satellites. From these structures, schema applications are formed, by rearranging the spans in any order and by duplicating spans any number of times. (For the schemas with satellites, only the satellite spans can be duplicated.) Relations are then fitted to the schema applications thus formed--relations which take a nucleus and a satellite are fitted to the single or dual relation schema applications, and the specialised CONTRAST and SEQUENCE relations are fitted to the individual schemas (b) and (e) respectively. The `joint' schema is for linking pieces of text which are not linked by any RST relations, and is essentially used for representing lists.
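The way schema applications are derived from nucleus-satellite schemas can be sketched as a validity check. This is a minimal illustration under stated assumptions, not Mann and Thompson's formalisation: the names `Schema`, `Slot` and `is_application_of` are hypothetical, the multinuclear and `joint` schemas are left out, and the check assumes that an application needs at least one satellite. It encodes only what the paragraph above says: spans may appear in any order, and only satellites may be duplicated.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Sequence


@dataclass(frozen=True)
class Schema:
    """A nucleus-satellite schema: one relation (single) or two (dual)."""
    relations: FrozenSet[str]


@dataclass(frozen=True)
class Slot:
    """One span position in a candidate schema application."""
    role: str                       # "nucleus" or "satellite"
    relation: Optional[str] = None  # relation tying a satellite to the nucleus


def is_application_of(schema: Schema, slots: Sequence[Slot]) -> bool:
    """True if `slots` is the schema with spans reordered and satellites duplicated."""
    nuclei = [s for s in slots if s.role == "nucleus"]
    satellites = [s for s in slots if s.role == "satellite"]
    if len(nuclei) + len(satellites) != len(slots):
        return False                # some slot has an unknown role
    if len(nuclei) != 1 or not satellites:
        return False                # the nucleus may not be duplicated or dropped
    # Order is deliberately not checked: spans may be rearranged freely.
    return all(s.relation in schema.relations for s in satellites)


single = Schema(relations=frozenset({"MOTIVATION"}))
nucleus = Slot("nucleus")
satellite = Slot("satellite", "MOTIVATION")

print(is_application_of(single, [satellite, nucleus]))             # reordered
print(is_application_of(single, [nucleus, satellite, satellite]))  # duplicated satellite
print(is_application_of(single, [nucleus, nucleus, satellite]))    # duplicated nucleus
```

A dual relation schema would be represented the same way, with two relation names in `relations`.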

A rhetorical structure tree is a hierarchical system of schema applications. A schema application links a number of consecutive spans, and creates a complex span which can in turn be linked by a higher level schema application. This enables tree structures to be built--it is a central claim of RST that the structure of every coherent discourse can be described by a single rhetorical structure tree, whose top schema application creates a span encompassing the whole discourse.
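The structural claim can be made concrete with a small tree type. The following is a sketch under simplifying assumptions: clauses are numbered from 0, nuclearity is ignored, and the names `Clause`, `SchemaApplication`, `extent` and `covers_discourse` are introduced here purely for illustration. It checks two properties stated above: each schema application links consecutive, non-overlapping spans, and the top application spans the whole discourse.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Clause:
    """A minimal text span, identified by its position in the text."""
    index: int


@dataclass(frozen=True)
class SchemaApplication:
    """Links several spans into one complex span (nuclearity omitted here)."""
    relation: str
    children: Tuple["Node", ...]


Node = Union[Clause, SchemaApplication]


def extent(node: Node) -> Tuple[int, int]:
    """Return (first, last) clause index covered by `node`.

    Raises ValueError if the spans under some schema application are not
    consecutive and non-overlapping.
    """
    if isinstance(node, Clause):
        return (node.index, node.index)
    if not node.children:
        raise ValueError("a schema application must link at least one span")
    # Children may be listed in any order, so sort by position first.
    extents = sorted(extent(child) for child in node.children)
    for (_, prev_last), (next_first, _) in zip(extents, extents[1:]):
        if next_first != prev_last + 1:
            raise ValueError("spans overlap or leave a gap")
    return (extents[0][0], extents[-1][1])


def covers_discourse(root: Node, n_clauses: int) -> bool:
    """True if `root` is a well-formed tree spanning clauses 0..n_clauses-1."""
    try:
        return extent(root) == (0, n_clauses - 1)
    except ValueError:
        return False


# Three clauses: the first two form a complex span, which is then
# linked to the third by a higher-level schema application.
inner = SchemaApplication("MOTIVATION", (Clause(0), Clause(1)))
root = SchemaApplication("SEQUENCE", (inner, Clause(2)))
print(covers_discourse(root, 3))
```

A tree that skips a clause, or that uses the same clause under two sibling spans, is rejected by the same check.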

RST has proved a very influential theory amongst computational linguists, as the next section will attest. Its popularity is perhaps best attributed to a combination of features: the emphasis on a functional conception of relations; the carefully presented set of relation definitions; the simply stated structural theory. It is doubtful whether anyone believes the claims made in RST--but at least it is clearly enough expressed for people to be able to frame their objections to it.
