2004-11-17 23:53:41 Notice hamburger
Title: Corpus Linguistics and Its Applications
Speaker: Professor Wolfgang Teubert
About Professor Wolfgang Teubert:
Wolfgang Teubert, PhD (Heidelberg 1979)
Professor of Corpus Linguistics, University of Birmingham
Wolfgang Teubert was, until 2000, a senior research fellow at the Institut für Deutsche Sprache (IDS), Mannheim, Germany. As Head of the Multilingual Research Unit, he was in charge, as the German partner, of important European projects including NERC, PAROLE, SIMPLE, and ELAN. He was the co-ordinator of the Concerted Action TELRI (1995 to 2002), an infrastructure project involving 40 corpus research centres all over Europe. In 2000, he was appointed to the Chair of Corpus Linguistics, Department of English at the University of Birmingham. The focus of his research is the extraction of linguistic knowledge from real language data, particularly in multilingual environments, with an emphasis on semantics. His other interest is the application of the methodology of corpus linguistics to critical discourse analysis. He is also the editor of the International Journal of Corpus Linguistics.
Corpus Linguistics and Its Applications
Corpus linguistics adds a new paradigm to linguistics in general. The categories it uses are based on real language data and not derived from introspection or tradition. This is what is now called the corpus-driven approach, as opposed to the corpus-based approach where corpus evidence is only used as examples to support claims made, e.g. in the cognitive paradigm. Corpus linguistics looks at language from a social perspective, as a social system enabling the distribution of content in a group.
Corpus linguistics is, thus, especially concerned with content, i.e. with meaning. It breaks with the Western linguistic tradition of taking the single word as the core unit of meaning, replacing it with the concept of the lexical item, i.e. a monosemous unit of meaning consisting of one core element and as many collocates as needed to make this unit unambiguous. Thus corpus linguistics offers a solution to the problem of ambiguity. Meaning, for corpus linguistics, is inside the discourse. It is created by the negotiations of the discourse community, and it is identical with the reality created by the discourse. Natural language is auto-referential. It does not refer to some discourse-external reality. Meaning is evidenced, within the discourse, as usage and paraphrase. Each occurrence of a text segment can be viewed either as a token of a lexical item type (the synchronic perspective), or as a unique occurrence whose meaning can only be established by relating it to all the other previous and contemporaneous occurrences of similar text segments (the diachronic perspective). The diachronic perspective allows us to analyse and understand language change as the ongoing negotiation of the discourse reality, carried out by the members of the discourse community.
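As a rough illustration of how a core element and its collocates might be observed in practice, the following Python sketch counts the words that occur within a small window around a chosen node word. The function name `collocates`, the window size, and the toy sentences are assumptions made for this example only; they are not taken from the talk, and real corpus work would use far larger data and a statistical association measure rather than raw counts.

```python
from collections import Counter


def collocates(tokens, node, window=2):
    """Count words found within `window` tokens of each occurrence of `node`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        left = tokens[max(0, i - window):i]      # words before the node
        right = tokens[i + 1:i + 1 + window]     # words after the node
        counts.update(left + right)
    return counts


# Toy corpus, invented for illustration (not real corpus data).
text = (
    "the troops came under friendly fire . "
    "the fire brigade put out the fire . "
    "two soldiers died from friendly fire ."
)
tokens = text.split()

# Immediate neighbours of the node word "fire".
result = collocates(tokens, "fire", window=1)
print(result.most_common())
```

In this toy data, the neighbour "friendly" separates one use of "fire" from the use that co-occurs with "brigade", which is the kind of evidence the notion of a monosemous lexical item relies on.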
Corpus linguistics can thus contribute to a wide range of applications. In human language technology, it provides a breakthrough by replacing the traditional polysemous word as the core unit with the normally rather complex, but monosemous, lexical item. Where in traditional AI and MT applications the approach using conceptual ontologies has largely failed, corpus linguistics now enables working with real units of meaning. Corpus linguistics also breaks new ground in lexicography. While traditional dictionaries listed only those idioms which are a recognised part of the heritage of a language, it is now possible to identify all other units of meaning, including fixed expressions such as ‘friendly fire’. Furthermore, the definitions provided by lexicographers can now be replaced by the paraphrases provided by the members of the discourse community themselves, thus aligning the dictionary more closely with actual language use. In bilingual lexicography, dictionaries working with the concept of (monosemous) translation units and their unique target language equivalents will finally enable their users to translate confidently into their non-native language as well. The methods of corpus linguistics also help to identify and extract emergent terminology from scientific documents long before these new terms are standardised. Thus they help the distribution of new knowledge in scientific communities. Corpus linguistics is also contributing to language teaching and learning. A comparison of learners’ corpora with reference corpora will identify, more clearly than any teacher could, the problem areas arising from the contrast between the source language and the target language in question. Due to the incorporation of a diachronic perspective, the field of language and society is another new and increasingly important application of corpus linguistics. Discourse reality encompasses the beliefs and attitudes held in the discourse community, and these tend to change over time.
Corpus linguistics makes it possible to observe the emergence of new ideas and the disappearance of old ones, as well as their gradual change. Corpus-driven Critical Discourse Analysis is thus becoming an indispensable tool for the cultural, social and political sciences.
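One simple way to make such diachronic observation concrete is to compare how often a term occurs in sub-corpora from different periods. The Python sketch below is a minimal illustration under stated assumptions: the helper `relative_frequency`, the period labels, and the tiny word lists are all hypothetical, and raw relative frequency stands in for the more careful measures an actual study would need.

```python
from collections import Counter


def relative_frequency(tokens, term):
    """Return occurrences of `term` per 1,000 tokens (0.0 for an empty corpus)."""
    if not tokens:
        return 0.0
    return 1000 * Counter(tokens)[term] / len(tokens)


# Hypothetical sub-corpora keyed by period; the texts are invented for illustration.
periods = {
    "1990s": "the market and the state and the market".split(),
    "2000s": "the network and the market and the network and the network".split(),
}

# Relative frequency of one term in each period.
trend = {
    period: relative_frequency(toks, "network")
    for period, toks in periods.items()
}
print(trend)
```

A term that is absent from the earlier sub-corpus and frequent in the later one would be a candidate for an emerging idea, to be checked against the actual contexts in which it occurs.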