Automatic acquisition of knowledge about discourse connectives

Hutchinson, Ben


dc.contributor.advisor

Lascarides, Alex

en

dc.contributor.advisor

Lapata, Mirella

en

dc.contributor.author

Hutchinson, Ben

en

dc.date.accessioned

2005-11-17T12:27:25Z

dc.date.available

2005-11-17T12:27:25Z

dc.date.issued

2005-12

dc.description

Institute for Communicating and Collaborative Systems

en

dc.description.abstract

This thesis considers the automatic acquisition of knowledge about discourse connectives. It focuses in particular on their semantic properties, and on the relationships that hold between them. There is a considerable body of theoretical and empirical work on discourse connectives. For example, Knott (1996) motivates a taxonomy of discourse connectives based on relationships between them, such as HYPONYMY and EXCLUSIVE, which are defined in terms of substitution tests. Such work requires either great theoretical insight or manual analysis of large quantities of data. As a result, to date no manual classification of English discourse connectives has achieved complete coverage. For example, Knott gives relationships between only about 18% of pairs obtained from a list of 350 discourse connectives. This thesis explores the possibility of classifying discourse connectives automatically, based on their distributions in texts. It demonstrates that state-of-the-art techniques in lexical acquisition can successfully be applied to acquiring information about discourse connectives. Central to this thesis is the hypothesis that distributional similarity correlates positively with semantic similarity. Support for this hypothesis has previously been found for word classes such as nouns and verbs (e.g. Miller and Charles, 1991; Resnik and Diab, 2000), but there has been little exploration of the degree to which it also holds for discourse connectives. We investigate the hypothesis through a number of machine learning experiments. These experiments all use unsupervised learning techniques, in the sense that they do not require any manually annotated data, although they do make use of an automatic parser.
First, we show that a range of semantic properties of discourse connectives, such as polarity and veridicality (whether or not the semantics of a connective involves some underlying negation, and whether the connective implies the truth of its arguments, respectively), can be acquired automatically with a high degree of accuracy. Second, we consider the tasks of predicting the similarity and substitutability of pairs of discourse connectives. To assist in this, we introduce a novel information-theoretic function based on variance that, in combination with distributional similarity, is useful for learning such relationships. Third, we attempt to automatically construct taxonomies of discourse connectives capturing substitutability relationships. We introduce a probability model of taxonomies, and show that this can improve accuracy on learning substitutability relationships. Finally, we develop an algorithm for automatically constructing or extending such taxonomies that uses beam search to help find the optimal taxonomy.
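
The notion of distributional similarity that the abstract relies on can be illustrated with a small sketch. This is not the thesis's method: the feature choice (bag of words from the two linked clauses), the cosine measure, the function names and the toy data are all assumptions made here purely for illustration.

```python
from collections import Counter
from math import sqrt


def cooccurrence_vector(connective, clause_pairs):
    """Count the words in clauses linked by `connective`.

    `clause_pairs` is a list of (connective, left_clause, right_clause)
    tuples; this input format is a hypothetical one chosen for the sketch.
    """
    counts = Counter()
    for conn, left, right in clause_pairs:
        if conn == connective:
            counts.update(left.lower().split())
            counts.update(right.lower().split())
    return counts


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(u[w] * v[w] for w in u.keys() & v.keys())
    norm_u = sqrt(sum(c * c for c in u.values()))
    norm_v = sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_u * norm_v)


# Invented toy data, not taken from the thesis or any corpus.
toy_pairs = [
    ("but", "it rained", "we went out"),
    ("although", "it rained", "we went out"),
    ("because", "the road was wet", "it rained"),
]

but_vec = cooccurrence_vector("but", toy_pairs)
although_vec = cooccurrence_vector("although", toy_pairs)
because_vec = cooccurrence_vector("because", toy_pairs)

sim_but_although = cosine(but_vec, although_vec)
sim_but_because = cosine(but_vec, because_vec)
```

On this toy data, "but" and "although" occur with the same clauses and so score higher than "but" and "because". Under the hypothesis stated above, such scores would be expected to track semantic similarity, although a realistic experiment would need parsed corpus data and richer features.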

en

dc.format.extent

1197743 bytes

en

dc.format.mimetype

application/pdf

en

dc.identifier.uri

http://hdl.handle.net/1842/852

dc.language.iso

en

dc.publisher

University of Edinburgh. College of Science and Engineering. School of Informatics.

en

dc.subject.other

discourse connectives

en

dc.subject.other

lexical acquisition

en

dc.subject.other

machine learning

en

dc.title

Automatic acquisition of knowledge about discourse connectives

en

dc.title.alternative

The automatic acquisition of knowledge about discourse connectives

dc.type

Thesis or Dissertation

en

dc.type.qualificationlevel

Doctoral

en

dc.type.qualificationname

PhD Doctor of Philosophy

en

Files

Original bundle

Name: Hutchinson_thesis.pdf
Size: 1.14 MB
Format: Adobe Portable Document Format

This item appears in the following Collection(s)

Informatics thesis and dissertation collection