Global Inference for Sentence Compression: An Integer Linear Programming Approach
Author: Clarke, James
Date: 2008
Abstract
In this thesis we develop models for sentence compression. This text rewriting task has recently attracted considerable attention due to its relevance for applications (e.g., summarisation) and its simple formulation by means of word deletion. Previous models for sentence compression have been inherently local and thus fail to capture the long-range dependencies and complex interactions involved in text rewriting. We present a solution by framing the task as an optimisation problem with local and global constraints and by recasting existing compression models into this framework. Using the constraints we instil syntactic, semantic and discourse knowledge that the models otherwise fail to capture. We show that the addition of constraints allows relatively simple local models to reach state-of-the-art performance for sentence compression.
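As an illustration of this framing, a minimal word-deletion ILP can be sketched as follows; the notation is illustrative rather than the thesis's exact formulation. Given a sentence $x_1 \dots x_n$, let $\delta_i \in \{0,1\}$ indicate whether word $x_i$ is retained, and let $f(x_i)$ be a score supplied by the underlying compression model. The compression is the solution of

\[
\max_{\delta} \sum_{i=1}^{n} \delta_i \, f(x_i)
\quad \text{subject to} \quad
l_{\min} \le \sum_{i=1}^{n} \delta_i \le l_{\max},
\qquad \delta_i \in \{0,1\},
\]

optionally augmented with further linear constraints such as $\delta_i \le \delta_{\mathrm{head}(i)}$, which permits keeping a modifier only if its syntactic head is kept. Because everything remains linear in the $\delta_i$, such constraints can be added freely without changing the character of the problem.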
The thesis provides a detailed study of sentence compression and its models. The
differences between automatic and manually created compression corpora are assessed
along with how compression varies across written and spoken text. We also discuss
various techniques for automatically and manually evaluating compression output
against a gold standard. Models are reviewed based on their assumptions, training requirements,
and scalability.
We introduce a general method for extending previous approaches to more global models. This is achieved through the optimisation framework of Integer Linear Programming (ILP). We reformulate three compression models (an unsupervised, a semi-supervised, and a fully supervised model) as ILP problems and augment them with constraints. These constraints are intuitive for the compression task and are both syntactically and semantically motivated. We demonstrate how they improve compression quality and reduce the amount of training material required.
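To make the ILP decoding concrete, below is a minimal sketch in Python using the PuLP library. The word scores, the dependency pairs and the length bounds are invented for illustration; they stand in for the scores and constraints that the actual models supply.

# Minimal ILP sentence-compression sketch (illustrative; not the thesis's models).
# Assumptions: word_scores come from some significance model, deps maps a
# modifier index to its syntactic head index, and the length bounds are arbitrary.
from pulp import LpProblem, LpMaximize, LpVariable, lpSum, value

words = ["the", "committee", "finally", "approved", "the", "new", "budget"]
word_scores = [0.1, 0.9, 0.3, 1.0, 0.1, 0.4, 0.8]  # hypothetical significance scores
deps = {0: 1, 2: 3, 4: 6, 5: 6}                    # modifier -> head (hypothetical parse)
min_len, max_len = 3, 5                            # global length constraints

prob = LpProblem("sentence_compression", LpMaximize)

# One binary decision variable per word: 1 = retain, 0 = delete.
d = [LpVariable(f"d_{i}", cat="Binary") for i in range(len(words))]

# Objective: total score of the retained words.
prob += lpSum(word_scores[i] * d[i] for i in range(len(words)))

# Global constraints on the length of the compression.
prob += lpSum(d) >= min_len
prob += lpSum(d) <= max_len

# Syntactic constraint: a modifier may only be kept if its head is kept.
for mod, head in deps.items():
    prob += d[mod] <= d[head]

prob.solve()
print(" ".join(w for w, var in zip(words, d) if value(var) > 0.5))

In the thesis's actual formulations the objective is inherited from the respective compression model (language-model scores, for instance, require auxiliary variables over word contexts), but the decoding machinery is the same: a standard ILP solver returns the highest-scoring compression satisfying every constraint.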
Finally, we delve into document compression, where the task is to compress every sentence of a document and use the resulting summary as a replacement for the original document. For document-based compression we investigate discourse information and its application to the task. Two discourse theories, Centering and lexical chains, are used to automatically annotate documents. These annotations are then used in our compression framework to impose additional constraints on the resulting document. The goal is to preserve the discourse structure of the original document and most of its content. We show how a discourse-informed compression model can outperform a discourse-agnostic state-of-the-art model under a question-answering evaluation paradigm.
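One simple way such discourse annotations can enter the framework is as additional hard constraints. As a purely illustrative extension of the earlier PuLP sketch, suppose the annotation marks certain word indices (say, a sentence's centre under Centering, or members of a strong lexical chain) as obligatory; the index set below is invented:

# Hypothetical discourse constraint: words marked by Centering or lexical-chain
# annotation must survive compression (indices are invented for illustration).
must_keep = {1, 3}
for i in must_keep:
    prob += d[i] == 1

Constraints of this form leave the optimisation linear, so discourse knowledge can be layered onto any of the sentence-level models without altering the solver.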