Neural document modeling and summarization
dc.contributor.advisor
Lapata, Maria
en
dc.contributor.advisor
Titov, Ivan
en
dc.contributor.author
Liu, Yang
en
dc.contributor.sponsor
other
en
dc.date.accessioned
2020-05-15T15:23:42Z
dc.date.available
2020-05-15T15:23:42Z
dc.date.issued
2020-06-25
dc.description.abstract
Document summarization is the task of automatically generating a shorter version of one or more documents while retaining the most important information. The task has received much attention in the natural language processing community due to its potential for various information access applications. Examples include tools that digest textual content (e.g., news, social media, reviews), answer questions, or provide recommendations. Summarization approaches can be categorized by whether they process a single document or multiple documents, and by whether they produce extractive or abstractive summaries. In extractive summarization, summaries are formed by copying and concatenating the most important spans (usually sentences) from the input text, while abstractive approaches can generate summaries using words or phrases that do not appear in the original text.
A core module within summarization is how to represent documents and distill information for downstream tasks (e.g., abstraction or extraction). Thanks to the popularity of neural network models and their ability to learn continuous representations, many new systems have been proposed for document modeling and summarization in recent years. This thesis investigates different approaches with neural network models to address the document summarization problem. We develop several novel neural models considering extractive and abstractive approaches for both single-document and multi-document scenarios.
We first investigate how to represent a single document with a randomly initialized neural network. In contrast to previous approaches that ignore document structure when encoding the input, we propose a structured attention mechanism that imposes a structural bias in the form of document-level dependency trees, yielding more powerful document representations. We first apply this model to the task of document classification, and subsequently to extractive single-document summarization, using an iterative refinement process to learn more complex tree structures. Experimental results on both tasks show that the structured attention mechanism achieves competitive performance.
Very recently, pretrained language models, built by training large neural models on enormous corpora with a language modeling objective, have achieved great success on several natural language understanding tasks. These models learn rich contextual information and to some extent are able to capture the structure of the input text. While summarization systems could in theory also benefit from pretrained language models, there are some potential obstacles to applying them to document summarization tasks. The second part of this thesis focuses on how to represent a single document with pretrained language models. Going beyond previous approaches that learn solely from the summarization dataset, this thesis proposes a framework for using pretrained language models as encoders for both extractive and abstractive summarization. The framework achieves state-of-the-art results on three datasets.
Finally, in the third part of this thesis, we move beyond single documents and explore approaches for using neural networks to summarize multiple documents. We analyze why applying existing neural summarization models to this task is challenging and develop a novel modeling framework. More concretely, we propose a ranking-based pipeline and a hierarchical neural encoder for processing multiple input documents. Experiments on a large-scale multi-document summarization dataset show that our system achieves promising performance.
en
dc.identifier.uri
https://hdl.handle.net/1842/37048
dc.identifier.uri
http://dx.doi.org/10.7488/era/349
dc.language.iso
en
dc.publisher
The University of Edinburgh
en
dc.relation.hasversion
Liu, Y. and Lapata, M. (2017). Learning contextually informed representations for linear-time discourse parsing. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 1289–1298, Copenhagen, Denmark.
en
dc.relation.hasversion
Liu, Y. and Lapata, M. (2018). Learning structured text representations. Transactions of the Association for Computational Linguistics, 6:63–75.
en
dc.relation.hasversion
Liu, Y., Titov, I., and Lapata, M. (2019). Single document summarization as tree induction. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 1745–1755, Minneapolis, Minnesota.
en
dc.subject
text summarization
en
dc.subject
neural document modeling
en
dc.title
Neural document modeling and summarization
en
dc.type
Thesis or Dissertation
en
dc.type.qualificationlevel
Doctoral
en
dc.type.qualificationname
PhD Doctor of Philosophy
en
Files
Original bundle (1 - 1 of 1)
- Name: Liu2020.pdf
- Size: 1.61 MB
- Format: Adobe Portable Document Format