Edinburgh Research Archive

Explicit discourse modelling for coreference and summarization

Abstract

Understanding and responding to natural language requires a level of representation for the input text. When reading about a character in a novel, we may remember them by attributes such as their name, events they are involved in, or their relationships with others. Many modern approaches choose a straightforward strategy: they simply store the entire input document in their context window. While this approach has merit, it becomes apparent with longer documents that storing the entire input text in context may be computationally difficult and wasteful. In cases where it is feasible, it may still impede performance, as the document’s length may hinder the model’s ability to focus on relevant aspects of the input. This thesis investigates whether more careful text representations are suitable for two discourse-level tasks: coreference resolution and text summarization. We are interested in maintaining an explicit representation of the discourse, which compresses the input text into a more efficient representation. Our interest in efficient representations also leads us to propose incremental models. This process mimics human language processing, where text is consumed incrementally instead of simultaneously. Incremental models are also crucial for downstream applications that require incrementality, such as dialogue interaction. This thesis argues that explicit discourse representations can lead to more efficient processing, better performance, or both. First, we propose an incremental, memory-based mechanism for the coreference resolution task. The system processes text sentence-by-sentence, storing encountered mentions as partial coreference clusters in a memory matrix. We show that our proposed system surpasses contemporary baselines when they are constrained to the same incremental setting. Second, we consider a generative, seq2seq paradigm for coreference resolution. Instead of holding the entire document in context, we propose a compressed, model-based discourse representation. Our proposed method truncates the context to its mentions and organizes them into entity representations. We show that this representation maintains similar performance to a naively incremental system, while discarding a majority of the document’s context. In the case where singleton mentions are included in the data, our compressed representation surpasses state-of-the-art performance in a more efficient manner. Our last task considers discourse modelling in a narrative summarization task. Here, we investigate a plan-based approach, where the generated summary is grounded in a high-level plan of summary content. We find that although summaries are well grounded in their plans, they are no more faithful to the source document than non-planning baselines. Human evaluation shows that generated plans contain as much hallucinated content as the summaries, leading to summaries that are grounded but unfaithful. When we replace these plans with powerful, LLM-generated ones, summary quality improves dramatically. The result emphasizes the importance of high-quality plans in planning-based approaches to summarization.
