Cognitive structures of content for controlled summarization
Authors
Cardenas Acosta, Ronald
Abstract
In the current information age, where over a petabyte of data is created on the web every day,
demand continues to rise for effective technological tools that help end-users consume information in a timely way.
Automatic summarization is the task of consuming a text document (or a collection of documents) and presenting the user with a shorter text, the summary, that retains the gist of the information consumed.
In general, a good summary should be informative (present relevant content), non-redundant (avoid repetition), coherent (organize content in a sensible way), and cohesive (read as a unified thematic whole).
The particular information needs of each user have prompted many variations of the summarization task. Among them, extractive summarization consists of extracting spans of text (usually sentences) from the input document(s), concatenating them, and presenting them as the final summary. Traditionally, extractive systems focus on presenting highly informative content, regardless of whether content bits are repeated or presented in an incoherent, non-cohesive manner. How to balance these properties remains an understudied problem, even though understanding the trade-offs between them could enable a system to produce text with relevant content that is also more readable to humans.
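To make the extractive setup concrete, the following is a minimal, generic sketch of extraction with a redundancy penalty (in the spirit of maximal-marginal-relevance selection). It is purely illustrative and is not the system developed in this thesis; the scoring functions, the `redundancy_weight` parameter, and all names are invented for the example.

```python
# Illustrative only: greedy extractive summarization that trades off
# informativeness (word overlap with the whole document) against
# redundancy (word overlap with already-selected sentences).
from collections import Counter

def extract_summary(sentences, k=2, redundancy_weight=1.0):
    # Relevance proxy: sentences whose words are frequent in the document
    # are assumed to carry more of its gist.
    doc_counts = Counter(w for s in sentences for w in s.lower().split())

    def relevance(s):
        words = s.lower().split()
        return sum(doc_counts[w] for w in words) / max(len(words), 1)

    selected = []
    candidates = list(sentences)
    while candidates and len(selected) < k:
        def redundancy(s):
            # Highest word overlap with any sentence already selected.
            sw = set(s.lower().split())
            if not selected:
                return 0.0
            return max(len(sw & set(t.lower().split())) / max(len(sw), 1)
                       for t in selected)

        best = max(candidates,
                   key=lambda s: relevance(s) - redundancy_weight * redundancy(s))
        selected.append(best)
        candidates.remove(best)

    # Concatenate in original document order, the usual extractive convention.
    return " ".join(s for s in sentences if s in selected)
```

With a high `redundancy_weight`, a near-duplicate of an already-selected sentence loses to a less relevant but novel one, which is exactly the informativeness/redundancy tension discussed above.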
This thesis argues that extractive summaries can be presented in a non-redundant, cohesive way, and still be informative.
We investigate the interaction between these summary properties and develop models that balance their trade-off during document understanding and during summary production.
At the core of these models, an algorithm inspired by psycholinguistic models of memory simulates how humans keep track of relevant content in short-term memory, and how cohesion and non-redundancy constraints are applied among content bits in memory.
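The abstract does not spell out this algorithm, so the following is only a loose illustration of the general idea it gestures at: a fixed-capacity "working memory" over content units, where repeated content gains activation and stale content decays and is evicted. All class names, parameters, and update rules here are invented for illustration and are not the thesis's algorithm.

```python
# Loose illustration only (NOT the algorithm developed in the thesis):
# a fixed-capacity working memory over content units, in the spirit of
# psycholinguistic memory models.
class WorkingMemory:
    def __init__(self, capacity=5, decay=0.9):
        self.capacity = capacity   # how many content units fit in memory
        self.decay = decay         # fraction of activation kept per step
        self.activations = {}      # content unit -> current activation

    def read_sentence(self, units):
        # Older content fades with each new sentence read.
        for u in self.activations:
            self.activations[u] *= self.decay
        # Repeated units are boosted (a redundancy signal); new units enter.
        for u in units:
            self.activations[u] = self.activations.get(u, 0.0) + 1.0
        # Evict the weakest items when memory is over capacity.
        while len(self.activations) > self.capacity:
            weakest = min(self.activations, key=self.activations.get)
            del self.activations[weakest]

    def salient(self, n=3):
        # The most activated units are candidates for the summary.
        return sorted(self.activations, key=self.activations.get,
                      reverse=True)[:n]
```

Under this toy model, constraints such as non-redundancy or cohesion would be applied among the units currently held in memory, rather than over the whole document at once.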
The results are encouraging. When modeling the trade-off during document understanding in an unsupervised scenario,
we find that our models are able to detect relevant content, reduce redundancy, and significantly improve cohesion in summaries, especially when the input document exhibits high redundancy.
Furthermore, we show that this balance can be controlled through specific, interpretable hyper-parameters. In a similar reinforcement learning scenario, we find that informativeness and cohesion can influence each other positively.
Finally, when modeling the trade-off during summary extraction, our models are able to better enforce cohesive ties between semantically similar text spans in neighboring sentences.
Our approach produces summaries that humans perceive as more cohesive than, and as informative as, summaries built only for informativeness.
Catering to the need to process extremely long and redundant input, we design this system to consume text sequences of arbitrary length, and test it on scenarios with a single long document as well as with multi-document collections.