Controlling context factors in abstractive summarization of long documents
Authors
Fonseca, Marcio
Abstract
The massive influx of textual data poses a significant challenge in technical fields, fueling research into text summarization systems. Through innovative approaches to representation learning and the use of large-scale data, these systems have advanced remarkably, particularly in domain-specific contexts. More recently, large language models (LLMs) such as ChatGPT have demonstrated an impressive ability to generate abstractive summaries that human judges rate as fluent and relevant, even without domain-specific training. While these models are regarded as strong general-purpose summarizers, technical documents require more nuanced control of contextual factors that depend on the target audience and task goals.
In this thesis, we argue that integrating contextual factors that cannot easily be distilled from reference summaries is crucial for advancing the summarization of long technical documents. We establish a conceptual framework that separates intrinsic factors, which can be determined from document-summary pairs alone (e.g., redundancy and relevance), from extrinsic factors (e.g., conciseness and rhetoric), which depend on the task context and subjective intentionality. Guided by this framework, we cast summarization as a factorized energy-based model in which intrinsic and extrinsic factors are optimized separately. Our model, FactorSum, achieves significant improvements in lexical alignment with reference summaries while requiring modest compute resources compared to baselines.
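The factorized view can be illustrated with a minimal sketch: a candidate summary's "energy" decomposes into separately computed intrinsic and extrinsic terms, and selection minimizes the combined energy. The scoring functions, weights, and names below are illustrative assumptions, not FactorSum's actual implementation.

```python
# Hypothetical sketch of factorized summary scoring. Lower energy = better
# summary. The factor definitions here (word-overlap relevance, repetition-based
# redundancy, length-budget conciseness) are stand-ins for illustration only.

def intrinsic_score(document_words, summary_words):
    """Factors derivable from the document-summary pair alone:
    relevance (overlap with the source) minus redundancy (repeated words)."""
    doc, summ = set(document_words), set(summary_words)
    relevance = len(doc & summ) / max(len(summ), 1)
    redundancy = 1 - len(summ) / max(len(summary_words), 1)
    return relevance - redundancy

def extrinsic_score(summary_words, budget):
    """Task-dependent factors, e.g., conciseness relative to a length budget."""
    return -abs(len(summary_words) - budget) / budget

def summary_energy(document_words, summary_words, budget, alpha=1.0, beta=0.5):
    # The factors combine additively, so each can be optimized on its own.
    return -(alpha * intrinsic_score(document_words, summary_words)
             + beta * extrinsic_score(summary_words, budget))

def select_best(document_words, candidates, budget):
    # Choosing a summary amounts to minimizing the factorized energy.
    return min(candidates, key=lambda s: summary_energy(document_words, s, budget))
```

Because the energy is a sum of factors, the extrinsic term (e.g., the length budget) can be changed at inference time without retraining the intrinsic content model.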
Furthermore, we examine the application of large language models to three scientific summarization tasks: abstract generation, summarization for reviews, and lay summarization. Our results show that these LLMs excel at controlling stylistic features such as length budget and narrative perspective. However, they exhibit gaps in their understanding of domain concepts in scientific papers, which limits more fine-grained control. Finally, we propose an approach that improves the lexical alignment of summaries by guiding LLM summarizers with keywords derived from FactorSum, combining the strengths of both approaches.
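The keyword-guidance idea can be sketched as prompt construction: content keywords selected by a model such as FactorSum are injected into the instruction given to the LLM. The prompt template and parameter names below are assumptions for illustration, not the thesis's exact prompts.

```python
# Illustrative sketch of keyword-guided LLM summarization: keywords from a
# content-selection model steer the LLM toward better lexical alignment.
# The template wording is a hypothetical example.

def build_guided_prompt(document: str, keywords: list[str], budget: int) -> str:
    """Assemble a summarization prompt constrained by a length budget and
    a set of key terms the summary should cover."""
    return (
        f"Summarize the following document in about {budget} words.\n"
        f"Make sure the summary covers these key terms: {', '.join(keywords)}.\n\n"
        f"Document:\n{document}"
    )
```

The resulting string would then be sent to any instruction-following LLM; only the keyword list changes between runs, which keeps the stylistic control of the LLM while grounding content selection in the factorized model.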
In conclusion, our investigation confirms that large language models are powerful tools for summarization, occasionally eclipsing human-authored summaries according to expert judgments. However, we find that LLMs struggle to match the richness of human perspectives in, for instance, lay summarization. Our factorized modeling approach partially addresses these limitations and, we hope, will inspire future work on context-aware summarization.