Abstractive summarization of long narratives through content selection and model scaling
dc.contributor.advisor
Keller, Frank
en
dc.contributor.advisor
Tang, Hao
en
dc.contributor.advisor
Minervini, Pasquale
en
dc.contributor.author
Saxena, Rohit
en
dc.date.accessioned
2026-01-23T15:41:52Z
en
dc.date.issued
2025-12-02
en
dc.description.abstract
Abstractive summarization of long narrative texts, such as novels and movie screenplays, presents significant challenges due to their extensive length, complex structure, and the necessity of capturing essential narrative elements accurately. While large language models (LLMs) have demonstrated remarkable progress in text summarization, their ability to process long narratives remains limited due to computational constraints and the difficulty of extracting salient elements. This thesis addresses these challenges, particularly in movie screenplays, by focusing on two key issues: identifying salient scenes crucial to the overall story and handling the computational constraints inherent in processing lengthy texts.
First, we introduce MovieSum, a large-scale dataset specifically designed for movie screenplay summarization. MovieSum consists of 2,200 movie screenplays paired with their corresponding Wikipedia plot summaries and is manually formatted to represent structural screenplay elements. This dataset is significantly larger than existing resources and includes metadata such as IMDb IDs, enabling access to additional knowledge. We use MovieSum to benchmark recent LLMs and provide a baseline for future research in narrative summarization. Additionally, we introduce the Movie Scene Saliency Dataset (MENSA), a subset of MovieSum containing human-annotated salient scenes from 100 diverse movies.
Second, we investigate the role of scene saliency and saliency-based content selection in screenplay summarization. A movie screenplay consists of numerous scenes, but only a fraction of them contribute meaningfully to the overall story. We propose a two-stage summarization framework utilizing the MENSA dataset. The first stage identifies key scenes based on their relevance to the movie’s narrative, while the second stage generates an abstractive summary using only these salient scenes. Our findings show that this approach outperforms existing state-of-the-art summarization methods, producing summaries that more accurately reflect the content of the narrative.
Finally, we address the computational limitations of transformer-based models in processing long documents, including movie screenplays. Existing models rely on truncation, which leads to information loss and inconsistencies between training and inference. To mitigate this, we propose CachED, a gradient caching technique that enables end-to-end training of encoder-decoder models on full-length documents without truncation. We apply CachED to extend BART, creating CachED-BART, which is capable of backpropagation on nearly one million tokens without additional parameter overhead. Experimental results demonstrate that CachED-BART achieves superior performance on long-form summarization tasks, including movie screenplays and books, while maintaining efficiency and scalability.
This thesis advances the field of long-form narrative summarization by introducing structured datasets, scene-aware summarization techniques, and novel training methodologies. Our results highlight the importance of selecting salient narrative elements and leveraging efficient model architectures to generate accurate and coherent summaries for complex, lengthy texts such as movie screenplays. Through these contributions, this thesis aims to enhance the efficiency and accuracy of movie script summarization, while also providing valuable insights into overcoming the computational challenges associated with long-form narrative summarization.
en
dc.identifier.uri
https://era.ed.ac.uk/handle/1842/44341
dc.identifier.uri
https://doi.org/10.7488/era/6861
dc.language.iso
en
en
dc.publisher
The University of Edinburgh
en
dc.relation.hasversion
Rohit Saxena and Frank Keller. 2024. Select and Summarize: Scene Saliency for Movie Script Summarization. In Findings of the Association for Computational Linguistics: NAACL 2024, pages 3439–3455, Mexico City, Mexico. Association for Computational Linguistics.
en
dc.relation.hasversion
Rohit Saxena and Frank Keller. 2024. MovieSum: An Abstractive Summarization Dataset for Movie Screenplays. In Findings of the Association for Computational Linguistics: ACL 2024, pages 4043–4050, Bangkok, Thailand. Association for Computational Linguistics.
en
dc.relation.hasversion
Rohit Saxena, Hao Tang, and Frank Keller. 2025. End-to-End Long Document Summarization using Gradient Caching. In Transactions of the Association for Computational Linguistics (TACL).
en
dc.relation.hasversion
Rohit Saxena, Pasquale Minervini, and Frank Keller. 2025. PosterSum: A Multimodal Benchmark for Scientific Poster Summarization. NeurIPS 2025 Workshop on Evaluating the Evolving LLM Lifecycle: Benchmarks, Emergent Abilities, and Scaling.
en
dc.relation.hasversion
Rohit Saxena, Aryo Pradipta Gema, and Pasquale Minervini. 2025. Lost in Time: Clock and Calendar Understanding Challenges in Multimodal LLMs. Reasoning and Planning for LLMs Workshop at The Thirteenth International Conference on Learning Representations.
en
dc.relation.hasversion
Rohit Saxena, S. Bhat, and N. Pedanekar. 2017. Live on TV, Alive on Twitter: Quantifying Continuous Partial Attention of Viewers during Live Television Telecasts. In 2017 IEEE International Conference on Data Mining Workshops (ICDMW), pages 1042–1049.
en
dc.subject
abstractive summarization
en
dc.subject
long narrative summaries
en
dc.subject
MovieSum
en
dc.subject
screenplay dataset
en
dc.subject
human-annotated scenes
en
dc.subject
Gradient Caching for Encoder-Decoder models
en
dc.subject
CachED
en
dc.subject
scalable model training
en
dc.subject
content selection
en
dc.title
Abstractive summarization of long narratives through content selection and model scaling
en
dc.type
Thesis or Dissertation
en
dc.type.qualificationlevel
Doctoral
en
dc.type.qualificationname
PhD Doctor of Philosophy
en
Files
Original bundle
- Name: Saxena2025.pdf
- Size: 1.76 MB
- Format: Adobe Portable Document Format