Learning weakly structured representations for text-to-text generation

Hosking, Tom

dc.contributor.advisor

Lapata, Mirella

dc.contributor.advisor

Tang, Hao

dc.contributor.author

Hosking, Tom

dc.date.accessioned

2025-02-21T10:06:16Z

dc.date.available

2025-02-21T10:06:16Z

dc.date.issued

2025-02-21

dc.description.abstract

Text-to-text generation refers to a class of problems that involve transforming one piece of text to another, such as paraphrase generation, summarisation and automatic translation. Deep learning approaches to text-to-text generation first map a natural language utterance to some learned representation, perform some processing within this representation space, then map the modified representation back to natural language. Currently, the majority of such models use an unstructured sequence of dense vector embeddings that is fully learned from data as the representation. This data-driven approach has proven successful and requires little guidance from a model designer, but the resulting representations are not easily interpretable and do not exploit known properties of the task under consideration (e.g., for paraphrase generation, the meaning and form of an input sentence should be treated separately). In this thesis, we hypothesise that choosing a weakly structured representation is a better approach. The structure should encode the aspects of the tasks that are known, but remain sufficiently flexible that the unknown aspects may be learned. We argue that discrete and hierarchical representations make some aspects of text-to-text generation tasks more feasible, enabling models that are attributable and scale to longer inputs. Finally, we hypothesise that structure alone is not sufficient, and that some degree of supervision is needed to assign meaning to a structured representation. We focus on two text-to-text generation tasks to gather support for these hypotheses: paraphrase generation, where a model must generate an output sentence with the same meaning but different surface form to a given input sentence; and opinion summarisation, which involves generating a textual summary that aggregates popular opinions from customer reviews about hotels or other products.
We begin by proposing a model for paraphrase generation that represents the meaning and surface form of an input separately, with the surface form represented as a set of discrete codes learned through Vector Quantisation (VQ-VAE). We show that this weakly structured choice of representation enables us to generate high quality paraphrases by keeping the semantic representation constant and varying the syntactic representation, supporting our first hypothesis. We use a denoising objective based on distant supervision to induce the separation between representations. Next, we address the lack of a tractable factorisation in VQ-VAE, and introduce Hierarchical Residual Quantisation (HRQ-VAE), a method for learning hierarchical discrete representations of input data, and show that it learns more informative representations than VQ-VAE. We then combine the hierarchical representations of HRQ-VAE with separated encoding spaces for paraphrase generation, showing that the more richly structured choice of representation leads to improved quality of generated paraphrases. To demonstrate that HRQ-VAE can be beneficial for more complex text-to-text tasks, we apply it to opinion summarisation, representing sentences from customer reviews as paths through a learned hierarchy. We show that we can generate informative summaries of these reviews that are attributable and scale to large numbers of reviews, by identifying which paths in the hierarchy are frequently attested across each set of reviews. Finally, we combine the scalability and attributability of hierarchical representations with the fluency and coherence of Large Language Models, and use an encoder based on HRQ-VAE to build a hierarchical index over review sentences that may then be used to retrieve clusters of sentences containing popular opinions. 
We use distant supervision based on entailment relations to induce a semantic ordering on the learned hierarchy, and show that the hierarchy directly enables the scalability and attributability of our model. Overall, our experiments support our hypotheses that weakly structured representations are beneficial for text-to-text generation, that discrete and hierarchical representations are a powerful choice of structure, and that distant supervision is needed to assign meaning to the structures.

en

dc.identifier.uri

https://hdl.handle.net/1842/43131

dc.identifier.uri

http://dx.doi.org/10.7488/era/5673

dc.language.iso

en

dc.publisher

The University of Edinburgh

en

dc.relation.hasversion

Hosking, T., Blunsom, P., and Bartolo, M. (2023a). Human feedback is not gold standard

en

dc.relation.hasversion

Hosking, T. and Lapata, M. (2021). Factorising meaning and form for intent-preserving paraphrasing. In Zong, C., Xia, F., Li, W., and Navigli, R., editors, Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 1405–1418, Online. Association for Computational Linguistics

en

dc.relation.hasversion

Hosking, T. and Riedel, S. (2019). Evaluating rewards for question generation models. In Burstein, J., Doran, C., and Solorio, T., editors, Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 2278–2283, Minneapolis, Minnesota. Association for Computational Linguistics

en

dc.relation.hasversion

Hosking, T., Tang, H., and Lapata, M. (2022). Hierarchical sketch induction for paraphrase generation. In Muresan, S., Nakov, P., and Villavicencio, A., editors, Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 2489–2501, Dublin, Ireland. Association for Computational Linguistics

en

dc.relation.hasversion

Hosking, T., Tang, H., and Lapata, M. (2023b). Attributable and scalable opinion summarization. In Rogers, A., Boyd-Graber, J., and Okazaki, N., editors, Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 8488–8505, Toronto, Canada. Association for Computational Linguistics

en

dc.relation.hasversion

Hosking, T., Tang, H., and Lapata, M. (2024). Hierarchical indexing for retrieval-augmented opinion summarization. Transactions of the Association for Computational Linguistics, 12:1533–1555

en

dc.relation.hasversion

Sherborne, T., Hosking, T., and Lapata, M. (2023). Optimal Transport Posterior Alignment for Cross-lingual Semantic Parsing. Transactions of the Association for Computational Linguistics, 11:1432–1450

en

dc.subject

text-to-text generation

en

dc.subject

Deep learning

en

dc.subject

natural language

en

dc.subject

weakly structured representation

en

dc.subject

Vector Quantisation (VQ-VAE)

en

dc.subject

Hierarchical Residual Quantisation (HRQ-VAE)

en

dc.subject

Large Language Models

en

dc.title

Learning weakly structured representations for text-to-text generation

en

dc.type

Thesis or Dissertation

en

dc.type.qualificationlevel

Doctoral

en

dc.type.qualificationname

PhD Doctor of Philosophy

en

Files

Original bundle

Name: HoskingT_2025.pdf
Size: 4.5 MB
Format: Adobe Portable Document Format

This item appears in the following Collection(s)

Informatics thesis and dissertation collection