Edinburgh Research Archive

Data-to-text generation with neural planning

dc.contributor.advisor
Lapata, Maria
dc.contributor.advisor
Lopez, Adam
dc.contributor.author
Puduppully, Ratish Surendran
dc.date.accessioned
2022-04-12T12:55:42Z
dc.date.available
2022-04-12T12:55:42Z
dc.date.issued
2022-04-11
dc.description.abstract
In this thesis, we consider the task of data-to-text generation, which takes non-linguistic structures as input and produces textual output. The inputs can take the form of database tables, spreadsheets, charts, and so on. The main application of data-to-text generation is to present information in a textual format, making it accessible to a layperson who may otherwise find it difficult to interpret numerical figures. The task can also automate routine document generation jobs, thus improving human efficiency. We focus on generating long-form text, i.e., documents with multiple paragraphs. Recent approaches to data-to-text generation have adopted the very successful encoder-decoder architecture or its variants. These models generate fluent (but often imprecise) text, yet perform quite poorly at selecting appropriate content and ordering it coherently. This thesis focuses on overcoming these issues by integrating content planning with neural models. We hypothesize that data-to-text generation will benefit from explicit planning, which manifests itself in (a) micro planning, (b) latent entity planning, and (c) macro planning. Throughout this thesis, we assume that the input to our generator is tables (with records) in the sports domain, and that the output is summaries describing what happened in the game (e.g., who won/lost, ..., scored, etc.). We first describe our work on integrating fine-grained or micro plans with data-to-text generation. As part of this, we generate a micro plan highlighting which records should be mentioned and in which order, and then generate the document while taking the micro plan into account. We then show how data-to-text generation can benefit from higher-level latent entity planning. Here, we make use of entity-specific representations which are dynamically updated. The text is generated conditioned on entity representations and the records corresponding to the entities by using hierarchical attention at each time step.
We then combine planning with the high-level organization of entities, events, and their interactions. Such coarse-grained macro plans are learnt from data and given as input to the generator. Finally, we present work on making macro plans latent while incrementally generating a document paragraph by paragraph. We infer latent plans sequentially with a structured variational model while interleaving the steps of planning and generation. Text is generated by conditioning on previous variational decisions and previously generated text. Overall, our results show that planning makes data-to-text generation more interpretable, improves the factuality and coherence of the generated documents, and reduces redundancy in the output document.
en
dc.identifier.uri
https://hdl.handle.net/1842/38869
dc.identifier.uri
http://dx.doi.org/10.7488/era/2123
dc.language.iso
en
en
dc.publisher
The University of Edinburgh
en
dc.relation.hasversion
Puduppully, R., Dong, L., and Lapata, M. (2019). Data-to-text generation with content selection and planning. In Proceedings of the 33rd AAAI Conference on Artificial Intelligence, Honolulu, Hawaii.
en
dc.relation.hasversion
Puduppully, R., Dong, L., and Lapata, M. (2019). Data-to-text generation with entity modeling. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 2023–2035, Florence, Italy. Association for Computational Linguistics.
en
dc.relation.hasversion
Puduppully, R., Fu, Y., and Lapata, M. (2022). Data-to-text generation with variational sequential planning. Transactions of the Association for Computational Linguistics (arXiv:2202.13756).
en
dc.relation.hasversion
Puduppully, R. and Lapata, M. (2021). Data-to-text generation with macro planning. Transactions of the Association for Computational Linguistics (arXiv:2102.02723).
en
dc.subject
data-to-text generation
en
dc.subject
long-form text
en
dc.subject
latent entity planning
en
dc.subject
redundancy reduction
en
dc.title
Data-to-text generation with neural planning
en
dc.type
Thesis or Dissertation
en
dc.type.qualificationlevel
Doctoral
en
dc.type.qualificationname
PhD Doctor of Philosophy
en

Files

Original bundle

Name:
Puduppully2022.pdf
Size:
2.37 MB
Format:
Adobe Portable Document Format