Understanding and generating language with discourse representation structures
dc.contributor.advisor
Cohen, Shay
dc.contributor.advisor
Lapata, Maria
dc.contributor.advisor
Lascarides, Alexandra
dc.contributor.author
Liu, Jiangming
dc.date.accessioned
2021-08-19T16:07:07Z
dc.date.available
2021-08-19T16:07:07Z
dc.date.issued
2021-07-31
dc.description.abstract
Natural language is how humans describe and understand what is happening in the world. Intelligent machines, however, operate more readily over symbolic representations that explicitly encode the linguistic information of utterances, which is why fundamental natural language processing tasks are necessary. Several symbolic formalisms have been proposed to represent the meaning of natural language. Unlike other symbolic formalisms, Discourse Representation Theory (DRT) is model-theoretic and interpretable, and was designed to capture a wide range of linguistic phenomena, such as scope, quantification, and presupposition, both within and across sentences. Moreover, recently developed resources for DRT make it possible to build tools based on this formalism at a larger scale than previous attempts, which mostly relied on hand engineering. This thesis explores two natural language processing problems relating to Discourse Representation Structures (DRSs), namely parsing and generation. We address several questions: (1) how to obtain meaning representations, based on Discourse Representation Theory, for natural language of arbitrary length; (2) how to transform discourse representations from their box-oriented form into computational formats that are easy to model; (3) how to design neural models that automatically generate discourse representation structures from natural language and vice versa; and (4) how to exploit annotations of varying quality to improve our models and extend them to the analysis of low-resource languages.
We discuss Discourse Representation Theory in the context of related meaning representations and show how DRT deals with various linguistic phenomena, such as predicate-argument structure, word senses, scope and quantification, presupposition, temporal expressions, anaphoric coreference, and rhetorical relations. By comparing DRT to related meaning representations, we show why it is important to develop tools based on DRT and what advantages it offers over other formalisms.
A computational format is necessary to model discourse representation structures. We provide the definition of Discourse Representation Tree Structures (DRTS), which are derived from discourse representation boxes. We propose a neural DRTS parser with a hierarchical encoder and a three-step decoder. Furthermore, we improve upon DRTS by introducing a lossless transformation algorithm that allows us to handle presuppositions and word senses. We adopt the Transformer as our DRS parser and compare DRS parsing in tree and clause formats.
In order to explore discourse representation analysis in multiple languages, we propose Universal Discourse Representation Structure (UDRS), which bridges semantic symbols with pre-trained language models and is portable to knowledge bases beyond English (e.g., GermaNet for German and HowNet for Chinese). This raises the problem of low-resource language analysis. In the monolingual analysis scenario, we propose an iterative learning algorithm that can exploit annotations of varying quality. In the cross-lingual analysis scenario, we propose a one-to-many approach, which translates gold-standard English to non-English text and trains multiple models (one per language) on the translations, and a many-to-one approach, which translates non-English text to English and then runs a relatively accurate English model on the translated text. Both methods significantly improve DRS parsing in low-resource languages.
We also introduce a general neural framework for DRS-to-text generation, which maps DRSs to natural language. Our generator is based on an encoder-decoder architecture equipped with a novel TreeLSTM model. Building on our success with DRS parsing and generation, we empirically study neural interlingual machine translation, which first parses the source language into DRSs and then generates the target language from them. Although it does not match commercial machine translation systems (e.g., Google Translate), which are trained on billions of examples, our neural interlingual machine translation system outperforms competitive baselines.
Taken together, this thesis explores understanding and generating natural language with discourse representation structures: we investigate semantic formalisms (i.e., discourse representation structures and universal discourse representation structures), design neural models for monolingual and cross-lingual semantic parsing and natural language generation, and study DRS-interlingual machine translation. Our experiments on the Groningen Meaning Bank and the Parallel Meaning Bank demonstrate the success of neural discourse representation structure parsing and generation, and shed light on natural language understanding and generation for downstream natural language processing tasks (e.g., DRS-interlingual machine translation).
en
dc.identifier.uri
https://hdl.handle.net/1842/37936
dc.identifier.uri
http://dx.doi.org/10.7488/era/1211
dc.language.iso
en
en
dc.publisher
The University of Edinburgh
en
dc.title
Understanding and generating language with discourse representation structures
en
dc.type
Thesis or Dissertation
en
dc.type.qualificationlevel
Doctoral
en
dc.type.qualificationname
PhD Doctor of Philosophy
en
Files
Original bundle
- Name: LiuJ__2021.pdf
- Size: 2.03 MB
- Format: Adobe Portable Document Format