Neural semantic role labeling with more or less supervision
dc.contributor.advisor
Lapata, Maria
dc.contributor.advisor
Sutton, Charles
dc.contributor.author
Cai, Rui
dc.date.accessioned
2021-11-12T12:05:39Z
dc.date.available
2021-11-12T12:05:39Z
dc.date.issued
2021-11-30
dc.description.abstract
In recent years, thanks to the relative maturity of neural network models, the task of
automatically identifying and labeling semantic roles has been the focus of renewed interest. These models can learn continuous representations
automatically and thereby forgo the need for extensive feature engineering. Semantic
role labeling (SRL) is generally recognized as a core task in natural language
processing (NLP) and has been shown to benefit a range of NLP applications such as
machine translation, information extraction and summarization.
Recent SRL systems have usually been trained on datasets whose semantic role
annotations were produced on top of tree-banked corpora. This reflects the
intimate relationship between syntactic information and semantic roles. To
incorporate syntactic information into neural network models effectively, we train the
semantic role labeler jointly with two auxiliary tasks: predicting the dependency label
of a word, and determining whether there exists an arc linking it to the predicate. The
auxiliary tasks provide syntactic information that is specific to SRL and can be learnt
from the training data (dependency annotations). This frees our SRL system from its
dependence on external parsers, whose output can be noisy (e.g., on out-of-domain
data or infrequent constructions).
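The joint training described above can be sketched as a weighted multi-task objective. The function below is an illustrative assumption about how the SRL loss and the two auxiliary losses might be combined; the names and weights are hypothetical, not the thesis's actual implementation.

```python
# Hypothetical sketch: the main SRL loss is combined with two auxiliary
# losses (dependency-label prediction and predicate-arc prediction),
# both learnable from the same dependency annotations. The weights
# w_dep and w_arc are illustrative hyperparameters.

def joint_loss(srl_loss, dep_label_loss, arc_loss, w_dep=0.5, w_arc=0.5):
    """Weighted sum of the main SRL loss and the two auxiliary losses."""
    return srl_loss + w_dep * dep_label_loss + w_arc * arc_loss

# Toy per-batch losses (e.g., averaged cross-entropies):
total = joint_loss(1.2, 0.8, 0.4)  # 1.2 + 0.5*0.8 + 0.5*0.4 = 1.8
```

Setting the auxiliary weights to zero recovers a plain single-task SRL objective, which makes the contribution of the syntactic signal easy to ablate.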
Supervised neural SRL models are data-driven: they derive their efficacy from sufficient annotated
data. This reliance on high-quality annotations, however, hinders the development of SRL systems in low-resource scenarios (e.g., rare languages
or domains). To reduce the annotation effort involved, we make semi-supervised learning for SRL as simple as possible. More specifically, we propose an end-to-end SRL model and demonstrate that it can effectively leverage unlabeled data
within the cross-view training paradigm. Our semantic role labeler is jointly
trained with auxiliary tasks subsidiary to SRL. Consequently, our system can be applied directly to plain text and is essentially self-sufficient.
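The core of cross-view training is a consistency objective on unlabeled data: auxiliary modules that see restricted views of the input are trained to match the full model's predicted role distribution. The sketch below illustrates that objective with toy distributions; all values and names are assumptions for exposition, not model outputs from the thesis.

```python
import math

# Illustrative cross-view consistency loss: the cross-entropy of a
# restricted-view module's distribution against the full model's
# (held fixed as a soft target), summed over role labels.

def cross_view_loss(full_view_probs, restricted_view_probs, eps=1e-12):
    """Cross-entropy H(p, q) of the restricted view q against the full view p."""
    return -sum(p * math.log(q + eps)
                for p, q in zip(full_view_probs, restricted_view_probs))

teacher = [0.7, 0.2, 0.1]    # full model's role distribution (toy values)
student = [0.6, 0.25, 0.15]  # restricted-view module's distribution (toy values)
loss = cross_view_loss(teacher, student)
```

Because H(p, q) is minimized when q equals p, driving this loss down pushes the restricted views toward the full model's predictions, which is what lets unlabeled text supply a training signal.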
For truly low-resource languages, we cannot even expect to perform semi-supervised
learning, as SRL annotations are available for only a handful of the world's
languages. To build a competitive semantic role labeler for such languages, we resort to cross-lingual semantic role labeling, which transfers supervision from a source language to target (low-resource) languages. The
backbone of our model is an LSTM-based semantic role labeler jointly trained with a
semantic role compressor and multilingual word embeddings. The compressor collects
useful information from the output of the semantic role labeler and compresses it into fixed-size cross-lingual representations. In contrast to earlier efforts, which relied on automatic alignments to transfer annotations, our model operates in a shared multilingual embedding space and affords direct supervision for the prediction of semantic roles in the target language. For model evaluation, we have also contributed
two quality-controlled datasets, which we hope will be useful for the development of
cross-lingual models.
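The compressor's job is to turn a variable-length sequence of per-token role representations into one fixed-size vector. The sketch below uses simple mean pooling to make that idea concrete; the thesis's actual compressor is a learned component trained jointly with the labeler, so this is an assumption-laden simplification.

```python
# Hypothetical sketch of the "compressor" idea: collapse the labeler's
# variable-length sequence of per-token vectors into a single
# fixed-size vector (here via mean pooling; the real compressor is
# learned, not a fixed pooling operation).

def compress(token_vectors):
    """Mean-pool a non-empty list of equal-length vectors into one fixed-size vector."""
    n = len(token_vectors)
    dim = len(token_vectors[0])
    return [sum(v[i] for v in token_vectors) / n for i in range(dim)]

seq = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # 3 tokens, 2-dim each
fixed = compress(seq)                        # always 2-dim, whatever the length
```

The key property is that the output dimensionality is independent of sentence length, which is what allows representations from different languages to live in one shared space.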
en
dc.identifier.uri
https://hdl.handle.net/1842/38272
dc.identifier.uri
http://dx.doi.org/10.7488/era/1538
dc.language.iso
en
en
dc.publisher
The University of Edinburgh
en
dc.relation.hasversion
Cai, R., X. Zhang, and H. Wang (2016). “Bidirectional Recurrent Convolutional Neural Network for Relation Classification”. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Berlin, Germany, pp. 756–765.
en
dc.relation.hasversion
Cai, R. and M. Lapata (Nov. 2019a). “Semi-Supervised Semantic Role Labeling with Cross View Training”. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). Hong Kong, China: Association for Computational Linguistics, pp. 1018–1027. DOI: 10.18653/v1/D19-1094
en
dc.relation.hasversion
Cai, R. and M. Lapata (Mar. 2019b). “Syntax-aware Semantic Role Labeling without Parsing”. In: Transactions of the Association for Computational Linguistics 7, pp. 343–356. DOI: 10.1162/tacl_a_00272
en
dc.relation.hasversion
Cai, R. and M. Lapata (Nov. 2020). “Alignment-free Cross-lingual Semantic Role Labeling”. In: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP). Online: Association for Computational Linguistics, pp. 3883–3894. DOI: 10.18653/v1/2020.emnlp-main.319.
en
dc.subject
semantic role labeling
en
dc.subject
natural language processing
en
dc.subject
tree-banked corpora
en
dc.subject
supervised neural SRL models
en
dc.subject
LSTM-based semantic role labeler
en
dc.title
Neural semantic role labeling with more or less supervision
en
dc.type
Thesis or Dissertation
en
dc.type.qualificationlevel
Doctoral
en
dc.type.qualificationname
PhD Doctor of Philosophy
en
Files
Original bundle
- Name:
- Cai2021.pdf
- Size:
- 4.11 MB
- Format:
- Adobe Portable Document Format