Modelling cross-lingual transfer for semantic parsing
dc.contributor.advisor
Lapata, Mirella
dc.contributor.advisor
Steedman, Mark
dc.contributor.author
Sherborne, Thomas Rishi
dc.date.accessioned
2024-09-18T13:10:32Z
dc.date.available
2024-09-18T13:10:32Z
dc.date.issued
2024-09-18
dc.description.abstract
Semantic parsing maps natural language utterances to logical form
representations of meaning (e.g., lambda calculus or SQL). A semantic parser
functions as a human-computer interface by translating natural language into
machine-readable logic to answer questions or respond to requests. Semantic
parsing is a critical technology within language understanding systems (e.g.,
digital assistants) for accessing computational tools using natural language
without expert knowledge or programming skills.
Cross-lingual semantic parsing adapts a parser to map more natural languages to
logical form. Most contemporary advances in semantic parsing study only
English. Successful cross-lingual transfer for a semantic parser
improves the utility of parsing technologies by enabling broader access to these
tools. However, developing a cross-lingual semantic parser introduces additional
challenges and trade-offs. High-quality data for new languages is scarce and
requires complex annotation. Given available data, a parser must adapt to
language variations in expressing meaning and intent. Existing multilingual
models and corpora also exhibit biases towards English, with variable
cross-lingual transfer to languages with fewer speakers or resources. At
present, there is no optimal strategy or modelling solution for teaching a new
language to a semantic parser.
This thesis considers the efficient adaptation of a semantic parser from English
to new languages. We are motivated by a case study of an engineer expanding a
natural language database interface to new customers, seeking accurate parsing
of new languages under a constrained budget for annotation. Overcoming the
development challenges of cross-lingual semantic parsing requires innovation in
model design, optimisation algorithms and strategies for sourcing and sampling
data.
Our overarching hypothesis is that cross-lingual transfer is achievable through
aligning representations between a high-resource language (i.e., English) and
new languages unseen for the task. We propose several strategies for this
alignment, exploiting existing resources such as machine translation,
pre-trained models, data for adjacent tasks, or a few annotated examples in each
new language. Our modelling solutions are suited to the quantity and quality of
available cross-lingual data. First, we propose an ensembled model to
bootstrap a parser from multiple machine-translation sources, improving
robustness by exploiting lower-quality synthetic data. Second, we propose a
zero-shot parser using auxiliary tasks to learn cross-lingual representation
alignment without any training data in new languages. Third, we propose an
efficient meta-learning algorithm optimising cross-lingual transfer during
training with a few labelled examples in new languages. Finally, we propose a
latent variable model explicitly minimising divergence between representations
across languages using Optimal Transport. Our results reveal that accurate
cross-lingual semantic parsing is possible by composing minimal samples of
target language data within models explicitly optimising for accurate parsing
and cross-lingual transfer.
dc.identifier.uri
https://hdl.handle.net/1842/42188
dc.identifier.uri
http://dx.doi.org/10.7488/era/4909
dc.language.iso
en
dc.publisher
The University of Edinburgh
dc.relation.hasversion
Moghe, N., Sherborne, T., Steedman, M., and Birch, A. (2023b). Extrinsic evaluation of machine translation metrics. In Rogers, A., Boyd-Graber, J., and Okazaki, N., editors, Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 13060–13078, Toronto, Canada. Association for Computational Linguistics
dc.relation.hasversion
Sherborne, T. and Lapata, M. (2022). Zero-shot cross-lingual semantic parsing. In Muresan, S., Nakov, P., and Villavicencio, A., editors, Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 4134–4153, Dublin, Ireland. Association for Computational Linguistics
dc.relation.hasversion
Sherborne, T. and Lapata, M. (2023). Meta-learning a cross-lingual manifold for semantic parsing. Transactions of the Association for Computational Linguistics, 11:49–67
dc.relation.hasversion
Sherborne, T., Xu, Y., and Lapata, M. (2020). Bootstrapping a crosslingual semantic parser. In Cohn, T., He, Y., and Liu, Y., editors, Findings of the Association for Computational Linguistics: EMNLP 2020, pages 499–517, Online. Association for Computational Linguistics
dc.subject
cross-lingual transfer for semantic parsing
dc.subject
Semantic parsing
dc.subject
Cross-lingual semantic parsing
dc.subject
semantic parser from English to new languages
dc.subject
cross-lingual transfer
dc.subject
Optimal Transport
dc.title
Modelling cross-lingual transfer for semantic parsing
dc.type
Thesis or Dissertation
dc.type.qualificationlevel
Doctoral
dc.type.qualificationname
PhD Doctor of Philosophy
Files
Original bundle
- Name: SherborneTR_2024.pdf
- Size: 6.52 MB
- Format: Adobe Portable Document Format

