Moving beyond parallel data for neural machine translation

Currey, Anna

dc.contributor.advisor

Heafield, Kenneth

en

dc.contributor.advisor

Renals, Stephen

en

dc.contributor.author

Currey, Anna

en

dc.contributor.sponsor

European Research Council

en

dc.date.accessioned

2019-09-24T09:43:03Z

dc.date.available

2019-09-24T09:43:03Z

dc.date.issued

2019-11-23

dc.description.abstract

The goal of neural machine translation (NMT) is to build an end-to-end system that automatically translates sentences from the source language to the target language. Neural machine translation has become the dominant paradigm in machine translation in recent years, showing strong improvements over prior statistical methods in many scenarios. However, neural machine translation relies heavily on parallel corpora for training; even for two languages with abundant monolingual resources (or with a large number of speakers), such parallel corpora may be scarce. Thus, it is important to develop methods for leveraging additional types of data in NMT training. This thesis explores ways of augmenting the parallel training data of neural machine translation with non-parallel sources of data. We concentrate on two main types of additional data: monolingual corpora and structural annotations. First, we propose a method for adding target-language monolingual data into neural machine translation in which the monolingual data is converted to parallel data through copying. Thus, the NMT system is trained on two tasks: translation from source language to target language, and autoencoding the target language. We show that this model achieves improvements in BLEU score for low- and medium-resource setups. Second, we consider the task of zero-resource NMT, where no source ↔ target parallel training data is available, but parallel data with a pivot language is abundant. We improve these models by adding a monolingual corpus in the pivot language, translating this corpus into both the source and the target language to create a pseudo-parallel source-target corpus. In the second half of this thesis, we turn our attention to syntax, introducing methods for adding syntactic annotation of the source language into neural machine translation. 
In particular, our multi-source model, which leverages an additional encoder to inject syntax into the NMT model, results in strong improvements over non-syntactic NMT for a high-resource translation case, while remaining robust to unparsed inputs. We also introduce a multi-task model that augments the transformer architecture with syntax; this model improves translation across several language pairs. Finally, we consider the case where no syntactic annotations are available (such as when translating from very low-resource languages). We introduce an unsupervised hierarchical encoder that induces a tree structure over the source sentences based solely on the downstream task of translation. Although the resulting hierarchies do not resemble traditional syntax, the model shows large improvements in BLEU for low-resource NMT.
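
The copying step described in the abstract (turning target-language monolingual data into parallel data) can be illustrated with a minimal sketch. The function name `add_copied_monolingual`, the toy sentences, and the shuffling step are assumptions made for illustration only; they are not taken from the thesis, which may balance or preprocess the data differently.

```python
import random


def add_copied_monolingual(parallel, target_mono, seed=0):
    """Mix genuine (source, target) pairs with (target, target) copies.

    Hypothetical helper: each target-language monolingual sentence is
    paired with itself, so a model trained on the result sees both a
    translation task and a target-language autoencoding task.
    """
    copied = [(sentence, sentence) for sentence in target_mono]
    combined = list(parallel) + copied
    # Shuffle with a fixed seed so the two kinds of examples are interleaved
    # reproducibly (an assumption, not a detail stated in the abstract).
    random.Random(seed).shuffle(combined)
    return combined


# Toy data, invented for this example.
parallel = [("ein Haus", "a house"), ("ein Buch", "a book")]
target_mono = ["a tree", "a river"]

training_data = add_copied_monolingual(parallel, target_mono)
```

In this sketch the copied pairs are indistinguishable in format from real translation pairs, which is what allows them to be fed to an unmodified training pipeline.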

en

dc.identifier.uri

http://hdl.handle.net/1842/36195

dc.language.iso

en

dc.publisher

The University of Edinburgh

en

dc.relation.hasversion

Currey, A. and Heafield, K. (2018). Multi-source syntactic neural machine translation. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 2961–2966. Association for Computational Linguistics.

en

dc.relation.hasversion

Currey, A. and Heafield, K. (2018). Unsupervised source hierarchies for low-resource neural machine translation. In Proceedings of the Workshop on the Relevance of Linguistic Structure in Neural Architectures for NLP, pages 6–12. Association for Computational Linguistics.

en

dc.relation.hasversion

Currey, A. and Heafield, K. (2019). Incorporating source syntax into transformer-based neural machine translation. In Proceedings of the Fourth Conference on Machine Translation. Association for Computational Linguistics.

en

dc.relation.hasversion

Currey, A., Miceli Barone, A. V., and Heafield, K. (2017). Copied monolingual data improves low-resource neural machine translation. In Proceedings of the Second Conference on Machine Translation, pages 148–156. Association for Computational Linguistics.

en

dc.relation.hasversion

Sennrich, R., Birch, A., Currey, A., Germann, U., Haddow, B., Heafield, K., Miceli Barone, A. V., and Williams, P. (2017). The University of Edinburgh’s neural MT systems for WMT17. In Proceedings of the Second Conference on Machine Translation, pages 389–399. Association for Computational Linguistics.

en

dc.subject

machine translation

en

dc.subject

automatically translating

en

dc.subject

pivot language

en

dc.subject

monolingual corpora

en

dc.subject

structural annotations

en

dc.subject

neural machine translation

en

dc.title

Moving beyond parallel data for neural machine translation

en

dc.type

Thesis or Dissertation

en

dc.type.qualificationlevel

Doctoral

en

dc.type.qualificationname

PhD Doctor of Philosophy

en

Files

Original bundle

Name: Currey2019.pdf
Size: 594.11 KB
Format: Adobe Portable Document Format

This item appears in the following Collection(s)

Informatics thesis and dissertation collection