Show simple item record

dc.contributor.advisorSteedman, Mark
dc.contributor.advisorLapata, Maria
dc.contributor.authorWeber, Sabine
dc.date.accessioned2022-10-04T10:43:44Z
dc.date.available2022-10-04T10:43:44Z
dc.date.issued2022-10-04
dc.identifier.urihttps://hdl.handle.net/1842/39405
dc.identifier.urihttp://dx.doi.org/10.7488/era/2655
dc.description.abstractRecognizing textual entailment is an important prerequisite to many tasks in NLP, e.g. question answering and semantic parsing. Knowing that, for example, buying a thing entails subsequently owning it is knowledge that humans acquire by interacting with the world, while machines need other ways to acquire it. Previous approaches to learning predicate entailment relations from text have focused only on English. In this thesis we present the adaptation of the unsupervised entailment graph building algorithm of Hosseini et al. to German, which can be seen as a study of the challenges of language adaptation for this task in general. We create a variety of German tools necessary for this approach and give a detailed account of the challenges faced and the insights gained from them. First, we create a German relation extraction system and compare it against the English system presented by Hosseini et al. Finding that the typing of German entities constitutes a bottleneck, we create a German fine-grained typing system for named and general entities. In doing so we examine the methods of annotation projection and zero-shot cross-lingual transfer, finding that zero-shot cross-lingual transfer performs best for German fine-grained named entity typing. We then move on to creating a German system that types general entities (e.g. "ex-president") as well as named entities (e.g. "Obama"), by augmenting our training data with data automatically generated from a German WordNet. We find that this yields an improvement of up to 10 percentage points in general entity typing performance, while affecting named entity typing performance only slightly, by 1 percentage point. We use these components in the pipeline to construct German entailment graphs.
We also present a method that uses German and English entailment graphs to generate training data for a supervised predicate entailment detection system, and show that this method outperforms current approaches to this task. In this way we create a multilingual predicate entailment detection system that outperforms both the monolingual German system and the zero-shot cross-lingual system on German test data, and also performs better than a monolingual English system on English test data.en
dc.language.isoenen
dc.publisherThe University of Edinburghen
dc.relation.hasversionLi, T., Weber, S., Hosseini, M. J., Guillou, L., and Steedman, M. (2022). Cross-lingual inference with a Chinese entailment graph. In Proceedings of the Society for Computation in Linguistics.en
dc.relation.hasversionWeber, S. and Steedman, M. (2019). Construction and alignment of multilingual entailment graphs for semantic inference. In Proceedings of the 2019 Workshop on Widening NLP, pages 77–79.en
dc.relation.hasversionWeber, S. and Steedman, M. (2021). Fine-grained general entity typing in German using GermaNet. In Proceedings of the Fifteenth Workshop on Graph-Based Methods for Natural Language Processing (TextGraphs-15), pages 138–143.en
dc.relation.hasversionWeber, S. and Steedman, M. (2021). Zero-shot cross-lingual transfer is a hard baseline to beat in German fine-grained entity typing. In Proceedings of the Second Workshop on Insights from Negative Results in NLP, pages 42–48, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.en
dc.relation.hasversionThe code relating to German Fine-Grained Entity Typing can be found under https://github.com/webersab/german general entity typing.en
dc.relation.hasversionThe code of the Relation Extraction Pipeline can be found under https://github.com/webersab/relationExtractionPipeline.en
dc.subjectNatural Language Processingen
dc.subjectdistributional inclusion hypothesisen
dc.subjectNLPen
dc.subjectGerman languageen
dc.titleUnsupervised German predicate entailment using the distributional inclusion hypothesisen
dc.typeThesis or Dissertationen
dc.type.qualificationlevelDoctoralen
dc.type.qualificationnamePhD Doctor of Philosophyen

