Edinburgh Research Archive

Acquiring syntactic and semantic transformations in question answering

dc.contributor.advisor
Webber, Bonnie
en
dc.contributor.advisor
Lowe, John B.
en
dc.contributor.author
Kaisser, Michael
en
dc.contributor.sponsor
Powerset, Inc.
en
dc.contributor.sponsor
Microsoft Research
en
dc.date.accessioned
2010-10-13T08:24:08Z
dc.date.available
2010-10-13T08:24:08Z
dc.date.issued
2010
dc.description.abstract
The same fact can be expressed in natural language in many different ways, using different words, a different syntax, or both. This phenomenon, commonly called paraphrasing, is the main reason why Natural Language Processing (NLP) is such a challenging task. This becomes especially obvious in Question Answering (QA), where the task is to automatically answer a question posed in natural language, usually by finding the answer in a collection of natural language texts. It cannot be assumed that an answer sentence uses the same words as the question, or that these words are combined in the same way by the same syntactic rules. In this thesis we describe methods that can help to address this problem. First, we explore how the lexical resources FrameNet, PropBank and VerbNet can be used to recognize a wide range of syntactic realizations that an answer sentence to a given question can have. We find that our methods based on these resources work well for web-based Question Answering. However, we identify two problems: 1) all three resources still have significant coverage issues; 2) these resources are not suitable for identifying answer sentences that provide some form of indirect evidence. While the first problem currently hinders performance, it is not a theoretical problem that renders the approach unsuitable; rather, it shows that more effort is needed to produce more complete resources. The second problem is more persistent. Many valid answer sentences, especially in small, journalistic corpora, do not provide direct evidence for an answer to the question; rather, they strongly suggest an answer without logically implying it. Semantically motivated resources like FrameNet, PropBank and VerbNet cannot easily be employed to recognize such forms of indirect evidence.
To investigate ways of dealing with indirect evidence, we used Amazon’s Mechanical Turk to collect over 8,000 manually identified answer sentences from the AQUAINT corpus for the more than 1,900 TREC questions from the 2002 to 2006 QA tracks. The pairs of answer sentences and their corresponding questions form the QASP corpus, which we released to the public in April 2008. In this dissertation, we use the QASP corpus to develop an approach to QA based on matching dependency relations between answer candidates and question constituents in the answer sentences. By acquiring knowledge about syntactic and semantic transformations from dependency relations in the QASP corpus, we can identify additional answer candidates that could not be linked to the question with our first approach.
en
dc.identifier.uri
http://hdl.handle.net/1842/3947
dc.language.iso
en
dc.publisher
The University of Edinburgh
en
dc.subject
Question Answering
en
dc.subject
Natural Language Processing
en
dc.title
Acquiring syntactic and semantic transformations in question answering
en
dc.type
Thesis or Dissertation
en
dc.type.qualificationlevel
Doctoral
en
dc.type.qualificationname
PhD Doctor of Philosophy
en

Files

Original bundle

Name:
Kaisser2010.pdf
Size:
1.14 MB
Format:
Adobe Portable Document Format
