Edinburgh Research Archive

Automatically extracting the source words of English lexical blends

dc.contributor.advisor
Goldwater, Sharon
en
dc.contributor.advisor
Deoskar, Tejaswini
en
dc.contributor.author
Kosinowski, Hanne
en
dc.date.accessioned
2014-03-20T12:18:34Z
dc.date.available
2014-03-20T12:18:34Z
dc.date.issued
2012-11-28
dc.description.abstract
Language changes constantly: new words are created on a daily basis. This thesis examines blends in English, a highly productive word formation process in which two words are combined to form a new word with a new meaning. To allow natural language processing systems to handle blends, I present a system that automatically extracts the source words of a blend using a set of statistical features. Applying these features with a logistic regression classifier to a corpus of 2236 blends, I obtain 50% accuracy on the gold standard. To date, this is the largest corpus of blends used for this task. I compare the results to previous work and suggest ways to improve the system’s performance.
en
dc.identifier.uri
http://hdl.handle.net/1842/8470
dc.language.iso
en
dc.publisher
The University of Edinburgh
en
dc.subject
Language Processing
en
dc.subject
Blends
en
dc.title
Automatically extracting the source words of English lexical blends
en
dc.type
Thesis or Dissertation
en
dc.type.qualificationlevel
Masters
en
dc.type.qualificationname
MSc Master of Science
en
dcterms.accessRights
RESTRICTED ACCESS
en