Automatically extracting the source words of English lexical blends

Kosinowski, Hanne

dc.contributor.advisor

Goldwater, Sharon

en

dc.contributor.advisor

Deoskar, Tejaswini

en

dc.contributor.author

Kosinowski, Hanne

en

dc.date.accessioned

2014-03-20T12:18:34Z

dc.date.available

2014-03-20T12:18:34Z

dc.date.issued

2012-11-28

dc.description.abstract

Language changes constantly – new words are created on a daily basis. This thesis examines blends in English, a highly productive word formation process in which two words are combined to form a new word with a new meaning. To allow natural language processing systems to handle blends, I present a system that automatically extracts the source words of a blend using a set of statistical features. Applying these features with a logistic regression classifier to a corpus of 2236 blends, I obtain 50% accuracy on the gold standard. To date, this is the largest corpus of blends used for this task. I compare the results to previous work and suggest ways to improve the system’s performance.

en

dc.identifier.uri

http://hdl.handle.net/1842/8470

dc.language.iso

en

dc.publisher

The University of Edinburgh

en

dc.subject

Language Processing

en

dc.subject

Blends

en

dc.title

Automatically extracting the source words of English lexical blends

en

dc.type

Thesis or Dissertation

en

dc.type.qualificationlevel

Masters

en

dc.type.qualificationname

MSc Master of Science

en

dcterms.accessRights

RESTRICTED ACCESS

en

This item appears in the following Collection(s)

Linguistics and English Language Masters thesis collection