Proto-phoneme reconstruction as naive Bayes inference

Maftei, Dan

Item Status

RESTRICTED ACCESS

Date

2012-11-27

Authors

Maftei, Dan

Abstract

The comparative method is the standard technique by which historical linguists reconstruct ancestral languages from their descendants. The method has nevertheless received little attention from the computational linguistics community. We present a principled method for encoding the plausibility of sound changes, together with a probabilistic framework for learning about sound change and applying that knowledge to reconstruction. Our techniques are entirely probabilistic and leverage the growing wealth of data available in machine-readable form. We show that a naive Bayes classifier, combined with phoneme-conditioned categorical distributions over phonemes that are learned by maximum a posteriori (MAP) estimation with smoothing based on phonetic similarity, can accurately reconstruct proto-words from their descendants. Our system outperforms previous approaches to language reconstruction.
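
The approach summarised in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation, which is not described here in enough detail to reproduce. The toy feature table, the feature-overlap `similarity` function, the `strength` and `floor` parameters, the add-one smoothing of the prior, and the two daughter languages "A" and "B" are all assumptions made for illustration. Each proto-phoneme gets one categorical distribution over reflexes per daughter language; the Dirichlet pseudo-counts grow with phonetic similarity, so unseen but plausible sound changes keep non-zero probability.

```python
import math
from collections import defaultdict

# Hypothetical toy feature table (voicing, place, manner); not from the thesis.
FEATURES = {
    "p": ("voiceless", "labial", "stop"),
    "b": ("voiced", "labial", "stop"),
    "f": ("voiceless", "labial", "fricative"),
    "t": ("voiceless", "coronal", "stop"),
    "d": ("voiced", "coronal", "stop"),
}
PHONEMES = sorted(FEATURES)


def similarity(a, b):
    """Assumed similarity measure: fraction of features the two phonemes share."""
    fa, fb = FEATURES[a], FEATURES[b]
    return sum(x == y for x, y in zip(fa, fb)) / len(fa)


def train(aligned, strength=1.0, floor=0.01):
    """Estimate P(proto) and P(reflex | proto, language) from aligned pairs.

    aligned: list of (proto_phoneme, {language: reflex_phoneme}).
    The conditional estimate is the MAP mode of a categorical distribution
    under a Dirichlet prior with alpha = 1 + strength * similarity + floor.
    """
    proto_counts = defaultdict(float)
    reflex_counts = defaultdict(float)
    langs = set()
    for proto, reflexes in aligned:
        proto_counts[proto] += 1
        for lang, reflex in reflexes.items():
            reflex_counts[(lang, proto, reflex)] += 1
            langs.add(lang)

    # Add-one smoothed prior over proto-phonemes (an assumption of this sketch).
    total = sum(proto_counts.values())
    prior = {p: (proto_counts[p] + 1) / (total + len(PHONEMES)) for p in PHONEMES}

    cond = {}
    for lang in langs:
        for p in PHONEMES:
            weights = {
                r: reflex_counts[(lang, p, r)] + strength * similarity(p, r) + floor
                for r in PHONEMES
            }
            z = sum(weights.values())
            cond[(lang, p)] = {r: w / z for r, w in weights.items()}
    return prior, cond


def reconstruct(model, reflexes):
    """Return the proto-phoneme with the highest naive Bayes posterior score."""
    prior, cond = model

    def score(p):
        # Naive Bayes: reflexes are treated as independent given the proto-phoneme.
        return math.log(prior[p]) + sum(
            math.log(cond[(lang, p)][r]) for lang, r in reflexes.items()
        )

    return max(PHONEMES, key=score)


# Invented training data: proto *p stays p in language A and lenites to f in B.
data = (
    [("p", {"A": "p", "B": "f"})] * 3
    + [("t", {"A": "t", "B": "t"})] * 2
    + [("b", {"A": "b", "B": "b"})] * 2
    + [("d", {"A": "d", "B": "d"})] * 2
)
model = train(data)
print(reconstruct(model, {"A": "p", "B": "f"}))
```

The sketch reconstructs one phoneme at a time from reflexes that are already aligned. Reconstructing whole proto-words would apply the same scoring position by position, and a language unseen in training would need its own handling.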

URI

http://hdl.handle.net/1842/8593

This item appears in the following Collection(s)

Linguistics and English Language Masters thesis collection