Modeling prosodic features in language models for meetings.
Date
2007
Authors
Huang, Songfang
Renals, Steve
Abstract
In this paper we investigate the application of a novel language modeling technique, a hierarchical Bayesian language model (LM) based on the Pitman-Yor process, to automatic speech recognition (ASR) for multiparty meetings. The hierarchical Pitman-Yor language model (HPYLM), originally proposed in the machine learning field, provides a Bayesian interpretation of language modeling. An approximation to the HPYLM recovers the exact formulation of interpolated Kneser-Ney smoothing for n-gram models. This paper focuses on the application and scalability of the HPYLM in a practical large-vocabulary ASR system. Experimental results on the NIST RT06s meeting evaluation data verify that the HPYLM is a competitive and promising language modeling technique, consistently outperforming interpolated Kneser-Ney and modified Kneser-Ney n-gram LMs in terms of both perplexity (PPL) and word error rate (WER).
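The interpolated Kneser-Ney baseline that the HPYLM approximation recovers can be illustrated with a minimal sketch. The snippet below is not the paper's implementation; it is a hypothetical bigram-level example of interpolated Kneser-Ney smoothing with a single absolute discount, using continuation counts for the lower-order distribution:

```python
from collections import Counter

def kneser_ney_bigram(tokens, discount=0.75):
    """Return an interpolated Kneser-Ney bigram probability function.

    P(w | v) = max(c(v,w) - d, 0) / c(v)
               + d * N1+(v, .) / c(v) * P_cont(w),
    where P_cont(w) is the fraction of bigram types that end in w.
    """
    bigrams = Counter(zip(tokens, tokens[1:]))
    history_counts = Counter(tokens[:-1])          # c(v)
    continuation = Counter(w for _, w in bigrams)  # N1+(., w)
    followers = Counter(v for v, _ in bigrams)     # N1+(v, .)
    total_types = len(bigrams)                     # N1+(., .)

    def prob(v, w):
        p_cont = continuation[w] / total_types
        c_v = history_counts[v]
        if c_v == 0:                 # unseen history: back off entirely
            return p_cont
        lam = discount * followers[v] / c_v
        return max(bigrams[(v, w)] - discount, 0) / c_v + lam * p_cont

    return prob

# Toy usage: probabilities over the vocabulary sum to one for a seen history.
p = kneser_ney_bigram("a b a b c a b".split())
total = sum(p("a", w) for w in {"a", "b", "c"})
```

The discounted mass removed from observed bigrams is redistributed via the continuation distribution, which is exactly the quantity the Pitman-Yor discount parameter plays the role of in the HPYLM.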