Punctuation annotation using statistical prosody models.
This paper describes the development of statistical models of prosodic features for generating linguistic metadata for spoken language. In particular, we address the automatic punctuation of the output of a broadcast news speech recogniser. We present a statistical finite state model that combines prosodic, linguistic and punctuation class features. We report experimental results on the Hub-4 Broadcast News corpus and, in light of these results, discuss what would constitute a suitable evaluation method for this task.