dc.contributor.author: Gotoh, Yoshihiko (en)
dc.contributor.author: Renals, Steve (en)
dc.coverage.spatial: 8 (en)
dc.date.accessioned: 2006-05-11T13:11:23Z
dc.date.available: 2006-05-11T13:11:23Z
dc.date.issued: 2000-09
dc.identifier.citation: [ASR-2000] ASR2000 - Automatic Speech Recognition: Challenges for the new Millenium, ISCA Tutorial and Research Workshop (ITRW), Paris, France, September 18-20, 2000, pp. 228-235.
dc.identifier.uri: http://www.isca-speech.org/archive/asr2000/
dc.identifier.uri: http://hdl.handle.net/1842/990
dc.description.abstract: This paper presents an approach to identifying sentence boundaries in broadcast speech transcripts. We describe finite state models that extract sentence boundary information statistically from text and audio sources. An n-gram language model is constructed from a collection of British English news broadcasts and scripts. An alternative model is estimated from pause duration information in speech recogniser outputs aligned with their programme script counterparts. Experimental results show that the pause duration model alone outperforms the language modelling approach, and that combining the two models improves performance further, attaining precision and recall scores of over 70% for the task. (en)
dc.format.extent: 139511 bytes (en)
dc.format.mimetype: application/pdf (en)
dc.language.iso: en
dc.publisher: International Speech Communication Association (en)
dc.title: Sentence Boundary Detection in Broadcast Speech Transcripts (en)
dc.type: Conference Paper (en)

