Edinburgh Research Archive

Assigning phrase breaks from part-of-speech sequences

| Field | Value | Language |
| --- | --- | --- |
| dc.contributor.author | Taylor, Paul | en |
| dc.contributor.author | Black, Alan W | en |
| dc.coverage.spatial | 19 | en |
| dc.date.accessioned | 2006-06-20T15:21:47Z | |
| dc.date.available | 2006-06-20T15:21:47Z | |
| dc.date.issued | 1998-04 | |
| dc.description.abstract | This paper presents an algorithm for automatically assigning phrase breaks to unrestricted text for use in a text-to-speech synthesizer. Text is first converted into a sequence of part-of-speech tags. Next a Markov model is used to give the most likely sequence of phrase breaks for the input part-of-speech tags. In the Markov model, states represent types of phrase break and the transitions between states represent the likelihoods of sequences of phrase types occurring. The paper reports a variety of experiments investigating part-of-speech tag-sets, Markov model structure and smoothing. The best setup correctly identifies 79% of breaks in the test corpus. | en |
| dc.format.extent | 278629 bytes | en |
| dc.format.mimetype | application/pdf | en |
| dc.identifier.citation | Computer Speech and Language (1998) 12, 99-117. | |
| dc.identifier.issn | 0885-2308 | |
| dc.identifier.uri | http://dx.doi.org/10.1006/csla.1998.0041 | |
| dc.identifier.uri | http://hdl.handle.net/1842/1258 | |
| dc.language.iso | en | |
| dc.publisher | Academic Press | en |
| dc.title | Assigning phrase breaks from part-of-speech sequences | en |
| dc.type | Article | en |
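
The abstract describes decoding the most likely sequence of phrase breaks from part-of-speech tags with a Markov model. The following is a minimal sketch of that general idea, not the paper's implementation: it assumes two juncture states (`N` for no break, `B` for break), observes the pair of tags on either side of each word juncture, and runs Viterbi decoding. The tag names, every probability value, the `FLOOR` constant standing in for smoothing, and the function name `viterbi_breaks` are hypothetical and chosen only for illustration; the record does not state the model order, tag window, or smoothing method actually used.

```python
import math

STATES = ("N", "B")  # N = no break at the juncture, B = phrase break

# Hypothetical toy probabilities, invented for illustration only.
START = {"N": 0.9, "B": 0.1}
TRANS = {
    "N": {"N": 0.8, "B": 0.2},
    "B": {"N": 0.95, "B": 0.05},
}
# P(tag pair around the juncture | juncture type)
EMIT = {
    "N": {("DET", "NOUN"): 0.5, ("NOUN", "VERB"): 0.3,
          ("VERB", "DET"): 0.15, ("NOUN", "CONJ"): 0.05},
    "B": {("DET", "NOUN"): 0.01, ("NOUN", "VERB"): 0.09,
          ("VERB", "DET"): 0.1, ("NOUN", "CONJ"): 0.8},
}
FLOOR = 1e-6  # crude stand-in for smoothing of unseen tag pairs


def viterbi_breaks(tags):
    """Return the most likely label (N or B) for each juncture between adjacent tags."""
    junctures = list(zip(tags, tags[1:]))
    if not junctures:
        return []

    def emit(state, obs):
        return math.log(EMIT[state].get(obs, FLOOR))

    # Log-probability of the best path ending in each state at the first juncture.
    scores = {s: math.log(START[s]) + emit(s, junctures[0]) for s in STATES}
    back = []  # one dict of back-pointers per later juncture

    for obs in junctures[1:]:
        new_scores, pointers = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda p: scores[p] + math.log(TRANS[p][s]))
            new_scores[s] = scores[prev] + math.log(TRANS[prev][s]) + emit(s, obs)
            pointers[s] = prev
        scores = new_scores
        back.append(pointers)

    # Trace back from the best final state.
    state = max(STATES, key=lambda s: scores[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]
```

Calling `viterbi_breaks(["DET", "NOUN", "CONJ", "DET", "NOUN"])` labels the four junctures between the five tags. With these toy numbers the transition table discourages consecutive breaks, which is the role the abstract assigns to transitions between phrase-break states.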

Files

Original bundle

Name: Taylor 99.pdf
Size: 272.1 KB
Format: Adobe Portable Document Format
