From text to prosody without ToBI
Date
09/2002
Author
Strom, Volker
Abstract
A new method for predicting prosodic parameters, i.e. phone durations and F0 targets, from preprocessed text is presented. The prosody model comprises a set of classification and regression trees (CARTs), which are learned from a large database of labeled speech. This database need not be annotated with Tone and Break Indices (ToBI labels). Instead, a simpler symbolic prosodic description is created by a bootstrapping method. The method has been applied to one Spanish and two German speakers. For the German voices, two listening tests showed a significant preference for the new method over a more traditional approach to prosody prediction based on hand-crafted rules.