Generating Synthetic Pitch Contours Using Prosodic Structure.
Date
06/2003
Author
Clark, Robert A J
Abstract
This thesis addresses the problem of generating a range of natural-sounding pitch
contours for speech synthesis to convey the specific meanings of different intonation
patterns.
Whereas other models can synthesise intonation adequately for short sentences,
longer sentences often sound unnatural because phrasing is considered only at
the sentence level. We build models within a framework of prosodic structure
derived from the linguistic analysis of a corpus of speech. We show that the use
of appropriate prosodic structure allows us to produce better contours for longer
sentences and allows us to capture the original style of the corpus. The resulting
model is also sufficiently flexible to be adapted to suitable styles for use in other
domains.
To convey specific meanings we need to be able to generate different accent
types. We find that the infrequency of some accent and boundary types makes
them hard to model from the corpus alone. We address this issue by developing
a model which allows us to isolate the parameters which control specific accent
type shapes, so that we can re-estimate these parameters based on other data.