Using prosodic structure to improve pitch range variation in text to speech synthesis.
Clark, Robert A J
The intonation produced by current text-to-speech systems is often either flat or artificial-sounding. Pitch range is one contributing factor that could be improved with more detailed linguistic knowledge. In this study, a corpus of read speech is analysed to provide information about prosodic structure and pitch range that can be used to improve intonation models for speech synthesis. The results show that pitch range variation is most apparent at the tone-group level of prosodic structure, and that phrase-initial and phrase-final tone groups have significantly different pitch ranges from phrase-medial tone groups.
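
As an illustration only, the kind of comparison described above could be sketched as follows. The abstract does not state how pitch range was measured or how the corpus was annotated, so everything here is an assumption: pitch range is taken as the span between a tone group's minimum and maximum F0 in semitones, each phrase is represented as a list of (f0_min, f0_max) pairs in Hz, and the function names are hypothetical.

```python
import math
from collections import defaultdict
from statistics import mean


def pitch_range_semitones(f0_min_hz, f0_max_hz):
    """Span between minimum and maximum F0 in semitones.

    Assumption: the study's actual pitch range measure is not given
    in the abstract; semitones are used here for illustration.
    """
    return 12.0 * math.log2(f0_max_hz / f0_min_hz)


def position_in_phrase(index, n_groups):
    """Label a tone group by its position within its phrase.

    Assumption: a phrase containing a single tone group is counted
    as phrase-initial.
    """
    if index == 0:
        return "initial"
    if index == n_groups - 1:
        return "final"
    return "medial"


def mean_range_by_position(phrases):
    """Average the pitch range of tone groups for each phrase position.

    `phrases` is a list of phrases; each phrase is a list of
    (f0_min_hz, f0_max_hz) pairs, one pair per tone group.
    """
    ranges = defaultdict(list)
    for phrase in phrases:
        for i, (f0_min, f0_max) in enumerate(phrase):
            position = position_in_phrase(i, len(phrase))
            ranges[position].append(pitch_range_semitones(f0_min, f0_max))
    return {position: mean(values) for position, values in ranges.items()}
```

Calling `mean_range_by_position` on an annotated corpus would return one mean pitch range per position ("initial", "medial", "final"). Deciding whether the positions differ significantly would need a statistical test on the per-group values, which this sketch leaves out.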