Automatically clustering similar units for unit selection in speech synthesis.

Black, Alan W; Taylor, Paul A

Files

Black_1997_b.pdf (53.15 KB)

Black_1997_b.ps (72.06 KB)

Abstract

This paper describes a new method for synthesizing speech by concatenating sub-word units from a database of labelled speech. A large unit inventory is created by automatically clustering units of the same phone class based on their phonetic and prosodic context. For each target unit, the appropriate cluster is then selected, offering a small set of candidate units. An optimal path is found through the candidate units based on their distance from the cluster centre and an acoustically based join cost. Details of the method and its justification are presented. Results of experiments on two different databases are given, in which various parameters of the system are optimised. A comparison with existing selection-based synthesis techniques shows the advantages of this method. The method is implemented within a full text-to-speech system, offering efficient, natural-sounding speech synthesis.
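
The path search described in the abstract can be illustrated with a minimal dynamic-programming (Viterbi-style) sketch. This is not the authors' implementation. The function name `select_units`, the `target_cost` and `join_cost` callables, the `join_weight` parameter, and the representation of a unit as a `(distance, boundary_feature)` pair are all assumptions made for illustration. Here `target_cost` stands in for a unit's distance from its cluster centre, and `join_cost` stands in for the acoustically based cost of concatenating two units.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

U = TypeVar("U")  # hypothetical unit type; the abstract does not specify one


def select_units(
    candidates: Sequence[Sequence[U]],
    target_cost: Callable[[U], float],
    join_cost: Callable[[U, U], float],
    join_weight: float = 1.0,
) -> Tuple[float, List[int]]:
    """Return (total cost, chosen candidate index per target position).

    candidates[i] is the set of candidate units offered by the cluster
    selected for target position i.
    """
    if not candidates or any(len(c) == 0 for c in candidates):
        raise ValueError("every target position needs at least one candidate")

    # best[j] = cheapest cost of any path ending in candidate j of the
    # current position.
    best = [target_cost(u) for u in candidates[0]]
    back: List[List[int]] = []  # back[i][j] = best predecessor index

    for prev_units, units in zip(candidates, candidates[1:]):
        new_best: List[float] = []
        pointers: List[int] = []
        for u in units:
            # Cost of reaching u from each candidate at the previous position.
            costs = [
                best[k] + join_weight * join_cost(p, u)
                for k, p in enumerate(prev_units)
            ]
            k_min = min(range(len(costs)), key=costs.__getitem__)
            new_best.append(costs[k_min] + target_cost(u))
            pointers.append(k_min)
        best = new_best
        back.append(pointers)

    # Trace back from the cheapest final candidate.
    j = min(range(len(best)), key=best.__getitem__)
    total = best[j]
    path = [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return total, path
```

With units modelled as `(distance, boundary_feature)` pairs, a caller might pass `target_cost=lambda u: u[0]` and `join_cost=lambda p, u: abs(p[1] - u[1])`. A real system would compare acoustic feature vectors at the join point rather than a single number.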

URI

http://hdl.handle.net/1842/1236

This item appears in the following Collection(s)

CSTR publications