Bayesian modelling of vowel segment duration for text-to-speech synthesis using distinctive features
Goubanova, Olga V
We report the results of applying the Bayesian belief network (BN) approach to predicting vowel duration. Bayesian inference of vowel duration is performed on a hybrid Bayesian network consisting of discrete and continuous nodes, with the nodes representing the linguistic factors that affect segment duration. New to the present research, we model the segment identity factor as a set of distinctive features: height, frontness, length, and roundness. We also experimented with a word class feature that implicitly represents word frequency information. We contrasted the results of the belief network model with those of the sums-of-products (SoP) model and the classification and regression tree (CART) model, training and testing all three models on the same data. In terms of RMS error and correlation coefficient, our BN model performs no worse than the SoP model and significantly outperforms the CART model.
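
To make the feature representation and the two evaluation measures concrete, the following is a minimal Python sketch, not the implementation used in this work. The vowel inventory, the feature values assigned to each vowel, the names VOWEL_FEATURES, rms_error and correlation, and the duration values in the usage lines are all assumptions chosen for illustration; the abstract names only the four features and the two measures.

```python
import math

# Hypothetical feature table: each vowel is described by the four
# distinctive features named in the abstract instead of by its identity.
# The inventory and the values are illustrative assumptions.
VOWEL_FEATURES = {
    "i:": {"height": "high", "frontness": "front", "length": "long", "roundness": "unrounded"},
    "u:": {"height": "high", "frontness": "back", "length": "long", "roundness": "rounded"},
    "ae": {"height": "low", "frontness": "front", "length": "short", "roundness": "unrounded"},
}


def rms_error(predicted, observed):
    """Root-mean-square error between predicted and observed durations."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def correlation(predicted, observed):
    """Pearson correlation coefficient between predicted and observed durations."""
    if len(predicted) != len(observed) or len(predicted) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    if var_p == 0 or var_o == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_p * var_o)


# Made-up durations in milliseconds, used only to exercise the functions.
observed_ms = [80.0, 120.0, 95.0, 150.0]
predicted_ms = [85.0, 110.0, 100.0, 140.0]
print(rms_error(predicted_ms, observed_ms))
print(correlation(predicted_ms, observed_ms))
```

Describing a vowel by features rather than by identity lets vowels that share a feature value share statistics, which is one plausible motivation for the representation; a lower RMS error and a higher correlation both indicate closer agreement between predicted and observed durations.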