Prosodic phrase segmentation by pitch pattern clustering
This paper proposes a novel data-driven method for detecting the optimal sequence of prosodic phrases in continuous speech. Using the One Pass search algorithm, the pitch pattern of the input speech is divided into the prosodic segments that minimize the overall distortion with respect to pitch pattern templates of accent phrases. The templates are designed by clustering a large number of training samples of accent phrases. On the ATR continuous speech database, uttered by 10 speakers, the correct segmentation rate reached a maximum of 91.7% when the training and test data came from speakers of the same sex, and 88.6% when they came from speakers of opposite sexes.
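The idea of segmenting a pitch contour so that the total distortion against a set of templates is minimal can be illustrated with a small dynamic-programming sketch. This is not the paper's One Pass implementation: it is a simplified segmental search that, as an assumption made here, compares each candidate segment with a template after linear length normalization instead of frame-synchronous time warping. The function names, the toy templates, and the `min_len`/`max_len` parameters are all hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Tuple


def resample(seg: List[float], n: int) -> List[float]:
    """Linearly interpolate seg to n points (n >= 2)."""
    if len(seg) == 1:
        return [seg[0]] * n
    out = []
    for i in range(n):
        pos = i * (len(seg) - 1) / (n - 1)
        lo = int(pos)
        hi = min(lo + 1, len(seg) - 1)
        frac = pos - lo
        out.append(seg[lo] * (1 - frac) + seg[hi] * frac)
    return out


def segment_distortion(seg: List[float], template: List[float]) -> float:
    """Mean squared error after length normalization, weighted by segment length."""
    r = resample(seg, len(template))
    mse = sum((a - b) ** 2 for a, b in zip(r, template)) / len(template)
    return mse * len(seg)


def segment_pitch(
    f0: List[float],
    templates: List[List[float]],
    min_len: int = 2,
    max_len: Optional[int] = None,
) -> Tuple[List[Tuple[int, int, int]], float]:
    """Return ([(start, end, template_index), ...], total_distortion)."""
    T = len(f0)
    max_len = max_len or T
    inf = float("inf")
    best = [inf] * (T + 1)   # best[t]: minimal distortion of f0[:t]
    back: List[Optional[Tuple[int, int]]] = [None] * (T + 1)
    best[0] = 0.0
    for end in range(1, T + 1):
        for start in range(max(0, end - max_len), end - min_len + 1):
            if best[start] == inf:
                continue
            seg = f0[start:end]
            for k, tpl in enumerate(templates):
                cost = best[start] + segment_distortion(seg, tpl)
                if cost < best[end]:
                    best[end] = cost
                    back[end] = (start, k)
    if best[T] == inf:
        raise ValueError("no valid segmentation for the given length limits")
    # Trace the boundaries back from the end of the contour.
    segments = []
    end = T
    while end > 0:
        start, k = back[end]
        segments.append((start, end, k))
        end = start
    segments.reverse()
    return segments, best[T]


# Toy data (hypothetical): a rise-fall template and a falling template.
templates = [[0.0, 1.0, 2.0, 1.0, 0.0], [2.0, 1.5, 1.0, 0.5, 0.0]]
f0 = templates[0] + templates[1]
segments, total = segment_pitch(f0, templates)
print(segments, total)
```

In this toy case the contour is an exact concatenation of the two templates, so the search recovers the boundary between them with zero distortion. In a realistic setting the templates would instead be cluster centroids obtained from labeled accent phrases, and the contour would need speaker normalization before matching.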