Automatic Utterance Type Detection Using Suprasegmental Features
The goal of the work presented here is to predict the type of an utterance in spoken dialogue from automatically extracted suprasegmental information. For this task we present and compare three stochastic algorithms: hidden Markov models, artificial neural nets, and classification and regression trees. These models are easily trainable, reasonably robust, and fit into the probabilistic framework required for speech recognition. Utterance type detection rests on the assumption that different types of utterances have different suprasegmental characteristics. The categorisation of utterance types is based on the theory of conversation games and consists of 12 move types (e.g. reply to a question, wh-question, acknowledgement). The system is speaker independent and is trained on spontaneous goal-directed dialogue collected from Canadian speakers. The utterance type detector is used in an automatic speech recognition system to reduce word error rate.