Personalising speech-to-speech translation in the EMIME project
Proc. of the ACL 2010 System Demonstrations
Date
2010
Author
Kurimo, Mikko
Byrne, William
Dines, John
Garner, Philip N.
Gibson, Matthew
Guan, Yong
Hirsimäki, Teemu
Karhila, Reima
King, Simon
Liang, Hui
Oura, Keiichiro
Saheer, Lakshmi
Shannon, Matt
Shiota, Sayaka
Tian, Jilei
Tokuda, Keiichi
Wester, Mirjam
Wu, Yi-Jian
Yamagishi, Junichi
Abstract
In the EMIME project we have studied unsupervised cross-lingual speaker adaptation. We employ an HMM statistical framework for both speech recognition and synthesis, which provides transformation mechanisms for adapting the synthesized voice in TTS (text-to-speech) using the voice recognized in ASR (automatic speech recognition). An important application of this research is personalised speech-to-speech translation, which uses the voice of the speaker in the input language to utter the translated sentences in the output language. In mobile environments this enhances users' interaction across language barriers by making the output speech sound more like the original speaker's way of speaking, even if he or she cannot speak the output language.
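The adaptation mechanism the abstract refers to can be pictured with a small numerical sketch. The Python/NumPy example below is a hypothetical illustration, assuming a single global MLLR-style mean transform estimated by occupancy-weighted least squares; the actual EMIME systems estimate transforms within the HMM framework (e.g. CMLLR in HTK/HTS style) rather than with this simplification, and all names and data below are invented for the example.

```python
# A minimal sketch of cross-lingual speaker adaptation: estimate a
# linear transform of Gaussian means on the ASR side and reuse it to
# adapt the TTS voice. Everything here is illustrative: the toy data,
# the single global transform, and the occupancy-weighted least-squares
# estimator (real MLLR/CMLLR uses EM statistics and covariances).
import numpy as np

def estimate_mean_transform(model_means, speaker_means, occupancies):
    """Fit a global affine transform W (d x (d+1)) so that
    W @ [mu; 1] approximates the speaker-specific means,
    weighting each Gaussian by its adaptation-data occupancy."""
    n, d = model_means.shape
    ext = np.hstack([model_means, np.ones((n, 1))])   # extended means [mu; 1]
    w = np.sqrt(occupancies)[:, None]                 # sqrt weights for LSQ
    W_t, *_ = np.linalg.lstsq(w * ext, w * speaker_means, rcond=None)
    return W_t.T                                      # shape (d, d + 1)

def adapt_means(W, model_means):
    """Apply the transform to any set of Gaussian means (ASR or TTS)."""
    ext = np.hstack([model_means, np.ones((model_means.shape[0], 1))])
    return ext @ W.T

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, n_asr, n_tts = 13, 200, 150            # hypothetical dimensions

    asr_means = rng.normal(size=(n_asr, d))   # input-language ASR model
    tts_means = rng.normal(size=(n_tts, d))   # output-language TTS model

    # Fabricate a "true" speaker characteristic as an affine map, then
    # simulate noisy speaker-specific statistics from the ASR side.
    A = np.eye(d) + 0.1 * rng.normal(size=(d, d))
    b = 0.5 * rng.normal(size=d)
    speaker_stats = asr_means @ A.T + b + 0.01 * rng.normal(size=(n_asr, d))
    occupancies = rng.uniform(1.0, 10.0, size=n_asr)

    # Estimate the transform on the ASR side ...
    W = estimate_mean_transform(asr_means, speaker_stats, occupancies)
    # ... and carry it across languages to personalise the TTS voice.
    adapted_tts = adapt_means(W, tts_means)
    print("adapted TTS means:", adapted_tts.shape)
```

The point the sketch mirrors is that the same transform, estimated against the recognition models, is applied unchanged to the synthesis models; this is what lets the TTS output take on the input speaker's characteristics without requiring any output-language speech from that speaker.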