dc.contributor.author: Yamagishi, Junichi
dc.contributor.author: Watts, Oliver
dc.date.accessioned: 2011-04-26T08:37:24Z
dc.date.available: 2011-04-26T08:37:24Z
dc.date.issued: 2010
dc.identifier.citation: Proc. Blizzard Challenge 2010 (Kyoto, Japan). [en]
dc.identifier.uri: http://hdl.handle.net/1842/4864
dc.description: The European Community’s Seventh Framework Programme (FP7/2007-2013) under Grant agreement 213845 (the EMIME project)
dc.description.abstract: In the 2010 Blizzard Challenge, we focused on improving steps relating to feature extraction and labeling in the procedures for training HMM-based speech synthesis systems. New auditory scales were used for spectral features and F0 representation. We have also adopted finer frequency bands, motivated by an auditory scale, for aperiodicity measures, which determine the level of noise in each band for mixed excitation. Further, for tighter coupling of the HMM training and automatic labeling processes, we have studied methods for stepwise bootstrap training. The listeners’ evaluation scores were much better than those of HTS-benchmark systems. More importantly, we can see some improvements even in speaker similarity, which was an acknowledged weakness of this method. In fact, speaker similarity is not a weak point of this method on tasks using smaller databases. In terms of naturalness, the new systems outperformed or competed with unit selection systems regardless of the size of the speech database used, and moreover competed with hybrid systems on smaller databases. [en]
dc.contributor.sponsor: European Commission
dc.language.iso: en [en]
dc.subject: Speech Synthesis [en]
dc.subject: HMM [en]
dc.subject: average voice [en]
dc.subject: speaker adaptation [en]
dc.title: The CSTR/EMIME HTS system for Blizzard Challenge 2010 [en]
dc.type: Conference Paper [en]

