Augmentation of adaptation data
Proc. Interspeech 2010
dc.contributor.author | Vipperla, Ravi Chander | |
dc.contributor.author | Renals, Steve | |
dc.contributor.author | Frankel, Joe | |
dc.date.accessioned | 2011-01-19T12:36:57Z | |
dc.date.available | 2011-01-19T12:36:57Z | |
dc.date.issued | 2010 | en |
dc.identifier.uri | http://hdl.handle.net/1842/4661 | |
dc.description.abstract | Linear-regression-based speaker adaptation approaches can significantly improve Automatic Speech Recognition (ASR) accuracy for a target speaker. However, when the available adaptation data is limited to a few seconds, the accuracy of the speaker-adapted models is often worse than that of speaker-independent models. In this paper, we propose an approach to select a set of reference speakers who are acoustically close to the target speaker and whose data can be used to augment the adaptation data. To determine the acoustic similarity of two speakers, we propose a distance metric based on transforming sample points in the acoustic space with the regression matrices of the two speakers. We show the validity of this approach through a speaker identification task. ASR results on the SCOTUS and AMI corpora, with limited adaptation data of 10 to 15 seconds augmented by data from selected reference speakers, show a significant improvement in Word Error Rate over both speaker-independent and speaker-adapted models. | en |
dc.title | Augmentation of adaptation data | en |
dc.type | Conference Paper | en |
rps.title | Proc. Interspeech 2010 | en |
dc.date.updated | 2011-01-19T12:36:57Z | |
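
The abstract describes a speaker distance computed by transforming sample points with the regression matrices of two speakers, then selecting reference speakers close to the target. The record does not give the exact formula, so the following is only a minimal sketch under stated assumptions: each speaker is represented by one affine transform (matrix plus bias), the distance is the mean Euclidean distance between the transformed sample points, and the names `apply_affine`, `transform_distance`, and `select_reference_speakers` are hypothetical. It is not the authors' implementation.

```python
import math


def apply_affine(matrix, bias, point):
    """Return matrix @ point + bias, using plain Python lists."""
    return [
        sum(m_ij * x_j for m_ij, x_j in zip(row, point)) + b_i
        for row, b_i in zip(matrix, bias)
    ]


def transform_distance(transform_a, transform_b, sample_points):
    """Mean Euclidean distance between the images of the sample points
    under two speakers' transforms.

    ASSUMPTION: this averaging form is illustrative only; the abstract
    does not state the exact metric.
    """
    if not sample_points:
        raise ValueError("need at least one sample point")
    total = 0.0
    for point in sample_points:
        image_a = apply_affine(transform_a[0], transform_a[1], point)
        image_b = apply_affine(transform_b[0], transform_b[1], point)
        total += math.dist(image_a, image_b)
    return total / len(sample_points)


def select_reference_speakers(target, candidates, sample_points, n_best):
    """Return the ids of the n_best candidate speakers closest to the target.

    `candidates` maps a speaker id to its (matrix, bias) transform.
    """
    ranked = sorted(
        candidates,
        key=lambda sid: transform_distance(target, candidates[sid], sample_points),
    )
    return ranked[:n_best]


# Toy 2-dimensional example (made-up transforms, not real adaptation data).
identity = ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
shifted = ([[1.0, 0.0], [0.0, 1.0]], [3.0, 4.0])
scaled = ([[2.0, 0.0], [0.0, 2.0]], [0.0, 0.0])
points = [[1.0, 0.0], [0.0, 1.0]]

closest = select_reference_speakers(
    identity, {"spk_shifted": shifted, "spk_scaled": scaled}, points, n_best=1
)
```

In this toy setup the scaled speaker lies closer to the target than the shifted one, so it would be the speaker whose data is pooled with the target's adaptation data. In a real system the transforms would come from an adaptation toolkit and the sample points from the acoustic model; both are outside the scope of this sketch.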