Edinburgh Research Archive

Evaluation of the Vulnerability of Speaker Verification to Synthetic Speech

dc.contributor.author
De Leon, P.L.
en
dc.contributor.author
Pucher, M.
en
dc.contributor.author
Yamagishi, Junichi
en
dc.date.accessioned
2011-01-19T11:11:50Z
dc.date.available
2011-01-19T11:11:50Z
dc.date.issued
2010
dc.date.updated
2011-01-19T11:11:50Z
dc.description.abstract
In this paper, we evaluate the vulnerability of a speaker verification (SV) system to synthetic speech. Although this problem was first examined over a decade ago, dramatic improvements in both SV and speech synthesis have renewed interest in it. We use an HMM-based speech synthesizer, which creates synthetic speech for a targeted speaker through adaptation of a background model, and a GMM-UBM-based SV system. Using 283 speakers from the Wall Street Journal (WSJ) corpus, our SV system has a 0.4% EER. When the system is tested with synthetic speech generated from speaker models derived from the WSJ corpus, 90% of the matched claims are accepted. This result suggests a possible vulnerability in SV systems to synthetic speech. In order to detect synthetic speech prior to recognition, we investigate the use of an automatic speech recognizer (ASR), the dynamic time warping (DTW) distance of mel-frequency cepstral coefficients (MFCC), and the previously proposed average inter-frame difference of log-likelihood (IFDLL). Overall, while SV systems have impressive accuracy, even with the proposed detector, high-quality synthetic speech can lead to an unacceptably high acceptance rate of synthetic speakers.
en
dc.identifier.uri
http://hdl.handle.net/1842/4659
dc.title
Evaluation of the Vulnerability of Speaker Verification to Synthetic Speech
en
dc.type
Conference Paper
en
rps.title
Proc. Odyssey (The speaker and language recognition workshop) 2010
en

Files

Original bundle

Name:
main_v2.pdf
Size:
873.52 KB
Format:
Adobe Portable Document Format
