Edinburgh Research Archive

Articulatory feature-based methods for acoustic and audio-visual speech recognition: Summary from the 2006 JHU Summer Workshop.

dc.contributor.author: Livescu, Karen (en)
dc.contributor.author: Çetin, Ozgur (en)
dc.contributor.author: Hasegawa-Johnson, Mark (en)
dc.contributor.author: King, Simon (en)
dc.contributor.author: Bartels, Chris (en)
dc.contributor.author: Borges, Nash (en)
dc.contributor.author: Kantor, Arthur (en)
dc.contributor.author: Lal, Partha (en)
dc.contributor.author: Yung, Lisa (en)
dc.contributor.author: Bezman, Ari (en)
dc.contributor.author: Dawson-Haggerty, Stephen (en)
dc.contributor.author: Woods, Bronwyn (en)
dc.contributor.author: Frankel, Joe (en)
dc.contributor.author: Magimai-Doss, Mathew (en)
dc.date.accessioned: 2007-09-18T10:01:47Z
dc.date.available: 2007-09-18T10:01:47Z
dc.date.issued: 2007
dc.description.abstract: We report on investigations, conducted at the 2006 Johns Hopkins Workshop, into the use of articulatory features (AFs) for observation and pronunciation models in speech recognition. In the area of observation modeling, we use the outputs of AF classifiers both directly, in an extension of hybrid HMM/neural network models, and as part of the observation vector, an extension of the tandem approach. In the area of pronunciation modeling, we investigate a model having multiple streams of AF states with soft synchrony constraints, for both audio-only and audio-visual recognition. The models are implemented as dynamic Bayesian networks, and tested on tasks from the Small-Vocabulary Switchboard (SVitchboard) corpus and the CUAVE audio-visual digits corpus. Finally, we analyze AF classification and forced alignment using a newly collected set of feature-level manual transcriptions. (en)
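For readers unfamiliar with the tandem approach mentioned in the abstract, the sketch below illustrates the general idea of appending articulatory-feature classifier posteriors to the per-frame acoustic observation vector. It is a minimal, hypothetical illustration: the stream names, dimensionalities, and the use of log-domain posteriors are assumptions for the example, not the workshop's actual configuration.

import numpy as np

def tandem_features(mfcc, af_posteriors, eps=1e-8):
    """Append log-posteriors from per-stream AF classifiers to acoustic features."""
    streams = []
    for name in sorted(af_posteriors):          # fixed stream order for reproducibility
        post = af_posteriors[name]              # (T, K) per-frame posteriors for one AF stream
        streams.append(np.log(post + eps))      # log domain, as is common in tandem systems
    return np.concatenate([mfcc] + streams, axis=1)

# Toy usage with random data standing in for real classifier outputs.
T = 100
mfcc = np.random.randn(T, 39)                   # e.g. MFCCs plus deltas (illustrative size)
af_posteriors = {
    "place": np.random.dirichlet(np.ones(10), size=T),
    "manner": np.random.dirichlet(np.ones(6), size=T),
    "voicing": np.random.dirichlet(np.ones(3), size=T),
}
obs = tandem_features(mfcc, af_posteriors)
print(obs.shape)                                # (100, 58)

In practice, tandem systems usually also decorrelate and reduce the appended features (for example with PCA/KLT) before training the observation model; that step is omitted here for brevity.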
dc.format.extent: 143533 bytes (en)
dc.format.mimetype: application/pdf (en)
dc.identifier.citation: K. Livescu, O. Çetin, M. Hasegawa-Johnson, S. King, C. Bartels, N. Borges, A. Kantor, P. Lal, L. Yung, A. Bezman, S. Dawson-Haggerty, B. Woods, J. Frankel, M. Magimai-Doss, and K. Saenko. Articulatory feature-based methods for acoustic and audio-visual speech recognition: Summary from the 2006 JHU Summer Workshop. In Proc. ICASSP, Honolulu, April 2007.
dc.identifier.uri: http://hdl.handle.net/1842/1998
dc.language.iso: en
dc.subject: speech technology (en)
dc.title: Articulatory feature-based methods for acoustic and audio-visual speech recognition: Summary from the 2006 JHU Summer Workshop. (en)
dc.type: Conference Paper (en)

Files

Original bundle

Name: livescu_icassp07_sum.pdf
Size: 140.17 KB
Format: Adobe Portable Document Format
