Single Speaker Segmentation and Inventory Selection Using Dynamic Time Warping Self Organization and Joint Multigram Mapping
dc.contributor.author
Aylett, Matthew
en
dc.contributor.author
King, Simon
en
dc.date.accessioned
2010-10-05T07:56:48Z
dc.date.available
2010-10-05T07:56:48Z
dc.date.issued
2008
dc.date.openingDate
2008
dc.date.updated
2010-10-05T07:56:49Z
dc.description.abstract
In speech synthesis, the inventory of units is decided by inspection and on the basis of phonological and phonetic expertise. The ephone (or emergent phone) project at CSTR is investigating how self-organisation techniques can be applied to build an inventory based on collected acoustic data together with the constraints of a synthesis lexicon. In this paper we describe a prototype inventory creation method that uses dynamic time warping (DTW) for acoustic clustering and a joint multigram approach for relating a series of symbols that represent the speech to these emerged units. We initially examined two symbol sets: (1) a baseline of standard phones and (2) orthographic symbols. The success of the approach is evaluated by comparing word boundaries generated by the emergent phones against those created using state-of-the-art HMM segmentation. Initial results suggest the DTW segmentation can match word boundaries with a root mean square error (RMSE) of 35 ms. Mapping units onto phones gave a higher RMSE of 103 ms. This error increased when multiple multigram types were added and when the default unit clustering was changed from 40 (our baseline) to 10. Orthographic matching gave an even higher RMSE of 125 ms. To conclude, we discuss future work that we believe can reduce this error to a level sufficient for the techniques to be applied to a unit selection synthesis system.
en
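The two quantities the abstract relies on, a DTW alignment cost between acoustic sequences and an RMSE over word-boundary times, can be illustrated with a minimal sketch. This is not the authors' implementation, which this record does not describe. The function names `dtw_distance` and `boundary_rmse`, the Euclidean frame distance, the unconstrained step pattern, and the use of seconds for boundary times are all assumptions made for illustration.

```python
import math


def dtw_distance(a, b):
    """Cumulative DTW cost between two sequences of feature vectors.

    Assumptions (not from the paper): Euclidean local distance and the
    basic step pattern (match, insertion, deletion) with no path constraint.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cumulative cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]


def boundary_rmse(predicted, reference):
    """Root mean square error between paired boundary times (same units)."""
    if not predicted or len(predicted) != len(reference):
        raise ValueError("need two non-empty boundary lists of equal length")
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    # Hypothetical one-dimensional "frames"; real input would be acoustic features.
    x = [[0.0], [1.0], [2.0]]
    y = [[0.0], [1.0], [1.0], [2.0]]
    print(dtw_distance(x, y))
    # Hypothetical word-boundary times in seconds.
    print(boundary_rmse([0.10, 0.50], [0.13, 0.46]))
```

With boundary times in seconds, an RMSE of 0.035 corresponds to the 35 ms figure quoted in the abstract; the sample values above are invented and are not data from the paper.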
dc.identifier.uri
http://hdl.handle.net/1842/3827
dc.title
Single Speaker Segmentation and Inventory Selection Using Dynamic Time Warping Self Organization and Joint Multigram Mapping
en
dc.type
Conference Paper
en
rps.title
In SSW06, pages 258-263, 2008.
en
Files
Original bundle
- Name: ssw06.pdf
- Size: 408.13 KB
- Format: Adobe Portable Document Format