Nonlinear speech synthesis and pitch modification techniques

Mann, Iain

dc.contributor.advisor

McLaughlin, Stephen

en

dc.contributor.author

Mann, Iain

en

dc.date.accessioned

2006-07-21T15:11:17Z

dc.date.available

2006-07-21T15:11:17Z

dc.date.issued

2000-06

dc.description.abstract

Speech synthesis technology plays an important role in many aspects of man-machine interaction, particularly in telephony applications. In order to be widely accepted, the synthesised speech quality should be as human-like as possible. This thesis investigates novel techniques for the speech signal generation stage in a speech synthesiser, based on concepts from nonlinear dynamical theory. It focuses on natural-sounding synthesis for voiced speech, coupled with the ability to generate the sound at the required pitch. The one-dimensional voiced speech time-domain signals are embedded into an appropriate higher dimensional space, using Takens' method of delays. These reconstructed state space representations have approximately the same dynamical properties as the original speech generating system and are thus effective models. A new technique for marking epoch points in voiced speech that operates in the state space domain is proposed. Using the fact that one revolution of the state space representation is equal to one pitch period, pitch synchronous points can be found using a Poincaré map. Evidently the epoch pulses are pitch synchronous and therefore can be marked. The same state space representation is also used in a locally-linear speech synthesiser. This models the nonlinear dynamics of the speech signal by a series of local approximations, using the original signal as a template. The synthesised speech is natural-sounding because, rather than simply copying the original data, the technique makes use of the local dynamics to create a new, unique signal trajectory. Pitch modification within this synthesis structure is also investigated, with an attempt made to exploit the Šilnikov-type orbit of voiced speech state space reconstructions. However, this technique is found to be incompatible with the locally-linear modelling technique, leaving the pitch modification issue unresolved.
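
As a rough illustration of the embedding and epoch-marking ideas summarised above, the following sketch builds a delay embedding of a synthetic periodic signal and marks one point per revolution by detecting crossings of a Poincaré section. It is not taken from the thesis: the function names `delay_embed` and `poincare_crossings`, the embedding dimension, the delay, the two-harmonic test signal and the choice of section (a hyperplane through the origin) are all assumptions made for this example.

```python
import numpy as np

def delay_embed(x, dim, tau):
    """Delay embedding: row n is [x[n], x[n + tau], ..., x[n + (dim - 1) * tau]]."""
    x = np.asarray(x, dtype=float)
    n_vectors = len(x) - (dim - 1) * tau
    if n_vectors <= 0:
        raise ValueError("signal too short for this dim and tau")
    return np.stack([x[i * tau : i * tau + n_vectors] for i in range(dim)], axis=1)

def poincare_crossings(states, point, normal):
    """Indices where the trajectory crosses the hyperplane through `point`
    with normal vector `normal`, moving from the negative to the positive side."""
    side = (states - point) @ normal
    return np.flatnonzero((side[:-1] < 0) & (side[1:] >= 0)) + 1

# Synthetic stand-in for a voiced sound (assumption): two harmonics,
# with a period of 80 samples. Real speech would replace this array.
period = 80
n = np.arange(10 * period)
theta = 2 * np.pi * (n + 0.5) / period
x = np.sin(theta) + 0.5 * np.sin(2 * theta)

# dim=3 and tau=10 are illustrative values, not values from the thesis.
states = delay_embed(x, dim=3, tau=10)

# Section chosen for illustration: the hyperplane where the first coordinate is zero.
marks = poincare_crossings(states, point=np.zeros(3), normal=np.array([1.0, 0.0, 0.0]))
print(np.diff(marks))  # spacing between successive pitch-synchronous marks
```

On this synthetic signal the trajectory crosses the section once per revolution, so successive marks are 80 samples apart, matching the period. For real speech the section would need to be placed with care, since a poorly chosen hyperplane can be crossed more than once per pitch period.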
A different modelling strategy, using a radial basis function neural network to model the state space dynamics, is then considered. This produces a parametric model of the speech sound. Synthesised speech is obtained by connecting a delayed version of the network output back to the input via a global feedback loop. The network then synthesises speech in a free-running manner. Stability of the output is ensured by using regularisation theory when learning the weights. Complexity is also kept to a minimum because the network centres are fixed on a data-independent hyper-lattice, so only the linear-in-the-parameters weights need to be learnt for each vowel realisation. Pitch modification is again investigated, based around the idea of interpolating the weight vector between different realisations of the same vowel, but at differing pitch values. However, modelling the inter-pitch weight vector variations is very difficult, indicating that further study of pitch modification techniques is required before a complete nonlinear synthesiser can be implemented.
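
The structure described in this paragraph can be sketched as follows: Gaussian basis functions with centres fixed on a lattice, weights learnt by regularised least squares, and free-running synthesis by feeding the output back into the input. This is a minimal sketch under stated assumptions, not the thesis implementation. The helper names (`rbf_design`, `fit_weights`, `free_run`), the Gaussian kernel, the lattice range and spacing, the unit-delay two-dimensional state and the ridge penalty `lam` (used here as a simple stand-in for the regularisation scheme) are all hypothetical choices, and the sketch makes no claim about stability of the free-running output.

```python
import numpy as np
from itertools import product

def rbf_design(states, centres, width):
    """Gaussian basis activations: one row per state, one column per centre."""
    d2 = ((states[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * width ** 2))

def fit_weights(states, targets, centres, width, lam):
    """Ridge-regularised least squares for the linear output weights."""
    phi = rbf_design(states, centres, width)
    a = phi.T @ phi + lam * np.eye(phi.shape[1])
    return np.linalg.solve(a, phi.T @ targets)

def free_run(seed, weights, centres, width, n_samples):
    """Generate samples by feeding each output back in as the newest input."""
    dim = centres.shape[1]
    buf = list(seed)  # oldest sample first
    out = []
    for _ in range(n_samples):
        state = np.array(buf[-dim:])[None, :]
        y = rbf_design(state, centres, width)[0] @ weights
        out.append(y)
        buf.append(y)
    return np.array(out)

# Illustrative settings (assumptions, not values from the thesis).
dim, width, lam = 2, 0.5, 1e-6

# Data-independent lattice of centres: 9 points per axis over [-2, 2].
axis = np.linspace(-2.0, 2.0, 9)
centres = np.array(list(product(axis, repeat=dim)))

# Toy training signal standing in for one vowel realisation: a sinusoid.
x = np.sin(2 * np.pi * np.arange(400) / 40)
states = np.stack([x[i : len(x) - dim + i] for i in range(dim)], axis=1)
targets = x[dim:]  # one-step-ahead prediction targets

weights = fit_weights(states, targets, centres, width, lam)
rms = np.sqrt(np.mean((rbf_design(states, centres, width) @ weights - targets) ** 2))
synth = free_run(x[:dim], weights, centres, width, 200)
print(rms, synth[:5])
```

Because the centres do not depend on the data, only the weight vector changes from one training signal to the next, which is what makes interpolating weight vectors between realisations conceivable in the first place.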

en

dc.format.extent

2849614 bytes

en

dc.format.mimetype

application/pdf

en

dc.identifier.uri

http://hdl.handle.net/1842/1378

dc.language.iso

en

dc.publisher

University of Edinburgh. College of Science and Engineering. School of Engineering and Electronics

en

dc.subject.other

Ph.D. Thesis

en

dc.title

Nonlinear speech synthesis and pitch modification techniques

en

dc.title.alternative

An Investigation of nonlinear speech synthesis and pitch modification techniques

dc.type

Thesis or Dissertation

en

dc.type.qualificationlevel

Doctoral

en

dc.type.qualificationname

PhD Doctor of Philosophy

en

Files

Original bundle

Name: mann.pdf
Size: 2.72 MB
Format: Adobe Portable Document Format

This item appears in the following Collection(s)

Engineering thesis and dissertation collection