Tailoring Kalman filtering towards speaker characterisation.
This paper describes a method for obtaining smoothed vocal tract parameters from analysis during the closed phase of the glottis. The method is based upon Expectation Maximisation (EM) and uses Kalman-Rauch forward-backward iterations through a voiced segment, in which the speech data during excitation and open phases are excluded by treating them as ‘missing data’. This approach exploits the non-independence of neighbouring spectra and compensates for the small number of data points available in each closed phase, while preserving speaker-characteristic information and tracking variations in it. The vocal tract filter parameters are then used to inverse-filter the speech, yielding estimates of the source excitation. The extracted excitation signal can be used to excite other sets of parameters to produce natural-sounding speech.
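The forward-backward idea with missing data can be illustrated by a minimal sketch. It is not the paper's implementation: it assumes a scalar AR(1) vocal tract model y[t] = a_t * y[t-1] + e_t, a random-walk model for the coefficient a_t, and fixed noise variances `q` and `r` (the EM re-estimation of those variances is omitted). The names `kalman_rts_missing`, `inverse_filter`, and `use` are hypothetical. Samples flagged as missing simply skip the measurement update, and the smoothed coefficient track is then used to inverse-filter the signal.

```python
def kalman_rts_missing(y, use, q=1e-4, r=1e-2, a0=0.0, p0=1.0):
    """Smoothed track of a scalar AR(1) coefficient a_t in y[t] = a_t*y[t-1] + e_t.

    use[t] is False where sample t is treated as missing data
    (assumed here to mark excitation or open-phase samples).
    q, r, a0, p0 are illustrative assumptions, not values from the paper.
    """
    n = len(y)
    a_pred, p_pred = [0.0] * n, [0.0] * n  # one-step predictions
    a_filt, p_filt = [0.0] * n, [0.0] * n  # filtered estimates
    a, p = a0, p0
    # Forward (Kalman) pass
    for t in range(n):
        if t > 0:
            p = p + q  # time update for the random walk a_t = a_{t-1} + w_t
        a_pred[t], p_pred[t] = a, p
        if t > 0 and use[t]:
            h = y[t - 1]                # time-varying observation coefficient
            s = h * p * h + r           # innovation variance
            k = p * h / s               # Kalman gain
            a = a + k * (y[t] - h * a)  # measurement update
            p = (1.0 - k * h) * p
        # Missing samples keep the prediction: no measurement update.
        a_filt[t], p_filt[t] = a, p
    # Backward (Rauch-Tung-Striebel) smoothing pass
    a_smooth = a_filt[:]
    for t in range(n - 2, -1, -1):
        g = p_filt[t] / p_pred[t + 1]
        a_smooth[t] = a_filt[t] + g * (a_smooth[t + 1] - a_pred[t + 1])
    return a_smooth


def inverse_filter(y, a):
    """Residual e[t] = y[t] - a[t]*y[t-1], an estimate of the excitation."""
    return [y[0]] + [y[t] - a[t] * y[t - 1] for t in range(1, len(y))]


# Synthetic demo: unit impulses every 20 samples drive a filter with a = 0.9.
n, period, a_true = 200, 20, 0.9
y = []
for t in range(n):
    x = 1.0 if t % period == 0 else 0.0
    y.append((a_true * y[-1] if y else 0.0) + x)
use = [t % period != 0 for t in range(n)]  # excitation instants are "missing"
a_hat = kalman_rts_missing(y, use)
e_hat = inverse_filter(y, a_hat)
```

In this toy setting the residual `e_hat` is close to the impulse train that drove the synthetic signal. A model of realistic order would replace the scalar state with a coefficient vector and the scalar variances with covariance matrices.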