Edinburgh Research Archive

Improving the intelligibility of speech playback in everyday scenarios

Authors

Chermaz, Carol

Abstract

Speech playback is widely used across the industrialized world: radio, TV and public address announcements are just some examples. Noise and reverberation can reduce intelligibility, particularly for individuals with hearing difficulties. As new technologies that involve speech playback (e.g., smartphones, smart speakers) become more popular, society is paying more attention to accessibility, making speech intelligibility an increasingly relevant topic. We investigate Near End Listening Enhancement (NELE) algorithms, which can be used to boost the intelligibility of speech signals before they are played back. We simulate realistic acoustic environments to test the effectiveness of existing algorithms for different hearing profiles, obtaining more insight than traditional tests run with synthetic noise. We show that such algorithms can improve intelligibility across hearing profiles in everyday listening scenarios. We then propose two novel NELE methods: one covert and the other overt. The covert strategy entails watermarking speech signals with data that can be read by a receiving device (e.g., a hearing aid), which in turn can use the data to perform better speech enhancement. The overt method audibly modifies the signal by reallocating its energy across time and frequency in a perceptually motivated way. We consider audio quality to be as important as intelligibility, and show that it is possible to increase both at the same time. The beta version of our Automatic Sound Engineer set a benchmark in the field of NELE by winning the Hurricane Challenge 2.0, while its fully automated version has also proven suitable for music production.
