Improving the intelligibility of speech playback in everyday scenarios

Chermaz, Carol

dc.contributor.advisor

King, Simon

dc.contributor.advisor

Botinhao, Cassia Valentini

dc.contributor.author

Chermaz, Carol

dc.contributor.sponsor

European Commission

dc.date.accessioned

2023-11-30T15:38:15Z

dc.date.available

2023-11-30T15:38:15Z

dc.date.issued

2023-11-30

dc.description.abstract

Speech playback is widely used across the industrialized world: radio, TV and public address announcements are just some examples. Noise and reverberation can reduce intelligibility, particularly for individuals who experience hearing difficulties. As new technologies that involve speech playback (e.g., smartphones, smart speakers) become more popular, society is paying more attention to accessibility, making speech intelligibility an increasingly relevant topic. We investigate NELE (Near End Listening Enhancement) algorithms, which can be used to boost the intelligibility of speech signals before they are played back. We simulate realistic acoustic environments to test the effectiveness of existing algorithms for different hearing profiles, obtaining more insights than traditional tests run with synthetic noise. We show that such algorithms can improve intelligibility across hearing profiles in everyday listening scenarios. We then propose two novel NELE methods: one covert and the other overt. The covert strategy entails watermarking speech signals with data that can be read by a receiving device (e.g., a hearing aid), which in turn can use the data to perform better speech enhancement. The overt method audibly modifies the signal by reallocating its energy across time and frequency in a perceptually motivated way. We consider audio quality to be as important as intelligibility, and show that it is possible to increase both at the same time. The beta version of our Automatic Sound Engineer set a benchmark in the field of NELE by winning the Hurricane Challenge 2.0, while its fully automated version has also proven suitable for music production.

dc.identifier.uri

https://hdl.handle.net/1842/41247

dc.identifier.uri

http://dx.doi.org/10.7488/era/3983

dc.language.iso

en

dc.publisher

The University of Edinburgh

dc.relation.hasversion

Chermaz, C., Valentini-Botinhao, C., Schepker, H., and King, S. (2019). Near end listening enhancement in realistic environments. In 23rd International Congress on Acoustics, pages 5731–5735.

dc.relation.hasversion

Chermaz, C., Valentini-Botinhao, C., Schepker, H., and King, S. (2019). Evaluating Near End Listening Enhancement Algorithms in Realistic Environments. In Proc. Interspeech 2019, pages 1373–1377. Best Student Paper.

dc.relation.hasversion

Chermaz, C. and King, S. (2020). A sound engineering approach to near end listening enhancement. In Proc. Interspeech 2020, pages 1356–1360. Presents ASE beta, winner of the Hurricane Challenge 2.0.

dc.relation.hasversion

Chermaz, C., Leuchtmann, D., Tanner, S., and Wattenhofer, R. (2021). Compressed representation of cepstral coefficients via recurrent neural networks for informed speech enhancement. In Proc. ICASSP 2021, pages 731–735.

dc.relation.hasversion

P. V., M. S., Chermaz, C., Chimona, T., Tsiaras, V., and Stylianou, Y. (2019). Benefits of the WaveNet-based speech intelligibility enhancement for normal and hearing impaired listeners. In 23rd International Congress on Acoustics, pages 5721–5725.

dc.relation.hasversion

Muzzi, E., Chermaz, C., Castro, V., Zaninoni, M., Saksida, A., and Orzan, E. (2021). Short report on the effects of SARS-CoV-2 face protective equipment on verbal communication. European Archives of Oto-Rhino-Laryngology, 278(9):3565–3570.

dc.relation.hasversion

Chermaz, C., P. V., M. S., Raman, S., Govender, A., Paul, D., and Simantiraki, O. (2020). Enriched speech for effortless listening. Presented at ICASSP 2020 Show and Tell.

dc.subject

Near End Listening Enhancement

dc.subject

Speech Technology

dc.subject

Hearing Technology

dc.subject

Automatic Sound Engineer

dc.subject

Realistic listening scenarios

dc.subject

Realistic noise

dc.subject

Speech pre-enhancement

dc.subject

Automatic vocal mastering

dc.title

Improving the intelligibility of speech playback in everyday scenarios

dc.type

Thesis or Dissertation

dc.type.qualificationlevel

Doctoral

dc.type.qualificationname

PhD Doctor of Philosophy

Files

Name: ChermazC_2023.pdf
Size: 37.48 MB
Format: Adobe Portable Document Format

This item appears in the following Collection(s)

Informatics thesis and dissertation collection