Listening Enhancement in Noisy Environments: Solutions in Time and Frequency Domain

Authors:
Niermann, M.; Vary, P.
Journal:
IEEE/ACM Transactions on Audio, Speech, and Language Processing
Volume:
29
Page(s):
699-709
Date:
Dec. 2020
ISSN:
2329-9304
DOI:
10.1109/TASLP.2020.3047234
Language:
English

Abstract

The intelligibility of speech from a telephone or a public address system is often affected by acoustical background noise in the near-end listening environment. Speech intelligibility and listening effort can be improved by adaptive pre-processing of the loudspeaker signal. This is called Near-End Listening Enhancement (NELE). The speech spectrum is dynamically modified, taking the acoustical background noise at the near-end into account.
In this paper, two opposite NELE strategies are proposed, Noise-Masking-Proportional Shaping and Noise-Masking-Inverse Shaping, each appropriate for different noise characteristics. Both strategies are formulated in closed form in the frequency domain. They do not require optimizing an intelligibility measure; instead, they use the masking threshold explicitly. Motivated by the frequency-domain approach, a simpler time-domain solution is derived that is based on linear prediction techniques and does not need the masking calculations. The proposed NELE solutions outperform state-of-the-art methods in terms of computational complexity, memory requirement, continuous processor load, and latency.
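
The abstract does not give the actual NELE filters, so the following is only a minimal sketch of the generic building block it names for the time-domain solution: linear prediction. It estimates an all-pole spectral envelope of a noise frame using the autocorrelation method and the Levinson-Durbin recursion, without an FFT or a masking model. All function names, the AR(1) test signal, and the parameter values are assumptions made for illustration; how the paper maps such an envelope to a shaping filter is not shown here.

```python
import cmath
import math
import random


def autocorrelation(frame, max_lag):
    """Biased autocorrelation r[0..max_lag] of one signal frame."""
    n = len(frame)
    return [sum(frame[t] * frame[t + k] for t in range(n - k)) / n
            for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations.

    Returns (a, err), where A(z) = 1 + a[1] z^-1 + ... + a[order] z^-order
    is the prediction-error filter and err is the residual power.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:  # silent or degenerate frame: stop early
            break
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient of stage i
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a, err


def lpc_envelope(a, err, num_bins):
    """All-pole power envelope err / |A(e^{jw})|^2 on num_bins points in [0, pi]."""
    env = []
    for b in range(num_bins):
        w = math.pi * b / (num_bins - 1)
        a_of_w = sum(c * cmath.exp(-1j * w * m) for m, c in enumerate(a))
        env.append(err / abs(a_of_w) ** 2)
    return env


def demo_noise(num_samples, pole=0.9, seed=0):
    """Hypothetical test signal: low-pass AR(1) noise x[n] = pole * x[n-1] + w[n]."""
    rng = random.Random(seed)
    x, prev = [], 0.0
    for _ in range(num_samples):
        prev = pole * prev + rng.gauss(0.0, 1.0)
        x.append(prev)
    return x


if __name__ == "__main__":
    noise = demo_noise(20000)
    coeffs, residual = levinson_durbin(autocorrelation(noise, 2), 2)
    print("prediction-error filter:", coeffs)
    print("envelope (DC .. Nyquist):", lpc_envelope(coeffs, residual, 5))
```

For the low-pass test noise, the estimated envelope is largest at low frequencies. A proportional strategy would place speech energy where this envelope is high and an inverse strategy where it is low, but that reading is an interpretation of the strategy names, not a formula taken from the paper.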

Download

Copyright © by IKS
niermann20a.pdf
This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.