
Near-End Listening Enhancement: Theory and Application

Authors:
Sauert, B.
Editors:
Vary, P.
Ph. D. Dissertation
 
School:
IND, RWTH Aachen University
Address:
Templergraben 55, 52056 Aachen
Series:
Aachener Beiträge zu Digitalen Nachrichtensystemen (ABDN)
Number:
36
Date:
May 2014
ISBN:
978-3-86130-729-7
Language:
English

Abstract

Mobile telephony is often conducted in the presence of acoustical background noise such as traffic or babble noise. In this situation, the near-end listener perceives a mixture of clean far-end (downlink) speech and environmental noise from the near-end side, which increases listening effort and may reduce speech intelligibility. Because the noise signal often cannot be influenced, manipulating the far-end signal is the only way to effectively improve speech intelligibility and ease listening effort for the near-end listener by digital signal processing. We call this approach near-end listening enhancement (NELE).
In this thesis, innovative solutions for the problem of near-end listening enhancement are developed. These optimize the intelligibility of the far-end speech in local background noise with respect to an objective criterion, the Speech Intelligibility Index (SII). In contrast to state-of-the-art techniques, the developed methods for the first time tackle the problem from the application perspective, also considering the requirements and restrictions of realistic scenarios such as mobile phones. It is of particular importance that the processing adapts dynamically to the sound characteristics of the ambient noise. Hence, effective intelligibility enhancement is provided in the presence of background noise, while in silence no audible modification is applied. The noise tracking algorithm estimates the noise spectrum blindly from the microphone signal, which is the only access to the acoustical environment. Furthermore, a power limitation in critical bands ensures that the ear of the near-end listener is protected from damage and pain.
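As a rough illustration of the criterion named above, the following Python sketch computes a heavily simplified SII-style score. The per-band SNR is mapped linearly from [-15 dB, +15 dB] to an audibility in [0, 1] and weighted by a band-importance function. The sketch omits the masking-spread, hearing-threshold, and level-distortion terms of the full standard. The function name `simplified_sii` and the uniform band weights are assumptions made for illustration and are not taken from the thesis.

```python
def simplified_sii(speech_db, noise_db, importance):
    """Simplified SII-style score in [0, 1] (illustrative, not the full standard).

    speech_db, noise_db: per-band levels in dB (same length).
    importance: per-band weights that sum to 1 (hypothetical values here).
    """
    if not (len(speech_db) == len(noise_db) == len(importance)):
        raise ValueError("all inputs must have the same number of bands")
    score = 0.0
    for s, n, w in zip(speech_db, noise_db, importance):
        snr = s - n
        # Map SNR from [-15, +15] dB linearly to an audibility in [0, 1].
        audibility = min(1.0, max(0.0, (snr + 15.0) / 30.0))
        score += w * audibility
    return score


if __name__ == "__main__":
    bands = 4
    weights = [1.0 / bands] * bands  # hypothetical uniform band importance
    speech = [60.0, 60.0, 60.0, 60.0]
    noise = [60.0, 45.0, 75.0, 60.0]  # per-band SNRs: 0, +15, -15, 0 dB
    print(simplified_sii(speech, noise, weights))
```

In this toy setting, raising the speech level in a band only helps until that band's SNR reaches +15 dB, which hints at why redistributing power across bands can matter more than simply amplifying everything.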
In mobile phones, the restrictions of the so-called micro-loudspeakers need to be considered and were thus experimentally evaluated and modeled in this thesis. Especially the maximum thermal load of the micro-loudspeaker constitutes a major limitation. This leads to an optimization of the SII with the constraint that the total audio power may only be increased up to a maximum power.
Besides protecting the human ear, damage to the loudspeaker due to excessive excursions of the membrane or overheating must be prevented. Therefore, a loudspeaker protection scheme for mobile phones with a frequency-dependent limitation has been developed. In contrast to the human ear protection, much shorter attack time constants are required. This leads to tight constraints on the filterbank design.
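The role of the attack time constant can be sketched with a generic single-band peak limiter in Python. This is a minimal illustration under stated assumptions, not the frequency-dependent scheme of the thesis: it works on the broadband signal, uses no look-ahead, and the names `smoothing_coeff` and `limit` as well as all parameter values are hypothetical.

```python
import math


def smoothing_coeff(time_constant_s, sample_rate_hz):
    """One-pole smoothing coefficient for a given time constant."""
    return math.exp(-1.0 / (time_constant_s * sample_rate_hz))


def limit(samples, threshold, attack_s, release_s, sample_rate_hz):
    """Peak limiter sketch: gain drops quickly (attack), recovers slowly (release)."""
    a_att = smoothing_coeff(attack_s, sample_rate_hz)
    a_rel = smoothing_coeff(release_s, sample_rate_hz)
    gain = 1.0
    out = []
    for x in samples:
        level = abs(x)
        # Gain that would bring this sample exactly to the threshold.
        target = 1.0 if level <= threshold else threshold / level
        # Fast coefficient when the gain must drop, slow one when it recovers.
        coeff = a_att if target < gain else a_rel
        gain = coeff * gain + (1.0 - coeff) * target
        out.append(gain * x)
    return out


if __name__ == "__main__":
    fs = 8000.0  # hypothetical sample rate in Hz
    loud = [1.0] * 200
    limited = limit(loud, threshold=0.5, attack_s=0.001, release_s=0.1,
                    sample_rate_hz=fs)
    print(limited[0], limited[-1])
```

Because the gain is smoothed, the first samples after a sudden level jump still exceed the threshold. A shorter attack time constant shortens this overshoot, which is one way to see why fast attack matters when protecting a loudspeaker.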
Although the presented algorithms for near-end listening enhancement are driven by real application constraints, this thesis also includes the derivation of theoretical bounds, instrumental measures, and auditory evaluations. As a result, significant improvements of speech intelligibility under adverse acoustical conditions are achieved. In the most difficult scenario where an increase of total audio power is not allowed, the word recognition rate improves with the proposed algorithms by up to 22 percentage points.
It is shown that the newly developed concepts can also be applied in different devices such as mobile phones, headphones, hands-free conference terminals, car multimedia systems, public address systems, and hearing aids.

Copyright © by IKS
sauert14.pdf
This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.