
Doctoral Examination of Dr.-Ing. Matthias Schrammen

In the morning, Dr.-Ing. Matthias Schrammen presented his work in a 45-minute talk.

In the afternoon, he successfully completed his oral doctoral examination on the topic:
"Front-End Signal Processing for Far-Field Speech Communication"

Abstract: Devices for speech communication operated in hands-free mode offer a very natural way of human communication, because the user can move freely relative to the device. However, the signal-to-noise ratio (SNR) at the microphones of the device is typically low due to propagation loss, reverberation and interfering sounds such as echo or environmental noise. This requires appropriate front-end signal processing (FESP) to enhance the desired speech signal.

Nowadays, more than one communication device is typically present in a smart home environment or in a conference meeting room. A beamformer (BF) can use the microphones of multiple devices to compensate for the low initial SNR, provided that all microphone positions are known. To estimate these positions, the novel orthogonal geometric projection (OGP) approach is proposed. OGP needs only two acoustic events, such as speech or hand claps, for estimation and thus demands very little effort from the user.
To allow full-duplex speech communication, one acoustic echo canceller (AEC) per microphone channel is usually employed prior to the BF, which results in high complexity. Therefore, change prediction (ChaP) is proposed, which enables the use of a single AEC after the BF. By collecting information on the acoustic system over time, ChaP can facilitate the adaptation of the AEC such that this low-complexity single-AEC configuration can approach the performance of the high-complexity multi-AEC variant.

Conventional linear AEC is insufficient for mobile consumer devices, because their low-cost loudspeakers and amplifiers exhibit significantly nonlinear behavior when turned up to a high volume. The novel dual-stage multi-channel Kalman (DualStage-MCK) algorithm also compensates for these nonlinear effects and does not suffer from limited modelling capabilities, slow tracking or high computational complexity, which are typical drawbacks of state-of-the-art solutions.
The performance of the proposed solutions is evaluated in typical use cases and on realistic test data that includes device-specific acoustic shadowing and nonlinear effects acquired from specially manufactured tablet, smart speaker and smartphone mockups.