PhD Exam | Tobias Kabzinski

Tobias Kabzinski successfully passed his doctoral examination in the field of Electrical Engineering and Information Technology on October 24, 2025. 

The topic of his doctoral thesis is "Signal Processing Algorithms for Adaptive Loudspeaker-Based Binaural Audio Reproduction".

Abstract: Spatial audio reproduction offers a range of promising applications in the context of augmented and virtual reality, for example, immersive teleconferencing. Binaural signal reproduction involves creating separate audio signals for each ear, allowing for natural listening and hence enabling the listener to feel immersed in a recorded or synthesized acoustic scene. Loudspeaker-based binaural audio reproduction requires equalizing the acoustic direct path and attenuating the crosstalk by means of a crosstalk cancellation (CTC) filter network. Compared to headphone-based reproduction, loudspeaker-based reproduction offers advantages, such as enhanced externalization. However, today's loudspeaker-based binaural audio reproduction systems suffer from model errors that degrade objective performance metrics and lead to poor subjective performance in localizing virtual sound sources.

To address the limitations of current loudspeaker-based reproduction systems, a concept for adaptive CTC systems, featuring microphones positioned at the listener's ears, is presented. This setup makes it possible to adapt the CTC filters to the characteristics of the listener and the acoustics of the reproduction room so that detrimental model errors can be avoided. Towards this goal, sophisticated signal processing algorithms are developed for identifying time-varying acoustic systems and for designing CTC filters. These algorithms are related and compared to state-of-the-art methods, and they are evaluated independently and within adaptive CTC systems. The methodology involves both theoretical development of algorithms and experimental validation through simulations and acoustic measurements.

The proposed concept for this type of adaptive CTC system involves two major challenges. First, system identification algorithms must be advanced, exploiting the correlated loudspeaker signals, to track the rapidly time-varying systems that result from listener movements. Second, the disadvantages of existing efficient CTC filter design methods must be overcome.

For identifying time-varying acoustic systems, state-space-based methods that estimate Finite Impulse Response (FIR) filters by means of Kalman filters are commonly applied. A theoretical connection between time-domain and frequency-domain Kalman filter formulations is established, and concepts unique to one domain are extended to the other domain. As a result, one novel variant reduces computational complexity while maintaining comparable system identification performance. To estimate state-space model parameters, heuristics are commonly used, which are likely suboptimal. To avoid them, the Expectation Maximization (EM) algorithm is applied to jointly estimate impulse responses and model parameters in a theoretically well-grounded framework. This concept is extended to the identification of Multiple Input Multiple Output (MIMO) systems and modified to impose various model structures. The resulting flexible framework allows simplifications to be exploited and consequently makes it possible to adjust the computational complexity and memory requirements to manageable levels. Applying the EM algorithm for system identification provides significantly improved performance over state-of-the-art methods.
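
As a rough illustration of the state-space idea mentioned above, the following sketch estimates the taps of a single FIR filter with a time-domain Kalman filter. It is not the algorithm from the thesis: it assumes a simple random-walk model for the taps, a single white-noise input channel instead of correlated loudspeaker signals, and hand-picked noise variances `q` and `r` (exactly the kind of heuristic parameters the EM approach is meant to replace). The function name `kalman_fir_identify` and all parameter names are hypothetical.

```python
import numpy as np


def kalman_fir_identify(x, y, num_taps, q=1e-8, r=1e-4):
    """Estimate FIR taps h from input x and observation y.

    Assumed model (illustrative only):
      h[k] = h[k-1] + w[k],      w ~ N(0, q*I)   (random-walk state)
      y[k] = x_buf[k]^T h[k] + v[k],  v ~ N(0, r)   (noisy observation)
    """
    h = np.zeros(num_taps)       # current tap estimate
    P = np.eye(num_taps)         # state error covariance
    x_buf = np.zeros(num_taps)   # most recent input samples, newest first
    for k in range(len(x)):
        # Shift the new input sample into the regression vector.
        x_buf = np.roll(x_buf, 1)
        x_buf[0] = x[k]
        # Prediction step: the random walk only inflates the covariance.
        P = P + q * np.eye(num_taps)
        # Update step: innovation, its variance, and the Kalman gain.
        e = y[k] - x_buf @ h
        s = x_buf @ P @ x_buf + r
        g = P @ x_buf / s
        h = h + g * e
        P = P - np.outer(g, x_buf @ P)
    return h


# Synthetic test signal: white-noise input through a known 4-tap system.
rng = np.random.default_rng(0)
h_true = np.array([0.8, -0.4, 0.2, 0.1])
x = rng.standard_normal(2000)
y = np.convolve(x, h_true)[: len(x)] + 0.01 * rng.standard_normal(len(x))

h_est = kalman_fir_identify(x, y, num_taps=4)
print("true taps:     ", h_true)
print("estimated taps:", np.round(h_est, 3))
```

A larger `q` lets the estimate follow faster changes at the cost of noisier taps, which is the trade-off that becomes critical when the listener moves.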

The CTC filter design problem is addressed from a control-theory perspective and from a classical FIR-filter-design perspective. Both approaches are theoretically connected and extended by frequency-dependent weightings to balance between equalizing the direct path and attenuating the crosstalk. This is a generalization of CTC filter design methods found in the literature. The commonly used least-squares-based methods for CTC filter design in the time domain and in the frequency domain are connected in a unified framework, and on this basis a novel method is proposed that combines the low computational complexity of frequency-domain methods with the precision of the time-domain method. This method allows a gradual adjustment between low complexity and high precision. As a result, excellent performance can be maintained while simultaneously reducing complexity by up to several orders of magnitude.
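
To make the frequency-domain least-squares approach mentioned above more concrete, here is a minimal sketch of a regularized per-bin matrix inversion for a two-loudspeaker, two-ear setup. It shows the plain Tikhonov-regularized design commonly found in the literature, not the weighted or hybrid methods proposed in the thesis. The acoustic paths are idealized as pure delays with fixed gains, and the function name `design_ctc_filters`, the regularization weight `beta`, and all numeric values are assumptions chosen for illustration.

```python
import numpy as np


def design_ctc_filters(H, beta=1e-3):
    """Regularized least-squares CTC filters, one 2x2 matrix per bin.

    H has shape (num_bins, ears, loudspeakers). For each bin the result
    C = (H^H H + beta*I)^{-1} H^H minimizes ||H C - I||^2 + beta*||C||^2.
    """
    Hh = np.conj(np.swapaxes(H, -1, -2))   # Hermitian transpose per bin
    eye = np.eye(H.shape[-1])
    return np.linalg.solve(Hh @ H + beta * eye, Hh)


# Idealized symmetric setup (assumed values): direct and crosstalk paths
# are modeled as pure delays with frequency-independent gains.
fs = 48000.0
freqs = np.fft.rfftfreq(512, d=1.0 / fs)
omega = 2.0 * np.pi * freqs
direct = 1.0 * np.exp(-1j * omega * 2.9e-3)   # loudspeaker to same-side ear
cross = 0.7 * np.exp(-1j * omega * 3.1e-3)    # loudspeaker to opposite ear

H = np.empty((len(freqs), 2, 2), dtype=complex)
H[:, 0, 0] = H[:, 1, 1] = direct
H[:, 0, 1] = H[:, 1, 0] = cross

C = design_ctc_filters(H)
G = H @ C   # overall response from binaural input to ear signals

# Channel separation: wanted path versus residual crosstalk at ear 0.
separation_db = 20.0 * np.log10(
    np.abs(G[:, 0, 0]) / (np.abs(G[:, 1, 0]) + 1e-12)
)
print("minimum channel separation in dB:", separation_db.min())
```

Increasing `beta` limits the filter gains at ill-conditioned frequencies but leaves more residual crosstalk, which is one simple form of the balance between direct-path equalization and crosstalk attenuation described above.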

A simulation-based proof of concept for adaptive CTC systems is conducted, demonstrating successful adaptation to the listener and the reproduction room. The proposed concept hence offers a promising solution for loudspeaker-based binaural audio reproduction. Beyond their use in adaptive CTC systems, the signal processing algorithms developed in this dissertation have broader applicability. The system identification algorithms are equally applicable in acoustic echo control and in the measurement of time-varying systems, for example in head-related transfer function (HRTF) measurements, which are also essential for headphone-based reproduction of personalized spatial audio. The CTC filter design algorithms are applicable not only in traditional model-based CTC systems to reduce the computational complexity, but also in sound field reproduction and in the design of inverse filters, for instance, in active noise control systems.
