Colloquium - Talk Details

If you subscribe to the newsletter of the communications engineering colloquium, you will be notified of upcoming talks by e-mail in good time.

Everyone interested is cordially invited; registration is not required.

Master's thesis talk: Sound Field Conversion Using Machine Learning Methods

Leonie Geyer
Wednesday, 18 November 2020
14:00
Virtual conference room

The reproduction of realistic sound fields is necessary for the efficient evaluation of modern communication devices. Depending on the application, different microphone arrangements are used to record the sound fields, so not all desired signals are directly available in the required microphone configuration. The goal of sound field conversion is to convert a signal recorded with one recording system into the signal that a different recording system would capture in the same sound field.

This thesis compares different approaches to sound field conversion. It investigates the conversion between the signals of two specific measurement systems: a binaural artificial head and a microphone array with eight channels. Three new approaches based on artificial neural networks are developed. The first is a convolutional approach with a simple end-to-end structure. The second is a time filter approach, in which the network outputs an FIR filter that is applied to the input signal. The third is a variant of WaveNet that is divided into two subnetworks: one analyses a section of the time signal and outputs a feature vector, which is then available to the conversion network when it converts the time signal.
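As a rough illustration of the filtering step in the time filter approach, the sketch below applies a bank of FIR filters to a multichannel signal with NumPy. It is not the thesis implementation: the function name `apply_fir_bank`, the array shapes, and the hand-set coefficients are assumptions made for this example. In the approach described above, the coefficients would come from the network rather than being fixed.

```python
import numpy as np

def apply_fir_bank(x, h):
    """Filter a multichannel signal with a bank of FIR filters.

    x: input signal, shape (n_in, n_samples)
    h: FIR coefficients, shape (n_out, n_in, n_taps) (hypothetical layout)
    Returns y with shape (n_out, n_samples), where each output channel is
    the sum of all input channels filtered with their respective FIR filter.
    """
    n_out, n_in, _ = h.shape
    n_samples = x.shape[1]
    y = np.zeros((n_out, n_samples))
    for o in range(n_out):
        for i in range(n_in):
            # Full linear convolution, truncated to the input length.
            y[o] += np.convolve(x[i], h[o, i])[:n_samples]
    return y

# Placeholder data: eight input channels mapped to two output channels.
# The direction and the channel counts are assumptions for this example.
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 1024))
h = rng.standard_normal((2, 8, 64)) * 0.01  # stands in for a network output
y = apply_fir_bank(x, h)
```

Because the filter is applied by plain convolution, the conversion itself stays linear and only the choice of coefficients is learned.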

The neural networks are trained using data from real and defined acoustic environments. Performance is measured by metrics in the time and frequency domains. For comparison, sound field conversion by equalising and measuring with the target recording device is performed. Various experiments are conducted on the structure and parametrisation of the neural networks as well as on their training process, and their performance is observed and optimised. The WaveNet variant achieves the best results in all experiments, followed by the convolutional approach. The time filter approach achieves much worse results than the other two approaches. Training with a loss function that combines the mean square error with frequency metrics can reduce the error in the frequency domain, although it yields a higher error in the time domain than training with the pure mean square error.
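The trade-off between time-domain and frequency-domain error can be made concrete with a small sketch of a combined loss. The abstract does not say which frequency metrics were used, so the log-magnitude spectral distance, the function name `combined_loss`, and the `weight` parameter below are assumptions chosen for illustration only.

```python
import numpy as np

def combined_loss(y_pred, y_true, weight=0.5, eps=1e-8):
    """Mean square error plus a weighted spectral term (illustrative only).

    y_pred, y_true: arrays of shape (..., n_samples)
    weight: hypothetical factor balancing the two terms
    """
    # Time-domain term: mean square error between the waveforms.
    mse = np.mean((y_pred - y_true) ** 2)
    # Frequency-domain term: squared distance of log-magnitude spectra.
    mag_pred = np.abs(np.fft.rfft(y_pred, axis=-1))
    mag_true = np.abs(np.fft.rfft(y_true, axis=-1))
    spectral = np.mean((np.log(mag_pred + eps) - np.log(mag_true + eps)) ** 2)
    return mse + weight * spectral

# Example: compare a target signal with a noisy estimate of it.
rng = np.random.default_rng(0)
target = rng.standard_normal((2, 512))
estimate = target + 0.1 * rng.standard_normal((2, 512))
loss_value = combined_loss(estimate, target)
```

With `weight=0` the function reduces to the pure mean square error. A larger weight shifts optimisation pressure towards the spectrum, which is consistent with the lower frequency-domain error and higher time-domain error reported above.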
