Colloquium - Details

To receive timely information about upcoming presentations, subscribe to the newsletter of the Colloquium Communications Technology.

All interested students are cordially invited; registration is not required.

Master's Thesis Presentation: Optimization of an Instrumental Audio Quality Assessment Approach Using Machine Learning Methods

Elgiz Coskun
Friday, 20th October 2023

10:45 AM
IKS 4G | hybrid

The assessment of audio system playback quality involves diverse methods, including auditory tests, technical parameter measurements, and instrumental evaluation techniques. The instrumental methods emulate human auditory perception using algorithmic steps, transforming analysis results into perceptual scales. Recent advancements include applying Machine Learning (ML) and Deep Learning (DL) to instrumental assessment, enhancing prediction quality and operational efficiency. This study proposes a double-ended model for audio quality prediction, aiming to match the prediction quality of an existing method, called Multi-Dimensional Audio Quality Score (MDAQS), while improving efficiency. A Deep Neural Network (DNN) model is designed, utilizing a Convolutional Neural Network (CNN)-Encoder for feature extraction, Self-Attention for time-weighting, and specialized attention-pooling. Data is gathered from binaural measurements using various audio systems and augmented to enhance model resilience. Preprocessing includes labeling and domain transformation using a sophisticated hearing model. The model is trained with preprocessed labeled data, and its prediction results are compared to the target scores obtained via MDAQS. Next, the pretrained model is extended to predict quality dimensions directly comparable to auditory results in listening tests. Parameters of the pretrained model are kept fixed during the second training phase due to limited auditory data. Predictions are evaluated using metrics accounting for auditory result uncertainty.
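
As a rough illustration of the attention-pooling step mentioned in the abstract, the following minimal sketch collapses a sequence of per-frame feature vectors into a single vector using softmax weights. The abstract does not specify the pooling used in the thesis, so this is only one common generic form; the names `attention_pool`, `frames`, and `score_weights` are hypothetical and the scoring vector would normally be learned during training.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(frames, score_weights):
    """Pool T frame vectors (each of length D) into one vector of length D.

    frames: list of T feature vectors, e.g. encoder outputs per time frame.
    score_weights: hypothetical scoring vector of length D (assumed learned).
    Returns the pooled vector and the per-frame attention weights.
    """
    # One scalar relevance score per frame (dot product with the scoring vector).
    scores = [sum(w * x for w, x in zip(score_weights, frame)) for frame in frames]
    # Normalize the scores into weights that sum to 1 over time.
    weights = softmax(scores)
    dim = len(frames[0])
    # Weighted sum of the frames, computed per feature dimension.
    pooled = [sum(a * frame[d] for a, frame in zip(weights, frames)) for d in range(dim)]
    return pooled, weights


if __name__ == "__main__":
    frames = [[1.0, 0.0], [0.0, 1.0]]
    pooled, weights = attention_pool(frames, [0.0, 0.0])
    print(pooled, weights)
```

With an all-zero scoring vector every frame receives the same weight, so the result reduces to plain averaging over time; a non-zero scoring vector shifts the weight toward the frames it scores highest.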