Colloquium - Details
Subscribe to the newsletter of the Colloquium Communications Technology to receive timely information about upcoming presentations.
Master's thesis talk: Model-based Voice Activity Detection
Geoffroy Vanderreydt
September 19, 2016
11:00 a.m.
Lecture hall 4G, IKS
The goal of Voice Activity Detection (VAD) is to detect the segments of a noisy speech signal in which speech is present. In this thesis, we present and analyze two existing model-based VADs: a statistical model-based VAD proposed by Sohn and a self-adaptive MFCC-model-based VAD proposed by Kinnunen. Based on this analysis, we attempt two modifications intended to improve Kinnunen's VAD method. The models in this VAD approach are codebooks, and our first modification extends the codebooks to Gaussian Mixture Models (GMMs). We obtain results similar to those with codebooks, but the soft decisions become more binary. Kinnunen's method is self-adaptive thanks to an initial (energy-based) VAD that selects reliable noise-only and speech+noise frames as training vectors. In the analysis, we observe that there is room for improvement by selecting the training vectors better. Our second modification therefore uses other initial VADs: Sohn's method and a back-to-back approach of Kinnunen's method. This modification does not improve the VAD performance, and we explain possible reasons for that in this thesis.
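The self-adaptive idea described in the abstract (an initial energy-based VAD labels reliable frames, which then train a noise codebook and a speech codebook) can be illustrated with a minimal sketch. This is not the implementation from the thesis or from Kinnunen's paper: the generic feature vectors stand in for MFCCs, plain k-means stands in for the codebook training, and all function names, the `fraction` parameter, and the codebook size are hypothetical choices made for illustration.

```python
import numpy as np

def frame_energies(frames):
    """Log energy per frame; frames has shape (n_frames, frame_len)."""
    return 10.0 * np.log10(np.sum(frames ** 2, axis=1) + 1e-12)

def select_training_frames(energies, fraction=0.1):
    """Assumed initial VAD: treat the lowest-energy frames as noise-only
    and the highest-energy frames as speech+noise."""
    n = max(1, int(fraction * len(energies)))
    order = np.argsort(energies)
    return order[:n], order[-n:]  # (noise indices, speech indices)

def train_codebook(vectors, size=4, iters=20, seed=0):
    """Plain k-means; returns an array of codewords (one per row)."""
    rng = np.random.default_rng(seed)
    size = min(size, len(vectors))
    codebook = vectors[rng.choice(len(vectors), size, replace=False)].copy()
    for _ in range(iters):
        dist = np.linalg.norm(vectors[:, None, :] - codebook[None, :, :], axis=2)
        nearest = dist.argmin(axis=1)
        for k in range(size):
            members = vectors[nearest == k]
            if len(members) > 0:  # keep the old codeword if a cell is empty
                codebook[k] = members.mean(axis=0)
    return codebook

def min_distance(features, codebook):
    """Distance from each feature vector to its nearest codeword."""
    dist = np.linalg.norm(features[:, None, :] - codebook[None, :, :], axis=2)
    return dist.min(axis=1)

def codebook_vad(features, energies, fraction=0.1):
    """Train both codebooks on the recording itself, then score every frame.
    A positive score means the frame is closer to the speech codebook."""
    noise_idx, speech_idx = select_training_frames(energies, fraction)
    noise_cb = train_codebook(features[noise_idx])
    speech_cb = train_codebook(features[speech_idx])
    score = min_distance(features, noise_cb) - min_distance(features, speech_cb)
    return score > 0.0, score
```

In this sketch the quality of the final decisions depends directly on which frames `select_training_frames` picks, which is the part of the pipeline the second modification in the abstract targets by swapping in other initial VADs.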
