Speech-Codebook Based Soft Voice Activity Detection

Authors:
Heese, F.; Niermann, M.; Vary, P.
Book Title:
Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
Organization:
IEEE
Pages:
pp. 4335-4339
Date:
Apr. 2015
DOI:
10.1109/ICASSP.2015.7178789
Language:
English

Abstract

A novel noise-robust soft Voice Activity Detector (VAD) operating in the short-time Fourier domain is presented. A speech energy gain is obtained by frame-wise processing of a noisy speech signal with a speech codebook algorithm. This gain can be used for robust voice detection. A speaker-independent speech codebook, consisting of spectral envelopes, is created in the training process. When the algorithm is applied, the codebook is adapted in every frame to the current speaker by combining the harmonic pitch structure of the current noisy speech frame with the codebook entries. Soft VAD values ranging from zero to one are calculated by post-processing the speech gain, which is obtained using gain-shape vector quantization. A binary VAD decision is obtained by applying a threshold. The proposed method does not rely on a priori knowledge of the noise and is robust with respect to highly non-stationary noise and adverse SNR conditions. In addition, it is possible to trade off the detection rate against the false-alarm rate by varying a threshold without increasing the total number of misdetections. Compared to state-of-the-art VAD systems, the proposed method is characterized by better detection rates at significantly lower false-alarm rates.
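
The gain-shape matching and thresholding steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy codebook, the default threshold, and in particular the mapping from gain to a soft value (here the fraction of frame energy explained by the best codebook entry) are assumptions made for illustration. The per-frame pitch adaptation of the codebook is omitted.

```python
import math


def normalize(vec):
    """Scale a vector to unit Euclidean norm (returned unchanged if all zero)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0.0 else list(vec)


def gain_shape_match(frame, shapes):
    """Gain-shape VQ against unit-norm shapes.

    For a unit-norm shape s, the least-squares gain for frame x is <x, s>,
    so the best-fitting shape under a non-negative gain is the one with the
    largest inner product. Returns (index, gain).
    """
    gains = [sum(x * s for x, s in zip(frame, shape)) for shape in shapes]
    index = max(range(len(shapes)), key=lambda i: gains[i])
    return index, max(0.0, gains[index])


def soft_vad(frame, shapes):
    """Hypothetical post-processing: share of frame energy explained by the
    best shape. Lies in [0, 1] by the Cauchy-Schwarz inequality."""
    energy = sum(x * x for x in frame)
    if energy == 0.0:
        return 0.0
    _, gain = gain_shape_match(frame, shapes)
    return min(1.0, gain * gain / energy)


def binary_vad(frame, shapes, threshold=0.5):
    """Binary decision from the soft value; the threshold value is an assumption."""
    return soft_vad(frame, shapes) >= threshold


# Toy codebook of two spectral-envelope shapes (illustrative values only).
CODEBOOK = [normalize([1.0, 0.0, 0.0, 0.0]), normalize([0.0, 0.0, 1.0, 1.0])]
```

Raising the threshold in `binary_vad` lowers the false-alarm rate at the cost of missed detections, which is the trade-off the abstract refers to.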

Copyright © by IEEE
heese15a.pdf
© 2024 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.