Two-Stage Speech Enhancement Using Gated Convolutions

Authors:
Thieling, L.; Jax, P.
Book Title:
Proceedings of International Workshop on Acoustic Signal Enhancement (IWAENC)
Organization:
IEEE
Publisher:
IEEE
Pages:
pp. 1-5
Date:
Sep. 2022
ISBN:
978-1-66546-867-1
DOI:
10.1109/IWAENC53105.2022.9914693
Language:
English

Abstract

Deep neural network (DNN)-based approaches have shown remarkable results in speech enhancement under non-stationary noise conditions and in low signal-to-noise ratio (SNR) environments. One of these approaches is to first perform noise reduction through masking and subsequently restore the removed or distorted speech components, using individual DNNs for each stage. In this paper, we propose a two-stage speech enhancement system consisting of two new convolutional neural network (CNN) architectures for both stages. The architectures are designed in a way that the stages can be combined without early loss of information. That is, instead of performing noise reduction via masking directly in the first stage, we propose to incorporate the estimated soft mask into the second stage, which then serves as an initial rough estimate of the noise. For this, we propose to use gated convolutions in the second stage of the system, which facilitates an automated selection of the most important time-frequency (TF) components by the network itself. Experimental results confirm that the proposed speech enhancement system surpasses state-of-the-art systems in terms of speech quality (PESQ), intelligibility (STOI) and speech distortion (SegSNR).
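
To illustrate the gating idea mentioned in the abstract, the following is a minimal sketch of a gated convolution over a small time-frequency grid. The abstract does not give the layer definition, so this assumes the commonly used formulation in which a feature branch is multiplied element-wise by a sigmoid-activated gate branch. All function names, the two-channel input layout (noisy magnitude plus soft mask), and the hand-picked kernel weights are hypothetical and are not taken from the paper; in a trained network the weights would be learned.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def conv2d_same(channels, kernels, bias=0.0):
    """Zero-padded 'same' 2-D cross-correlation, summed over input channels.

    channels: list of C grids, each indexed [time][frequency]
    kernels:  list of C kernels with odd height and width
    """
    n_t, n_f = len(channels[0]), len(channels[0][0])
    out = [[bias] * n_f for _ in range(n_t)]
    for grid, kernel in zip(channels, kernels):
        k_t, k_f = len(kernel), len(kernel[0])
        for t in range(n_t):
            for f in range(n_f):
                acc = 0.0
                for i in range(k_t):
                    for j in range(k_f):
                        tt = t + i - k_t // 2
                        ff = f + j - k_f // 2
                        if 0 <= tt < n_t and 0 <= ff < n_f:
                            acc += kernel[i][j] * grid[tt][ff]
                out[t][f] += acc
    return out


def gated_conv2d(channels, feat_kernels, gate_kernels,
                 feat_bias=0.0, gate_bias=0.0):
    """Assumed formulation: output = features * sigmoid(gate)."""
    feat = conv2d_same(channels, feat_kernels, feat_bias)
    gate = conv2d_same(channels, gate_kernels, gate_bias)
    return [[feat[t][f] * sigmoid(gate[t][f])
             for f in range(len(feat[0]))]
            for t in range(len(feat))]


# Toy input: a noisy magnitude spectrogram and a soft mask, both 2 x 3 (T x F).
noisy = [[1.0, 2.0, 3.0],
         [4.0, 5.0, 6.0]]
soft_mask = [[1.0, 0.0, 0.5],
             [0.0, 1.0, 1.0]]

# Hand-picked 1x1 kernels (illustrative only): the feature branch reads the
# noisy channel, the gate branch reads only the soft-mask channel.
out = gated_conv2d(
    [noisy, soft_mask],
    feat_kernels=[[[1.0]], [[0.0]]],
    gate_kernels=[[[0.0]], [[10.0]]],
    gate_bias=-5.0,
)
```

With these toy weights the gate opens where the soft mask is close to 1 and closes where it is close to 0, so the mask steers which TF bins pass through rather than being multiplied onto the signal beforehand. This is only one way the mask could enter the second stage; the paper itself should be consulted for the actual architecture.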
