Improving Intelligibility in Noise of HMM-Generated Speech via Noise-Dependent and -Independent Methods

Authors:
Valentini-Botinhao, C., Godoy, E., Stylianou, Y., Sauert, B., King, S., Yamagishi, J.
Book Title:
Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
Organization:
IEEE
Pages:
pp. 7854-7858
Date:
May 2013
DOI:
10.1109/ICASSP.2013.6639193
Language:
English

Abstract

In order to improve the intelligibility of HMM-generated Text-to-Speech (TTS) in noise, this work evaluates several speech enhancement methods, exploring combinations of noise-independent and -dependent approaches as well as algorithms previously developed for natural speech. We evaluate one noise-dependent method proposed for TTS, based on the glimpse proportion measure, and three approaches originally proposed for natural speech: one that estimates the noise and is based on the speech intelligibility index, and two noise-independent methods based on different spectral shaping techniques followed by dynamic range compression. We demonstrate how these methods influence the average spectra for different phone classes. We then present results of a listening experiment with speech-shaped noise and a competing speaker. A few methods made the TTS voice even more intelligible than the natural one. Although noise-dependent methods did not improve gains, the intelligibility differences found in distinct noises motivate such dependency.

Download

Copyright © by IEEE
valentini-botinhao13.pdf
© 2024 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.