Abstract

Until now, marginalization-based Missing Feature Theory (MFT) for speech classification has been limited to Log Spectral Subband Energies (LSSEs) as features. These features are highly correlated and therefore suboptimal for classification with diagonal-covariance Gaussian Mixture Models (GMMs), a common classifier in marginalization-based MFT. In this paper, we propose that Spectral Subband Centroids (SSCs) are better suited to marginalization-based MFT, as they are both decorrelated and spectrally local. Our results show that, in a marginalization-based MFT Automatic Speaker Identification (ASI) system built on diagonal-covariance GMMs, SSC features are more robust than LSSE features at all tested SNR values under Additive White Gaussian Noise (AWGN). We also show that a fully connected Deep Neural Network (DNN) can accurately estimate the Ideal Binary Mask (IBM) used for MFT.
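
To make the two central ideas concrete, the following is a minimal illustrative sketch, not the implementation used in the paper. It assumes rectangular, non-overlapping subbands and the commonly used centroid definition (the power-weighted mean frequency within each band); the function names, the `band_edges` layout, and the exponent `gamma` are hypothetical choices made for this example. The second function shows why diagonal-covariance GMMs suit marginalization: with a diagonal covariance, integrating out the features flagged as unreliable by a binary mask reduces to dropping those dimensions from each component's log-density.

```python
import numpy as np

def spectral_subband_centroids(power_spectrum, freqs, band_edges, gamma=1.0):
    """Return one centroid per subband (assumed rectangular, non-overlapping).

    Each centroid is the mean frequency of the band, weighted by the
    power spectrum raised to gamma (gamma is an assumed tuning parameter).
    Bands with zero total power are not handled in this sketch.
    """
    centroids = []
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        in_band = (freqs >= lo) & (freqs < hi)
        weights = power_spectrum[in_band] ** gamma
        centroids.append(np.sum(freqs[in_band] * weights) / np.sum(weights))
    return np.array(centroids)

def marginal_log_likelihood(x, reliable, weights, means, variances):
    """Log-likelihood of x under a diagonal-covariance GMM, using only the
    dimensions marked reliable by a binary mask.

    Shapes: x (D,), reliable (D,) bool, weights (K,), means (K, D),
    variances (K, D).
    """
    xr = x[reliable]
    mu = means[:, reliable]
    var = variances[:, reliable]
    # Per-component log-density over the reliable dimensions only.
    log_comp = -0.5 * np.sum(np.log(2.0 * np.pi * var) + (xr - mu) ** 2 / var, axis=1)
    a = np.log(weights) + log_comp
    # Numerically stable log-sum-exp over mixture components.
    m = np.max(a)
    return m + np.log(np.sum(np.exp(a - m)))
```

In a speaker identification setting, one such GMM would typically be trained per speaker, and the speaker whose model gives the highest marginal log-likelihood, summed over frames, would be selected. The mask itself could come from any estimator, such as the DNN mentioned above.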
