Reconstruction techniques for improving the perceptual quality of binary masked speech.

Donald S Williamson, Deliang Wang, Yuxuan Wang

doi:10.1121/1.4884759

Open Access

Abstract

This study proposes an approach to improve the perceptual quality of speech separated by binary masking through the use of reconstruction in the time-frequency domain. Non-negative matrix factorization and sparse reconstruction approaches are investigated, both using a linear combination of basis vectors to represent a signal. In this approach, the short-time Fourier transform (STFT) of separated speech is represented as a linear combination of STFTs from a clean speech dictionary. Binary masking for separation is performed using deep neural networks or Bayesian classifiers. The perceptual evaluation of speech quality, which is a standard objective speech quality measure, is used to evaluate the performance of the proposed approach. The results show that the proposed techniques improve the perceptual quality of binary masked speech, and outperform traditional time-frequency reconstruction approaches.
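The pipeline described in the abstract, binary masking followed by reconstruction from a clean speech dictionary, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes magnitude spectrograms stored as NumPy arrays of shape (frequency bins, frames), uses an oracle mask computed from known speech and noise in place of the deep neural network or Bayesian classifier estimates mentioned above, and estimates non-negative activations with standard multiplicative updates for a Euclidean cost. The function names `ideal_binary_mask` and `nmf_reconstruct`, the threshold `lc_db`, and the iteration count are hypothetical choices made for illustration.

```python
import numpy as np


def ideal_binary_mask(speech_mag, noise_mag, lc_db=0.0):
    """Oracle binary mask: 1 where the local SNR exceeds lc_db, else 0.

    Assumption: the study estimates masks with classifiers; this oracle
    version only stands in for that step.
    """
    eps = 1e-12  # avoids division by zero and log of zero
    snr_db = 20.0 * np.log10((speech_mag + eps) / (noise_mag + eps))
    return (snr_db > lc_db).astype(float)


def nmf_reconstruct(masked_mag, dictionary, n_iter=200):
    """Approximate masked_mag as dictionary @ activations, activations >= 0.

    The dictionary (columns = clean speech magnitude spectra) stays fixed;
    only the activations are updated, using multiplicative updates that
    minimise the squared Euclidean error (an assumed cost function).
    """
    eps = 1e-12
    rng = np.random.default_rng(0)  # fixed seed for a repeatable start
    activations = rng.random((dictionary.shape[1], masked_mag.shape[1])) + eps
    wt_v = dictionary.T @ masked_mag
    wt_w = dictionary.T @ dictionary
    for _ in range(n_iter):
        activations *= wt_v / (wt_w @ activations + eps)
    return dictionary @ activations, activations


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    dictionary = rng.random((64, 20))            # toy clean speech dictionary
    speech = dictionary @ rng.random((20, 10))   # toy clean speech magnitudes
    noise = rng.random((64, 10)) * speech.mean() # toy noise magnitudes
    mask = ideal_binary_mask(speech, noise)
    masked = mask * (speech + noise)             # binary masked mixture
    reconstructed, _ = nmf_reconstruct(masked, dictionary)
    print("masked vs clean error:       ", np.linalg.norm(speech - masked))
    print("reconstructed vs clean error:", np.linalg.norm(speech - reconstructed))
```

Because the reconstruction is constrained to combinations of clean speech spectra, it can fill in time-frequency units that the mask zeroed out. Whether this helps on the toy data above depends on the random draw, so the script prints both errors rather than asserting an improvement.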

Journal: The Journal of the Acoustical Society of America
Publication Date: Aug 1, 2014
Citations: 38

Similar Papers

A sparse representation approach for perceptual quality improvement of separated speech
Donald S Williamson ... Yuxuan Wang
01 May 2013

Estimating nonnegative matrix model activations with deep neural networks to increase perceptual speech quality.
Donald S Williamson ... Yuxuan Wang
The Journal of the Acoustical Society of America | VOL. 138
01 Sep 2015

An Improved Logistic Function for Mapping Raw Scores of Perceptual Evaluation of Speech Quality (PESQ)
A Olatubosun ... Patrick O Olabisi
Journal of Engineering Research and Reports | VOL. 3
24 Nov 2018

Performance analysis of neural network, NMF and statistical approaches for speech enhancement
Ravi Kumar Kandagatla ... Venkata Subbaiah Potluri
International Journal of Speech Technology | VOL. 23
17 Sep 2020
