Abstract

Enhancing speech captured by distant microphones is a challenging task. In this study, we investigate the multichannel signal properties of a single acoustic vector sensor (AVS) to obtain the inter-sensor data ratio (ISDR) model in the time-frequency (TF) domain. Monotone functions describing the relationship between the ISDRs and the direction of arrival (DOA) of the target speaker are then derived. For the target speech enhancement (SE) task, the DOA of the target speaker is given and the ISDRs are calculated, so the TF components dominated by the target speech can be extracted with high probability using the established monotone functions, from which a nonlinear soft mask of the target speech is generated. The result is a masking-based speech enhancement method, termed AVS-SMASK. Extensive experiments with simulated and recorded data validate the effectiveness of the proposed AVS-SMASK method in suppressing spatial speech interferences and reducing the adverse impact of additive background noise while introducing little speech distortion. Moreover, AVS-SMASK is computationally inexpensive, and the AVS has a small physical size. These merits are favorable for many applications, such as robot auditory systems.
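
The Python sketch below illustrates the kind of pipeline the abstract describes, under a simplified two-dimensional far-field AVS model: ISDRs are computed from the STFTs of the pressure and particle-velocity channels, compared with the values expected for the given target DOA, and turned into a nonlinear soft mask. The channel layout, the Gaussian-shaped mask, and the parameter values are illustrative assumptions, not the paper's exact design.

```python
# Illustrative sketch of an ISDR-based soft-mask pipeline for a single AVS.
# Assumptions (not from the paper): a simplified 2-D far-field AVS model with
# channels o (omnidirectional pressure), ux and uy (particle-velocity gradients),
# a Gaussian-shaped soft mask, and hypothetical parameter values.
import numpy as np
from scipy.signal import stft, istft

def avs_smask_sketch(o, ux, uy, doa_deg, fs=16000, nfft=512, beta=20.0):
    """Enhance the target at azimuth `doa_deg` from AVS channels o, ux, uy."""
    # Time-frequency representations of the three AVS channels.
    f, t, O = stft(o, fs, nperseg=nfft)
    _, _, Ux = stft(ux, fs, nperseg=nfft)
    _, _, Uy = stft(uy, fs, nperseg=nfft)

    eps = 1e-12
    # Inter-sensor data ratios (ISDRs): velocity channels over the pressure channel.
    # For TF bins dominated by a far-field source at azimuth phi, the real parts
    # tend toward cos(phi) and sin(phi), respectively.
    r_x = np.real(Ux / (O + eps))
    r_y = np.real(Uy / (O + eps))

    phi = np.deg2rad(doa_deg)
    # Distance between the observed ISDRs and the values expected for the target DOA.
    d = (r_x - np.cos(phi)) ** 2 + (r_y - np.sin(phi)) ** 2

    # Nonlinear soft mask: bins close to the expected ISDRs are kept,
    # distant (interference- or noise-dominated) bins are attenuated.
    mask = np.exp(-beta * d)

    # Apply the mask to the pressure channel and return to the time domain.
    _, s_hat = istft(mask * O, fs, nperseg=nfft)
    return s_hat
```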

Highlights

  • With the development of information technology, intelligent service robots will play an important role in smart home systems

  • The speech sources, taken from the Institute of Electrical and Electronics Engineers (IEEE) speech corpus [26], are placed in front of the acoustic vector sensor (AVS) at a distance of one meter; the input signal-to-interference ratio (SIR) is set to 0 dB, the input signal-to-noise ratio (SNR) is set to 10 dB, and the signals are recorded at 48 kHz and down-sampled to 16 kHz for processing (see the mixing sketch after this list)

  • A nonlinear soft mask has been designed by making use of speech time-frequency (TF) sparsity with the known direction of arrival (DOA) of the target speaker
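
As a rough illustration of the experimental operating point quoted above, the sketch below mixes a target, an interferer, and noise at an input SIR of 0 dB and an input SNR of 10 dB, after down-sampling from 48 kHz to 16 kHz. The helper names and the use of SciPy are assumptions for illustration, not part of the paper.

```python
# Sketch of constructing a test mixture at the stated operating point
# (SIR-input = 0 dB, SNR-input = 10 dB, 48 kHz recordings down-sampled to 16 kHz).
import numpy as np
from scipy.signal import resample_poly

def scale_to_ratio(reference, signal, ratio_db):
    """Scale `signal` so that 10*log10(P_reference / P_signal) equals `ratio_db`."""
    p_ref = np.mean(reference ** 2)
    p_sig = np.mean(signal ** 2) + 1e-12
    gain = np.sqrt(p_ref / (p_sig * 10 ** (ratio_db / 10.0)))
    return gain * signal

def make_mixture(target_48k, interferer_48k, noise_48k, sir_db=0.0, snr_db=10.0):
    # Down-sample 48 kHz -> 16 kHz for processing.
    target = resample_poly(target_48k, 1, 3)
    interferer = resample_poly(interferer_48k, 1, 3)
    noise = resample_poly(noise_48k, 1, 3)
    # Trim to a common length before mixing.
    n = min(len(target), len(interferer), len(noise))
    target, interferer, noise = target[:n], interferer[:n], noise[:n]
    # Scale the interferer and noise relative to the target power.
    interferer = scale_to_ratio(target, interferer, sir_db)
    noise = scale_to_ratio(target, noise, snr_db)
    return target + interferer + noise
```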

Summary

Introduction

With the development of information technology, intelligent service robots will play an important role in smart home systems. Auditory perception is one of the key technologies of intelligent service robots [1]. Service robots typically operate in noisy environments and may also face directional spatial interferences, such as competing speakers at different locations, air conditioners, and so on. Additive background noise and spatial interferences significantly deteriorate the quality and intelligibility of the target speech, so speech enhancement (SE) is considered a key preprocessing technique for speech applications such as automatic speech recognition [5]. The well-known single-channel SE methods, including spectral subtraction, Wiener filtering, and their variations, are successful at suppressing additive background noise, but they are not able to suppress spatial interferences effectively [6].
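
For context, a minimal spectral-subtraction baseline of the kind referred to here can be sketched as follows. The noise estimate taken from a noise-only lead-in segment and the spectral floor are illustrative assumptions, not a specific implementation from the literature cited.

```python
# Minimal single-channel spectral-subtraction sketch,
# assuming the first 0.25 s of the input are noise-only.
import numpy as np
from scipy.signal import stft, istft

def spectral_subtraction(x, fs=16000, nfft=512, noise_sec=0.25, floor=0.05):
    f, t, X = stft(x, fs, nperseg=nfft)
    # Number of STFT frames in the assumed noise-only lead-in (hop = nfft // 2).
    n_noise = max(1, int(noise_sec * fs / (nfft // 2)))
    noise_mag = np.mean(np.abs(X[:, :n_noise]), axis=1, keepdims=True)
    # Subtract the noise magnitude estimate; keep a spectral floor to limit musical noise.
    mag = np.maximum(np.abs(X) - noise_mag, floor * np.abs(X))
    # Resynthesize with the noisy phase.
    _, x_hat = istft(mag * np.exp(1j * np.angle(X)), fs, nperseg=nfft)
    return x_hat
```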
