A Comparative Study of IBM and IRM Target Mask for Supervised Malay Speech Separation from Noisy Background

Norezmi Jamal, N Fuad, Shahnoor Shanta, Mnah Sha’Abani

doi:10.1016/j.procs.2020.12.020

Open Access

Abstract

This paper presents a comparative study of the Ideal Binary Mask (IBM) and the Ideal Ratio Mask (IRM) as training targets for supervised Malay speech separation. Motivated by advances in computing power, a Deep Neural Network (DNN) is used as the supervised algorithm to predict the target mask from a noisy mixture signal, that is, speech degraded by background noise. Although previous work has shown that the IRM outperforms the IBM as a target mask for DNNs, those results are not directly comparable because they were obtained on different databases. To validate the DNN model with both target masks, 600 Malay utterances from one male and one female speaker were used for training, and the remaining 120 Malay utterances were used for prediction. A combination of acoustic features, namely the amplitude modulation spectrogram (AMS), mel-frequency cepstral coefficients (MFCC), relative spectral transformed perceptual linear prediction coefficients (RASTA-PLP), and Gammatone filter bank power spectra (GF), was used as the input to estimate the target mask. Intelligibility enhancement was evaluated using the Short-Time Objective Intelligibility (STOI) score. With the IRM target mask, the average STOI score reached up to 0.83 for seen speakers and 0.76 for unseen speakers in babble noise at -5 dB, which is superior to the IBM target mask.
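The abstract does not state how the two target masks are defined, so the following is only a minimal sketch based on the definitions commonly used in the speech separation literature, not the authors' implementation. It assumes the clean speech and noise power in each time-frequency unit are available (as they are when building training targets), that the IBM thresholds the local signal-to-noise ratio against a local criterion `lc_db`, and that the IRM uses an exponent `beta` of 0.5. The function names, `lc_db`, `beta`, and `eps` are hypothetical choices for illustration. The helper `scale_noise_to_snr` further assumes that "-5 dB" refers to the mixture signal-to-noise ratio.

```python
import numpy as np


def ideal_binary_mask(speech_power, noise_power, lc_db=0.0, eps=1e-12):
    """Binary target: 1 where the local SNR in dB exceeds lc_db, else 0.

    speech_power and noise_power are arrays of per-unit power with the
    same shape (for example, time frames by frequency channels).
    lc_db is an assumed local criterion, not a value from the paper.
    """
    snr_db = 10.0 * np.log10((speech_power + eps) / (noise_power + eps))
    return (snr_db > lc_db).astype(np.float64)


def ideal_ratio_mask(speech_power, noise_power, beta=0.5, eps=1e-12):
    """Soft target in [0, 1]: (speech / (speech + noise)) ** beta.

    beta = 0.5 is a commonly used value and an assumption here.
    """
    return (speech_power / (speech_power + noise_power + eps)) ** beta


def scale_noise_to_snr(speech, noise, snr_db):
    """Scale a noise waveform so the speech-to-noise power ratio is snr_db."""
    speech_pow = np.mean(speech ** 2)
    noise_pow = np.mean(noise ** 2)
    gain = np.sqrt(speech_pow / (noise_pow * 10.0 ** (snr_db / 10.0)))
    return gain * noise
```

Under these assumptions, a training mixture at -5 dB could be formed as `speech + scale_noise_to_snr(speech, babble, -5.0)`, where `babble` is a hypothetical noise waveform of the same length. The contrast between the two targets is that the IBM makes a hard keep-or-discard decision per time-frequency unit, whereas the IRM assigns each unit a gain between 0 and 1.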

Journal: Procedia Computer Science
Publication Date: Jan 1, 2021
Citations: 7
License type: cc-by-nc-nd
