Abstract
Single-channel speech enhancement primarily relies on deep learning models to recover clean speech signals from noise-contaminated speech. These models learn a mapping from noisy to clean speech. However, because speech energy is sparsely distributed across the time–frequency spectrogram, this mapping differs significantly between regions where speech energy is concentrated and regions where it is not. Using a single deep model to address both of these distinct regression tasks simultaneously increases the complexity of the mapping and consequently limits the model's performance. To validate this hypothesis, we propose a dual-region speech enhancement model based on voiceprint region segmentation. Specifically, we first train a voiceprint segmentation model that partitions noisy speech into two regions. We then train a dedicated speech enhancement model for each region, so that the two models concurrently learn the noisy-to-clean mappings in their respective regions. Finally, merging the two outputs yields the complete restored speech. Experimental results on public datasets demonstrate that our method achieves competitive speech enhancement performance, outperforming state-of-the-art methods. Ablation studies confirm that the proposed approach is effective in improving model performance.
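To make the dual-region pipeline concrete, the following is a minimal PyTorch sketch of the segment-enhance-merge flow described above. It assumes magnitude-spectrogram inputs and a soft per-bin region mask; all module names, architectures, and shapes are illustrative assumptions, not the paper's actual models.

```python
import torch
import torch.nn as nn

# Hypothetical modules: the paper does not specify these architectures.
class RegionSegmenter(nn.Module):
    """Predicts a soft mask splitting the noisy spectrogram into an
    energy-concentrated (voiceprint) region and the remaining region."""
    def __init__(self, freq_bins: int = 257):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(freq_bins, 256), nn.ReLU(),
            nn.Linear(256, freq_bins), nn.Sigmoid(),
        )

    def forward(self, noisy_spec: torch.Tensor) -> torch.Tensor:
        # noisy_spec: (batch, time, freq) -> mask in [0, 1] per T-F bin
        return self.net(noisy_spec)


class RegionEnhancer(nn.Module):
    """One of the two dedicated enhancement models, one per region."""
    def __init__(self, freq_bins: int = 257):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(freq_bins, 512), nn.ReLU(),
            nn.Linear(512, freq_bins),
        )

    def forward(self, region_spec: torch.Tensor) -> torch.Tensor:
        return self.net(region_spec)


def dual_region_enhance(noisy_spec, segmenter, enhancer_voice, enhancer_rest):
    """Segment the spectrogram, enhance each region separately, merge."""
    mask = segmenter(noisy_spec)                        # voiceprint-region mask
    voice = enhancer_voice(noisy_spec * mask)           # energy-concentrated region
    rest = enhancer_rest(noisy_spec * (1.0 - mask))     # non-concentrated region
    return voice + rest                                 # merged restored spectrogram


# Usage sketch: one batch of 4 utterances, 100 frames, 257 frequency bins.
if __name__ == "__main__":
    noisy = torch.rand(4, 100, 257)
    out = dual_region_enhance(
        noisy, RegionSegmenter(), RegionEnhancer(), RegionEnhancer()
    )
    print(out.shape)  # torch.Size([4, 100, 257])
```

A soft mask is used here so the split stays differentiable end to end; the paper's segmentation model may instead produce a hard two-region classification.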