Speech Separation Algorithm Research Articles

Traditional single-channel and multi-channel speech separation methods have made great progress, but their separation accuracy and the resulting automatic speech recognition (ASR) rate are not yet satisfactory. With the development of neural networks, some researchers have turned to deep learning for speech separation. Although this approach improves separation accuracy, it requires a pre-trained model, has higher computational complexity, and loses separation performance when the model does not match the mixed signal. This paper presents an in-depth study of multi-speaker separation and proposes a new dual-channel speech separation algorithm based on the Comb-Filter Effect (CFE). The CFE is an effect that occurs when a signal passes through a first-order differential microphone (FDM) array; it is identified and exploited here for the first time. Using this effect, the paper designs a new signal spectrum estimation method that accurately estimates the speech signal, and combines it with traditional spectral subtraction to separate the speech. Finally, the proposed algorithm is compared with the traditional FastICA-based algorithm and with an algorithm based on the fully-convolutional time-domain audio separation network (Conv-TasNet). The simulation and comparison experiments show that the algorithm effectively separates two-way speech signals, greatly reduces computational complexity, and is highly robust. Across various conditions, the proposed algorithm achieves a Scale-Invariant Source-to-Noise Ratio improvement (SI-SNRi) of 9.19 dB on average. In addition, the Short-Time Objective Intelligibility (STOI) and Perceptual Evaluation of Speech Quality (PESQ) scores of the speech signal improve by an average of 33% and by 70% or more, respectively.
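The SI-SNRi figure quoted above can be made concrete with a small sketch. The code below uses the commonly cited definition of SI-SNR (zero-mean signals, projection of the estimate onto the reference) and defines the improvement as the SI-SNR of the separated estimate minus the SI-SNR of the unprocessed mixture. The function names and the toy sinusoidal signals are illustrative assumptions, not taken from the paper, and the paper's exact evaluation setup may differ.

```python
import math


def _zero_mean(x):
    """Remove the mean so the metric ignores any DC offset."""
    mean = sum(x) / len(x)
    return [v - mean for v in x]


def si_snr(estimate, reference, eps=1e-12):
    """Scale-invariant source-to-noise ratio in dB (common definition)."""
    est = _zero_mean(estimate)
    ref = _zero_mean(reference)
    # Project the estimate onto the reference to get the target component.
    dot = sum(e * r for e, r in zip(est, ref))
    ref_energy = sum(r * r for r in ref) + eps
    scale = dot / ref_energy
    target = [scale * r for r in ref]
    # Everything that is not explained by the reference counts as noise.
    noise = [e - t for e, t in zip(est, target)]
    target_energy = sum(t * t for t in target)
    noise_energy = sum(n * n for n in noise)
    return 10.0 * math.log10((target_energy + eps) / (noise_energy + eps))


def si_snr_improvement(estimate, mixture, reference):
    """SI-SNRi: gain of the separated estimate over the raw mixture."""
    return si_snr(estimate, reference) - si_snr(mixture, reference)


# Toy signals (hypothetical): two orthogonal sinusoids stand in for speakers.
N = 200
speaker = [math.sin(2 * math.pi * 5 * n / N) for n in range(N)]
interferer = [math.sin(2 * math.pi * 13 * n / N) for n in range(N)]
mixture = [s + i for s, i in zip(speaker, interferer)]
# Pretend a separator reduced the interferer amplitude to one tenth.
estimate = [s + 0.1 * i for s, i in zip(speaker, interferer)]

improvement_db = si_snr_improvement(estimate, mixture, speaker)
```

In this toy case the mixture starts at about 0 dB and the estimate reaches about 20 dB, so the improvement is roughly 20 dB. Because the estimate is projected onto the reference, rescaling the estimate leaves the score essentially unchanged, which is why the metric is called scale-invariant.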

Objective. People who suffer from hearing impairments can find it difficult to follow a conversation in a multi-speaker environment. Current hearing aids can suppress background noise; however, there is little that can be done to help a user attend to a single conversation amongst many without knowing which speaker the user is attending to. Cognitively controlled hearing aids that use auditory attention decoding (AAD) methods are the next step in offering help. Translating the successes in AAD research to real-world applications poses a number of challenges, including the lack of access to the clean sound sources in the environment against which to compare the neural signals. We propose a novel framework that combines single-channel speech separation algorithms with AAD. Approach. We present an end-to-end system that (1) receives a single audio channel containing a mixture of speakers that is heard by a listener, along with the listener’s neural signals, (2) automatically separates the individual speakers in the mixture, (3) determines the attended speaker, and (4) amplifies the attended speaker’s voice to assist the listener. Main results. Using invasive electrophysiology recordings, we identified the regions of the auditory cortex that contribute to AAD. Given appropriate electrode locations, our system is able to decode the attention of subjects and amplify the attended speaker using only the mixed audio. Our quality assessment of the modified audio demonstrates a significant improvement in both subjective and objective speech quality measures. Significance. Our novel framework for AAD bridges the gap between the most recent advancements in speech processing technologies and speech prosthesis research and moves us closer to the development of cognitively controlled hearable devices for the hearing impaired.
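Steps (3) and (4) of the system described above can be sketched in a few lines. The abstract does not say how the attended speaker is chosen, so this sketch assumes a simple approach: an envelope reconstructed from the neural signals is correlated with the envelope of each separated speaker, and the best match is treated as attended. The function names, the 9 dB default gain, and the toy envelopes are all hypothetical; the separation step (2) and the neural reconstruction are assumed to have been done elsewhere.

```python
import math


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    denom = math.sqrt(var_a * var_b)
    return cov / denom if denom > 0 else 0.0


def decode_attended(neural_envelope, source_envelopes):
    """Step (3), assumed method: pick the separated source whose envelope
    correlates best with the envelope reconstructed from neural data."""
    scores = [pearson(neural_envelope, env) for env in source_envelopes]
    attended = max(range(len(scores)), key=scores.__getitem__)
    return attended, scores


def remix(sources, attended, gain_db=9.0):
    """Step (4): sum the separated sources, boosting the attended one."""
    gain = 10 ** (gain_db / 20)  # convert dB to a linear amplitude factor
    length = len(sources[0])
    return [
        sum((gain if i == attended else 1.0) * src[t]
            for i, src in enumerate(sources))
        for t in range(length)
    ]


# Toy envelopes (hypothetical): the neural reconstruction tracks speaker 0.
envelope_a = [0.1, 0.9, 0.2, 0.8, 0.3, 0.7]
envelope_b = [0.9, 0.1, 0.8, 0.2, 0.7, 0.3]
neural_envelope = [0.2, 0.8, 0.1, 0.9, 0.2, 0.6]

attended_index, scores = decode_attended(neural_envelope,
                                         [envelope_a, envelope_b])
boosted = remix([[1.0, 1.0], [1.0, 1.0]], attended_index, gain_db=20.0)
```

Boosting the attended source rather than muting the others keeps the unattended speakers audible, so a listener could still switch attention; whether the described system makes the same choice is not stated in the abstract.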

Articles published on Speech Separation Algorithm

End-to-end integration of speech separation and voice activity detection for low-latency diarization of telephone conversations

Effective Monoaural Speech Separation through Convolutional Top-Down Multi-View Network

NNMF with Speaker Clustering in a Uniform Filter-Bank for Blind Speech Separation

A speech separation algorithm based on the comb-filter effect

Speech Separation Algorithm Using Gated Recurrent Network Based on Microphone Array

Enhancing the correlation between the quality and intelligibility objective metrics with the subjective scores by shallow feed forward neural network for time–frequency masking speech separation algorithms

Integration of deep learning with expectation maximization for spatial cue-based speech separation in reverberant conditions

Blind separation of underdetermined Convolutive speech mixtures by time–frequency masking with the reduction of musical noise of separated signals

Microphone Array Speech Separation Algorithm Based on TC-ResNet

An Improved Unsupervised Single‐Channel Speech Separation Algorithm for Processing Speech Sensor Signals

Speech separation based on reliable binaural cues with two-stage neural network in noisy-reverberant environments

Adaptive Speech Separation Based on Beamforming and Frequency Domain-Independent Component Analysis

Graph Convolution-Based Deep Clustering for Speech Separation

Evaluating Multi-Channel Multi-Device Speech Separation Algorithms in the Wild: A Hardware-Software Solution

Binaural Speech Separation Algorithm Based on Long and Short Time Memory Networks

Speaker-independent auditory attention decoding without access to clean speech sources.

Research on speech separation technology based on deep learning

Single-channel speech separation using combined EMD and speech-specific information

Neural decoding of attentional selection in multi-speaker environments without access to clean sources

Neural decoding of attentional selection in multi-speaker environments without access to separated sources.
