Abstract

The quality of speech separation affects the entire speech technology ecosystem. Blind source separation models based on the Transformer dual-path recurrent neural network suffer from low utilization of local feature information, slow convergence, a large number of parameters, and long computation time. To address these problems, a blind source separation model based on the fusion of Conformer and NBC (Narrow-band Conformer) is proposed. First, to address the low utilization of local speech features and the slow convergence, the Transformer block in the intra-block path of the dual-path recurrent network is replaced by a Conformer block, which improves both the utilization of local features and the convergence speed. Second, the Transformer block in the inter-block path is replaced by an NBC block. The NBC block simplifies the computation of vector similarity and vector aggregation, which substantially reduces the number of parameters and the computational cost and thereby lowers the computational complexity of the model. In experiments on the WSJ0-2mix [7] and WHAM [13] datasets, the proposed model converges faster and achieves better blind source separation performance than the other models compared.
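
The dual-path arrangement described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `segment` and `dual_path_block`, the 50% chunk overlap, and the use of plain Python lists of scalars in place of feature tensors are all assumptions made for illustration. The two callables stand in for the Conformer block (local, within-block modeling) and the NBC block (modeling across blocks); here they are trivial placeholders.

```python
from typing import Callable, List

Seq = List[float]


def segment(frames: Seq, chunk_len: int) -> List[Seq]:
    """Split a frame sequence into chunks with 50% overlap (assumed),
    zero-padding the tail so that every chunk is full."""
    assert chunk_len >= 2, "chunk_len must be at least 2"
    hop = chunk_len // 2
    padded = list(frames)
    while len(padded) < chunk_len or (len(padded) - chunk_len) % hop != 0:
        padded.append(0.0)
    return [padded[s:s + chunk_len]
            for s in range(0, len(padded) - chunk_len + 1, hop)]


def dual_path_block(chunks: List[Seq],
                    intra_fn: Callable[[Seq], Seq],
                    inter_fn: Callable[[Seq], Seq]) -> List[Seq]:
    """Apply one dual-path step: within each chunk, then across chunks."""
    # Intra-block path: process each chunk along its own (local) time axis.
    # In the proposed model this role is played by a Conformer block.
    chunks = [intra_fn(c) for c in chunks]
    # Inter-block path: process the same position across all chunks.
    # In the proposed model this role is played by an NBC block.
    n_pos = len(chunks[0])
    columns = [inter_fn([c[k] for c in chunks]) for k in range(n_pos)]
    # Transpose back to the chunk-major layout.
    return [[col[i] for col in columns] for i in range(len(chunks))]


if __name__ == "__main__":
    chunks = segment([1.0, 2.0, 3.0, 4.0, 5.0], chunk_len=4)
    # Placeholder callables (hypothetical): identity for both paths.
    out = dual_path_block(chunks, lambda s: s, lambda s: s)
    print(chunks)
    print(out)
```

Because the two paths are passed in as callables, swapping one kind of block for another (as the abstract describes for the Transformer blocks) only changes the arguments, not the surrounding data flow.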
