Implementation of Real-Time Speech Separation Model Using Time-Domain Audio Separation Network (TasNet) and Dual-Path Recurrent Neural Network (DPRNN)

Alfian Wijayakusuma, Hanry Ham, Davin Reinaldo Gozali, Anthony Widjaja

doi:10.1016/j.procs.2021.01.065


Open Access

https://doi.org/10.1016/j.procs.2021.01.065


Journal: Procedia Computer Science
Publication Date: Jan 1, 2021
Citations: 3
License type: CC BY-NC-ND

Affiliation: Binus University

Abstract

The purpose of this research is to develop a model that performs real-time, speaker-independent, multi-talker speech separation in the time domain using the Time-Domain Audio Separation Network (TasNet) and the Dual-Path Recurrent Neural Network (DPRNN). To implement TasNet and DPRNN, the research experiments with several RNN architectures, batch sizes, and optimizers as hyperparameters, and analyzes the impact of these hyperparameter settings on model performance. The expected result is a model that completes speaker-independent multi-talker speech separation in real time with higher accuracy and lower latency than models from previous research.

Full Text