A Repetitive Spectrum Learning Framework for Monaural Speech Enhancement in Extremely Low SNR Environments (Student Abstract)

Wenxin Tai

doi:10.1609/aaai.v36i11.21668

Abstract

Monaural speech enhancement (SE) at an extremely low signal-to-noise ratio (SNR) condition is a challenging problem and rarely investigated in previous studies. Most SE methods experience failures in this situation due to three major factors: overwhelmed vocals, expanded SNR range, and short-sighted feature processing modules. In this paper, we present a novel and general training paradigm dubbed repetitive learning (RL). Unlike curriculum learning that focuses on learning multiple different tasks sequentially, RL is more inclined to learn the same content repeatedly where the knowledge acquired in previous stages can be used to facilitate calibrating feature representations. We further propose an RL-based end-to-end SE method named SERL. Experimental results on TIMIT dataset validate the superior performance of our method.

Full Text