Recursive Feature Diversity Network for audio super-resolution

Jiahuan Wang, Bo Jiang, Guangming Lu, Mixiao Hou, Yao Lu, David Zhang

doi:10.1016/j.specom.2022.08.005

Abstract

Deep learning methods have been successfully applied to audio super-resolution tasks. Although these methods perform well, they are impractical for real-world applications because of the large number of computations they require. To address this problem, we propose Recursive Feature Diversity Networks (RFD-Nets), a lightweight model for fast and accurate audio super-resolution. RFD-Nets are composed of a Recursive Feature Diversity (RFD) block and a Back-Projection (BP) block. Specifically, the RFD block is a recursive structure that iteratively refines and extracts hierarchical audio features. Subsequently, using an up-and-down sampling learner, the proposed BP block effectively captures the deep relationships between High-Resolution (HR) and Low-Resolution (LR) audio pairs, thus producing high-quality audio reconstruction. Furthermore, we collect seven different types of complex audio datasets for training and comprehensively evaluating the proposed method. Extensive experiments demonstrate that our RFD-Nets achieve superior accuracy on the proposed benchmark datasets compared with state-of-the-art methods while requiring less computation and memory. Datasets are released at https://github.com/JiangBoCS/RFD-Net.
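
The two ideas named in the abstract, recursive refinement with shared weights and back-projection through up-and-down sampling, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`recursive_refine`, `back_project`, `upsample`, `downsample`) are hypothetical, and fixed linear interpolation and window averaging stand in for the learned sampling layers of the BP block.

```python
import numpy as np

def upsample(x, scale):
    """Linear-interpolation upsampling (assumed stand-in for a learned up-sampling layer)."""
    src = np.arange(len(x))
    dst = np.arange(len(x) * scale) / scale
    return np.interp(dst, src, x)

def downsample(x, scale):
    """Average non-overlapping windows (assumed stand-in for a learned down-sampling layer)."""
    return x.reshape(-1, scale).mean(axis=1)

def recursive_refine(x, kernel, steps):
    """Apply the SAME kernel `steps` times (weight sharing) and keep every
    intermediate output as a hierarchy of features."""
    feats = []
    h = x
    for _ in range(steps):
        # One refinement step: shared 1-D convolution, nonlinearity, residual link to the input.
        h = np.tanh(np.convolve(h, kernel, mode="same")) + x
        feats.append(h)
    return feats

def back_project(lr, scale, iters=3):
    """Iterative back-projection: project the HR estimate down to LR,
    measure the mismatch, and project that error back up as a correction."""
    hr = upsample(lr, scale)
    for _ in range(iters):
        err = lr - downsample(hr, scale)   # mismatch measured in the LR domain
        hr = hr + upsample(err, scale)     # correction applied in the HR domain
    return hr

# Usage on a toy low-resolution signal (illustrative values only).
lr_audio = np.sin(np.linspace(0.0, 8.0 * np.pi, 64))
features = recursive_refine(lr_audio, kernel=np.array([0.25, 0.5, 0.25]), steps=4)
hr_audio = back_project(features[-1], scale=2)
```

The recursion reuses one kernel at every step, which is how a recursive block keeps the parameter count low; the back-projection loop shrinks the LR-domain mismatch at each iteration for the fixed operators used here.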

Full Text