Divide and Distill: New Outlooks on Knowledge Distillation for Environmental Sound Classification

Achyut Mani Tripathi, Om Jee Pandey

doi:10.1109/taslp.2023.3244507

Abstract

Environmental sound classification (ESC) is an important research problem with a broad range of applications, including audio-based surveillance, audio-visual systems, smart homes, and robotics. The recently proposed vision multi-layer perceptron-mixer (MLP-mixer) has outperformed traditional deep models (CNN or ResNet) and attained new state-of-the-art performance in several computer vision applications (image/video classification and image segmentation). Following the success of the MLP-mixer, in this paper we propose a novel audio MLP-mixer (AMM) network that classifies different types of environmental sounds. Despite its higher performance, the high computational cost of the AMM model (number of trainable parameters and floating-point operations) hinders its deployment on edge devices for real-life applications. To alleviate this issue, we present three different knowledge distillation (KD) strategies to train a compact deep network for ESC. The proposed strategies divide the input Mel-spectrogram into patches, and a lightweight deep ESC model is trained in the presence of three teacher networks under the offline KD training framework. Additionally, we design two novel loss functions for KD that are free from the temperature parameter, which must be set manually by the user in the traditional vanilla KD technique. We conducted our experiments on three benchmark ESC datasets, namely ESC-10, UrbanSound8K (US8K), and DCASE-2019 Task-1(A). The results indicate that the proposed methods improve classification accuracy over existing KD methods.
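
As a rough illustration of two ideas the abstract mentions, the sketch below splits a Mel-spectrogram-shaped array into non-overlapping patches and computes the conventional temperature-based KD loss, i.e. the vanilla formulation that the paper's temperature-free losses are meant to replace. It is not the authors' implementation: the function names, the patch size, the choice to drop leftover frames, and the temperature value are all assumptions made for illustration, and the paper's own loss functions are not reproduced here.

```python
import numpy as np


def split_into_patches(mel, patch_h, patch_w):
    """Split a (n_mels, n_frames) array into non-overlapping patches.

    Hypothetical helper: trailing rows/columns that do not fill a whole
    patch are dropped (an assumption, not taken from the paper).
    """
    n_mels, n_frames = mel.shape
    rows, cols = n_mels // patch_h, n_frames // patch_w
    mel = mel[: rows * patch_h, : cols * patch_w]
    # (rows, patch_h, cols, patch_w) -> (rows, cols, patch_h, patch_w)
    patches = mel.reshape(rows, patch_h, cols, patch_w).swapaxes(1, 2)
    return patches.reshape(rows * cols, patch_h, patch_w)


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over the last axis with a temperature."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def vanilla_kd_loss(student_logits, teacher_logits, temperature=4.0):
    """Vanilla KD loss: KL(teacher || student) on softened outputs.

    The temperature is the hand-tuned hyperparameter the abstract refers
    to; the T^2 factor is the usual scaling for this loss.
    """
    eps = 1e-12  # avoids log(0)
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)
    return float(temperature ** 2 * kl.mean())


# Usage with made-up shapes: 8 Mel bands, 12 frames, 4x4 patches.
mel = np.arange(8 * 12, dtype=float).reshape(8, 12)
patches = split_into_patches(mel, 4, 4)

teacher = np.array([[2.0, 0.5, -1.0]])
student = np.array([[1.0, 1.0, 0.0]])
loss = vanilla_kd_loss(student, teacher, temperature=4.0)
```

Changing `temperature` changes how soft the teacher distribution is and therefore the loss value, which is why the vanilla method requires it to be tuned by hand.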

Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Publication Date: Jan 1, 2023
Citations: 7

Similar Papers

Data augmentation guided knowledge distillation for environmental sound classification
Achyut Mani Tripathi ... Konark Paul
Neurocomputing | VOL. 489
15 Mar 2022

Attention Based Convolutional Recurrent Neural Network for Environmental Sound Classification
Zhichao Zhang ... Tianhao Qiao
01 Jan 2019

Environmental Sound Classification Based on Knowledge Distillation
Qianjin Cui ... Xiaoman Wang
21 Oct 2022

A Method of Environmental Sound Classification Based on Residual Networks and Data Augmentation
Jinfang Zeng ... Youming Li
International Journal of Computational Intelligence and Applications | VOL. 20
13 Aug 2021
