Comparison of Pre-Trained CNNs for Audio Classification Using Transfer Learning

Eleni Tsalera, Andreas Papadakis, Maria Samarakou

doi:10.3390/jsan10040072

Abstract

The paper investigates retraining options and the performance of pre-trained Convolutional Neural Networks (CNNs) for sound classification. CNNs were initially designed for image classification and recognition and were later extended to sound classification. Transfer learning is a promising paradigm in which already trained networks are retrained on different datasets. We selected three ‘Image’- and two ‘Sound’-trained CNNs, namely, GoogLeNet, SqueezeNet, ShuffleNet, VGGish, and YAMNet, and applied transfer learning. We explored the influence of key retraining parameters, including the optimizer, the mini-batch size, the learning rate, and the number of epochs, on the classification accuracy and on the processing time needed, covering both sound preprocessing (the preparation of the scalograms and spectrograms) and CNN training. The UrbanSound8K, ESC-10, and Air Compressor open sound datasets were employed. Using a two-fold criterion based on classification accuracy and time needed, we selected the ‘champion’ transfer-learning parameter combinations, discussed the consistency of the classification results, and explored possible benefits from fusing the classification estimations. The Sound CNNs achieved better classification accuracy, reaching an average of 96.4% for UrbanSound8K, 91.25% for ESC-10, and 100% for the Air Compressor dataset.
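The abstract mentions fusing the classification estimations of several CNNs but does not state the fusion rule. As a minimal sketch, assuming simple averaging of per-class probabilities (soft voting), the idea could look like the following. The function name `fuse_predictions`, the class labels, and all probability values are hypothetical and chosen only for illustration; they are not results from the paper.

```python
def fuse_predictions(per_model_probs, class_names):
    """Average per-class probabilities across models and pick the top class.

    Assumption: soft voting with equal weights. The paper's actual
    fusion scheme is not given in this abstract.
    """
    n_models = len(per_model_probs)
    n_classes = len(class_names)
    # Mean probability of each class over all models
    fused = [
        sum(probs[c] for probs in per_model_probs) / n_models
        for c in range(n_classes)
    ]
    # Index of the class with the highest fused probability
    best = max(range(n_classes), key=lambda c: fused[c])
    return class_names[best], fused


# Made-up outputs of three retrained CNNs for one audio clip
class_names = ["dog_bark", "siren", "drilling"]
per_model_probs = [
    [0.50, 0.40, 0.10],  # hypothetical model 1
    [0.30, 0.60, 0.10],  # hypothetical model 2
    [0.20, 0.70, 0.10],  # hypothetical model 3
]

label, fused = fuse_predictions(per_model_probs, class_names)
print(label, [round(p, 3) for p in fused])
```

Equal weighting is the simplest choice; weighting each network by its validation accuracy would be a natural variation.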

Highlights

Sound is a complex, feature-rich signal, and sound classification has attracted research interest using a rich portfolio of Machine Learning (ML) methodologies and mechanisms
We evaluated transfer learning on the selected Convolutional Neural Networks (CNNs), in terms of classification accuracy and resources needed for learning, using three publicly available sound datasets (UrbanSound8K, ESC-10, and Air Compressor)
The evaluation was based on the classification accuracy (CA) and the training time (TT)
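One way to picture a selection based on both CA and TT is to keep only the parameter combinations that no other combination beats on both measures at once. The sketch below assumes such a Pareto-style rule; the paper's exact two-fold criterion is not spelled out here, and the run names, accuracies, and times are invented for illustration.

```python
def pareto_front(runs):
    """Return runs that are not dominated on (higher CA, lower TT).

    Assumption: a run is dominated if another run is at least as good
    on both measures and strictly better on one of them.
    """
    front = []
    for r in runs:
        dominated = any(
            o["ca"] >= r["ca"]
            and o["tt"] <= r["tt"]
            and (o["ca"] > r["ca"] or o["tt"] < r["tt"])
            for o in runs
        )
        if not dominated:
            front.append(r)
    return front


# Hypothetical retraining runs: ca = classification accuracy,
# tt = training time in seconds
runs = [
    {"name": "A", "ca": 0.95, "tt": 120.0},
    {"name": "B", "ca": 0.93, "tt": 60.0},
    {"name": "C", "ca": 0.92, "tt": 90.0},   # worse than B on both measures
    {"name": "D", "ca": 0.95, "tt": 150.0},  # same CA as A but slower
]

champions = pareto_front(runs)
print([r["name"] for r in champions])
```

Here runs A and B survive: A is the most accurate, B is the fastest, and neither is beaten on both measures by any other run.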

Summary

Introduction

Sound is a complex, feature-rich signal, and sound classification has attracted research interest using a rich portfolio of Machine Learning (ML) methodologies and mechanisms. Such mechanisms include classic (‘traditional’) ML methods (such as Support Vector Machine and Linear Discriminant Analysis) as well as deep learning (notably Convolutional Neural Networks (CNNs)). CNNs in particular have achieved significant results in recognition and classification tasks, especially those related to image recognition. A difficulty in using CNNs is the need for extensive computational resources, especially during training. The expanding architectures of CNNs (in terms of layers) further contribute to this need.



Journal: Journal of Sensor and Actuator Networks	Publication Date: Dec 10, 2021
Citations: 51	License type: CC BY 4.0


