The Role of CNN and RNN in the Classification of Audio Music Genres

Mohsin Ashraf, Satwat Bashir, Fazeel Abid, Muhammad Atif

doi:10.21015/vtse.v10i2.793

Abstract

This study examines how different types of neural networks can be used to categorize music files. We used the GTZAN dataset, which contains several genres of traditional music. Each genre has shared conventions that can be treated as features, and classifying music genres from these features is challenging. Deep neural architectures such as the Convolutional Neural Network (CNN) and the Recurrent Neural Network (RNN) have been considered for music analysis. However, such architectures are data-intensive and prone to overfitting. To address this issue, we present a framework containing a multi-layer CNN and a multi-layer RNN with Long Short-Term Memory (LSTM) to categorize music genres while handling the problem of overfitting. Our experiments also revealed the strengths and limitations of deep learning. Finally, we found the CNN to be the best among the other state-of-the-art models compared, achieving training and test accuracies of 86.53% and 81.90%, respectively.
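As a rough illustration of how the reported accuracies relate to overfitting, the gap between training and test accuracy can be computed directly. The sketch below is not from the paper: the function name `generalization_gap` is hypothetical, and only the two accuracy values are taken from the abstract.

```python
def generalization_gap(train_acc: float, test_acc: float) -> float:
    """Return training accuracy minus test accuracy, in percentage points.

    Hypothetical helper for illustration; a larger positive gap is a common
    informal sign of overfitting.
    """
    return round(train_acc - test_acc, 2)


# Accuracies reported in the abstract for the CNN model (in percent).
cnn_train_acc = 86.53
cnn_test_acc = 81.90

gap = generalization_gap(cnn_train_acc, cnn_test_acc)
print(f"train-test gap: {gap:.2f} percentage points")
```

A gap of a few percentage points, as here, suggests the model still fits the training data somewhat better than unseen data; the abstract alone does not state what gap the authors consider acceptable.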
