Abstract

Genre is one of the key features used to categorize music, since each genre follows characteristic patterns. However, Arabic music on the web is poorly labeled by genre, which makes automatic classification of Arabic audio genres challenging. In this research, our first objective is to construct a well-annotated dataset of five of the most well-known Arabic music genres: Eastern Takht, Rai, Muwashshah, the poem, and Mawwal. Our second objective is to present a comprehensive empirical comparison of deep Convolutional Neural Network (CNN) architectures on Arabic music genre classification. To use CNNs in a practical classification system, the audio data is transformed into a visual representation (a spectrogram) using the Short-Time Fourier Transform (STFT), and several audio features are then extracted using Mel Frequency Cepstral Coefficients (MFCC). Classifier performance is evaluated using the accuracy score, the time to build the model, and the Matthews correlation coefficient (MCC). Five classifiers were studied: LeNet5, AlexNet, VGG, ResNet-50, and LSTM-CNN. The results show that AlexNet is among the top performers, with an overall accuracy of 96%.
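As an illustration of two of the evaluation measures named above, the following minimal sketch computes accuracy and the multiclass form of MCC from lists of true and predicted genre labels. The function names `accuracy` and `multiclass_mcc` and the toy labels are assumptions made for illustration only; they are not the evaluation code or the data used in this work.

```python
from collections import Counter
from math import sqrt


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def multiclass_mcc(y_true, y_pred):
    """Multiclass Matthews correlation coefficient (hypothetical helper).

    Uses the standard generalization based on label counts:
    s = number of samples, c = number of correct predictions,
    t_k = times class k truly occurs, p_k = times class k is predicted.
    """
    s = len(y_true)
    c = sum(t == p for t, p in zip(y_true, y_pred))
    t_counts = Counter(y_true)
    p_counts = Counter(y_pred)
    labels = set(t_counts) | set(p_counts)

    numerator = c * s - sum(t_counts[k] * p_counts[k] for k in labels)
    spread_true = s * s - sum(v * v for v in t_counts.values())
    spread_pred = s * s - sum(v * v for v in p_counts.values())
    denominator = sqrt(spread_true * spread_pred)

    # By convention, return 0.0 when the coefficient is undefined
    # (for example, when every sample is assigned to a single class).
    return numerator / denominator if denominator else 0.0


# Toy labels, invented purely for illustration (not results from the paper).
toy_true = ["Rai", "Rai", "Mawwal", "Mawwal", "Muwashshah", "Muwashshah"]
toy_pred = ["Rai", "Mawwal", "Mawwal", "Mawwal", "Muwashshah", "Muwashshah"]

toy_accuracy = accuracy(toy_true, toy_pred)
toy_mcc = multiclass_mcc(toy_true, toy_pred)
```

Unlike accuracy, MCC accounts for how predictions are distributed across all classes, so a classifier that assigns every clip to one genre scores 0 even when that genre dominates the dataset.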
