Abstract

This paper performs music genre classification by converting audio signals into spectrograms and Mel-spectrograms and treating them as texture images. Three texture features are extracted from these images: Local Binary Pattern (LBP), uniform Local Binary Pattern (uLBP), and Rotation Invariant LBP (RILBP). LBP and RILBP features are extracted using eight equally spaced neighbors at a radius of one or two; uLBP features are extracted with the same parameters and additionally with 16 neighbors at a radius of two. A multi-class Support Vector Machine (SVM) classifies a five-genre subset of the GTZAN database: classical, rock, disco, pop, and hip-hop. The experiments yielded a maximum recognition rate of 84% using spectrograms. Extracting LBP, uLBP, and RILBP features from Mel-spectrograms is novel and yielded a maximum recognition rate of 78%.
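
The feature extraction described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the STFT settings (`n_fft=256`, `hop=128`), the synthetic test tone, and the square (non-interpolated) 8-neighbor, radius-1 neighborhood are all assumptions made for illustration. The radius-2 and 16-neighbor variants, the uniform-pattern mapping, the Mel filterbank, and the SVM stage are not shown.

```python
import numpy as np


def spectrogram(signal, n_fft=256, hop=128):
    """Log-magnitude spectrogram from a Hann-windowed STFT (assumed settings)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    magnitude = np.abs(np.fft.rfft(frames, axis=1)).T  # (freq_bins, frames)
    return np.log1p(magnitude)


def lbp_8_1(image):
    """Basic LBP codes: 8 neighbors, radius 1, square grid without interpolation."""
    h, w = image.shape
    center = image[1:-1, 1:-1]
    # Neighbor offsets (dy, dx), visited in a fixed circular order.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(center.shape, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        neighbor = image[1 + dy : h - 1 + dy, 1 + dx : w - 1 + dx]
        # Set this bit wherever the neighbor is at least as large as the center.
        codes |= (neighbor >= center).astype(np.uint8) << bit
    return codes


def rotation_invariant(codes):
    """Map each 8-bit code to the minimum over its circular bit rotations."""
    c = codes.astype(np.uint16)
    best = c.copy()
    for k in range(1, 8):
        rotated = ((c >> k) | (c << (8 - k))) & 0xFF
        best = np.minimum(best, rotated)
    return best.astype(np.uint8)


def lbp_histogram(codes, n_bins=256):
    """Normalized histogram of LBP codes, usable as a texture feature vector."""
    hist = np.bincount(codes.ravel(), minlength=n_bins).astype(float)
    return hist / hist.sum()


# Synthetic stand-in for an audio clip: a 440 Hz tone plus a little noise.
rng = np.random.default_rng(0)
t = np.arange(22050) / 22050.0
signal = np.sin(2 * np.pi * 440 * t) + 0.1 * rng.standard_normal(t.size)

spec = spectrogram(signal)
lbp_features = lbp_histogram(lbp_8_1(spec))
rilbp_features = lbp_histogram(rotation_invariant(lbp_8_1(spec)))
```

Each histogram is a fixed-length vector, so one vector per clip could be passed to any multi-class classifier, such as the SVM the abstract names.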

