Abstract
Music genre classification assigns a piece of music to human-defined genre labels and has traditionally been performed manually. Automatic music genre classification, a fundamental problem in the music information retrieval community, has gained attention as the digital music industry has grown. Most current methods extract short-time features and combine them with high-level audio features. However, the representation of short-time features, computed over time windows, in a semantic space has received little attention. This paper proposes a vector space model of mel-frequency cepstral coefficients (MFCCs) that can, in turn, be used by a supervised learning scheme for music genre classification. Inspired by explicit semantic analysis of textual documents using term frequency-inverse document frequency (tf-idf), a semantic space model is proposed to represent music samples. The effectiveness of this representation is then demonstrated in music genre classification using several machine learning classifiers, including support vector machines (SVMs) and k-nearest neighbors (k-NN). Our preliminary results suggest that the proposed method is comparable to genre classification methods that use low-level audio features.
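To make the tf-idf analogy concrete, the following is a minimal sketch of how MFCC frames might be turned into a tf-idf vector space. The abstract does not say how continuous MFCC frames become discrete "terms", so this sketch assumes each frame is quantized to its nearest entry in a fixed codebook; the codebook, the function names `nearest_codeword` and `tfidf_vectors`, and the toy two-dimensional frames are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

def nearest_codeword(frame, codebook):
    """Return the index of the codebook entry closest to one MFCC frame."""
    return min(
        range(len(codebook)),
        key=lambda k: sum((f - c) ** 2 for f, c in zip(frame, codebook[k])),
    )

def tfidf_vectors(tracks, codebook):
    """Map each track (a list of MFCC frames) to a tf-idf vector over codewords."""
    # Assumption: each track is a "document" whose "terms" are codeword indices.
    docs = [[nearest_codeword(f, codebook) for f in frames] for frames in tracks]
    n_docs = len(docs)
    # Document frequency: number of tracks in which each codeword occurs.
    df = Counter(k for doc in docs for k in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vec = []
        for k in range(len(codebook)):
            tf = counts[k] / len(doc) if doc else 0.0
            idf = math.log(n_docs / df[k]) if df[k] else 0.0
            vec.append(tf * idf)
        vectors.append(vec)
    return vectors

# Toy 2-dimensional frames for illustration only; real MFCC frames have more coefficients.
codebook = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
tracks = [
    [[0.1, 0.0], [0.9, 1.1], [0.0, 0.2]],  # quantizes to codewords 0, 1, 0
    [[5.1, 4.9], [1.0, 0.8], [4.8, 5.2]],  # quantizes to codewords 2, 1, 2
]
vectors = tfidf_vectors(tracks, codebook)
```

In this toy case codeword 1 occurs in both tracks, so its idf is zero and it contributes nothing to either vector, while codewords 0 and 2 each distinguish one track. The resulting vectors could then be passed to any supervised classifier, such as an SVM or k-NN.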