Abstract

A challenging open question in music classification is which music representation (i.e., audio features) and which machine learning algorithm are appropriate for a specific music classification task. To address this challenge, given a number of audio feature vectors for each training music recording that capture different aspects of music (i.e., timbre, harmony, etc.), the goal is to find a set of linear mappings from several feature spaces to the semantic space spanned by the class indicator vectors. These mappings should reveal the common latent variables that characterize a given set of classes and simultaneously define a multi-class linear classifier operating on the extracted latent common features. Such a set of mappings is obtained, building on the notion of maximum margin matrix factorization, by minimizing a weighted sum of nuclear norms. Since the nuclear norm imposes rank constraints on the learnt mappings, the proposed method is referred to as low-rank semantic mappings (LRSMs). The performance of the LRSMs in music genre, mood, and multi-label classification is assessed by conducting extensive experiments on seven manually annotated benchmark datasets. The reported experimental results demonstrate the superiority of the LRSMs over the classifiers they are compared against. Furthermore, the best reported classification results are comparable with, or slightly superior to, those obtained by state-of-the-art task-specific music classification methods.
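As a rough illustration of the core idea, a minimal sketch follows of learning a single low-rank linear mapping from a feature space to the class-indicator space by minimizing a squared-error data term plus a nuclear-norm penalty via proximal gradient descent (singular value thresholding). This is not the authors' exact LRSM formulation, which couples multiple feature spaces through a weighted sum of nuclear norms; the function names, the regularization weight `lam`, and the toy data are assumptions for illustration.

```python
import numpy as np

def svt(M, tau):
    # Singular value thresholding: the proximal operator of tau * ||.||_* (nuclear norm)
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    s = np.maximum(s - tau, 0.0)  # shrink singular values toward zero
    return (U * s) @ Vt

def low_rank_mapping(X, Y, lam=1.0, iters=300):
    """Fit W minimizing 0.5*||X W - Y||_F^2 + lam*||W||_* by proximal gradient.

    X: (n_samples, n_features) audio feature matrix.
    Y: (n_samples, n_classes) class indicator matrix.
    Returns a low-rank mapping W of shape (n_features, n_classes).
    """
    step = 1.0 / (np.linalg.norm(X, 2) ** 2)  # 1 / Lipschitz constant of the gradient
    W = np.zeros((X.shape[1], Y.shape[1]))
    for _ in range(iters):
        grad = X.T @ (X @ W - Y)          # gradient of the smooth data term
        W = svt(W - step * grad, step * lam)  # proximal step enforces low rank
    return W

# Toy usage: random features, two classes encoded as +/-1 indicator-like targets
rng = np.random.default_rng(0)
X = rng.standard_normal((40, 10))
Y = np.sign(X @ np.outer(rng.standard_normal(10), rng.standard_normal(2)))
W = low_rank_mapping(X, Y, lam=5.0)
```

A test recording would be classified by mapping its feature vector through `W` and picking the class with the largest response, e.g. `np.argmax(x @ W)`.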

Highlights

  • Retail and online music stores usually index their collections by artist or album name

  • The only exceptions are the prediction of valence on the MTV dataset, where the best classification accuracy is achieved by the sparse representation-based classifier (SRC), and the music genre classification accuracy on the Unique dataset, where the top performance is achieved by the support vector machines (SVMs)

  • The low-rank semantic mappings (LRSMs) have been proposed as a general-purpose music classification method


Summary

Introduction

Retail and online music stores usually index their collections by artist or album name. At the machine learning stage, music genre and mood classification are treated as single-label multi-class classification problems. To this end, support vector machines (SVMs) [23], nearest-neighbor (NN) classifiers, Gaussian mixture model-based classifiers [3], and classifiers relying on sparse and low-rank representations [24] have been employed to classify the audio features into genre or mood classes. A novel, robust, general-purpose music classification method is proposed to address the aforementioned challenge. It is suitable for both single-label (i.e., genre or mood classification) and multi-label (i.e., music tagging) multi-class classification problems, providing a systematic way to handle multiple audio features that capture the different aspects of music.

Notations
Classification by low-rank semantic mappings
Computational complexity
Experimental evaluation
Method Features
Conclusions