Abstract
We report our findings on using MIDI files and audio features extracted from MIDI, separately and in combination, for MIDI music genre classification. We use McKay and Fujinaga's 3-root and 9-leaf genre data set. To compute distances between MIDI pieces, we use the normalized compression distance (NCD). NCD uses the compressed length of a string as an approximation to its Kolmogorov complexity and has previously been used for music genre and composer clustering. We convert the MIDI pieces to audio and then use the audio features to train different classifiers. MIDI and audio-from-MIDI classifiers alone achieve much lower accuracies than those reported by McKay and Fujinaga, who used a number of domain-based MIDI features rather than NCD for their classification. Combining the MIDI and audio-from-MIDI classifiers improves accuracy and approaches, but remains below, McKay and Fujinaga's results. The best root genre accuracies achieved using MIDI, audio, and their combination are 0.75, 0.86, and 0.93, respectively, compared to McKay and Fujinaga's 0.98. Successful classifier combination requires diversity among the base classifiers. We achieve diversity by using a certain number of seconds of each MIDI file, different sample rates and sizes for the audio files, and different classification algorithms.
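The NCD between two strings x and y is conventionally computed as (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C denotes compressed length. A minimal sketch using Python's standard-library bz2 compressor (the compressor choice and byte-string inputs are illustrative assumptions, not the paper's exact setup, which uses the CompLearn software):

```python
import bz2

def compressed_len(data: bytes) -> int:
    """Compressed length as a computable proxy for Kolmogorov complexity."""
    return len(bz2.compress(data))

def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance: near 0 for very similar strings,
    near 1 for unrelated ones."""
    cx, cy, cxy = compressed_len(x), compressed_len(y), compressed_len(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)
```

In practice the MIDI pieces are first converted to strings over a finite alphabet, and the resulting pairwise NCD matrix is fed to a distance-based classifier.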
Highlights
The increase of musical databases on the Internet and in multimedia systems has brought great demand for music information retrieval (MIR) applications, and especially for automatic analysis of musical databases.
We report our experiments with linear discriminant classifiers (LDC), which assume normal densities, and k-nearest neighbor (KNN) classifiers.
(vi) Mel-frequency cepstral coefficients (MFCC): MFCCs are well known for speech representation.
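MFCCs are obtained by taking the log energies of a mel-spaced filterbank applied to a frame's power spectrum and decorrelating them with a DCT. The sketch below is a minimal single-frame NumPy illustration; the frame size, filter counts, and Hamming window are assumptions for illustration, not the paper's feature-extraction configuration:

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mfcc_frame(frame, sr, n_mels=26, n_mfcc=13):
    """Compute MFCCs for one audio frame (1-D array of samples)."""
    n_fft = len(frame)
    # Power spectrum of the windowed frame.
    spectrum = np.abs(np.fft.rfft(frame * np.hamming(n_fft))) ** 2
    # Triangular filters spaced evenly on the mel scale.
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        lo, ctr, hi = bins[i - 1], bins[i], bins[i + 1]
        for j in range(lo, ctr):
            fbank[i - 1, j] = (j - lo) / max(ctr - lo, 1)
        for j in range(ctr, hi):
            fbank[i - 1, j] = (hi - j) / max(hi - ctr, 1)
    log_energies = np.log(fbank @ spectrum + 1e-10)
    # DCT-II of the log filterbank energies, keeping the first n_mfcc terms.
    n = np.arange(n_mels)
    dct = np.cos(np.pi * np.outer(np.arange(n_mfcc), n + 0.5) / n_mels)
    return dct @ log_energies
```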
Summary
The increase of musical databases on the Internet and in multimedia systems has brought great demand for music information retrieval (MIR) applications, and especially for automatic analysis of musical databases. [6, 7] have suggested using an approximation to the Kolmogorov distance between two musical pieces as a means to compute clusters of music. They first process the MIDI representation of a music piece to turn it into a string over a finite alphabet. Acoustic music signals are represented using different audio formats, such as WAV, MP3, AAC, or OGG. We use our preprocessing method [16, 17] on the MIDI files, compute the NCD between them using the CompLearn software (http://www.complearn.org), and apply a k-nearest neighbor classifier to predict the root and leaf genre of each MIDI file.
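The final prediction step can be sketched as a distance-based k-NN majority vote over precomputed pairwise distances. A toy pure-Python illustration (the distance values and genre labels are hypothetical; in the paper the distances would be NCDs between MIDI strings):

```python
from collections import Counter

def knn_predict(distances, labels, k=3):
    """Predict a genre label by majority vote among the k training pieces
    nearest to the query, given precomputed distances (e.g. NCDs) from the
    query piece to every training piece."""
    nearest = sorted(range(len(distances)), key=distances.__getitem__)[:k]
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]
```

For k = 1 this reduces to assigning the genre of the single closest training piece.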