Building Indonesian Music Dataset: Collection and Analysis

M Octaviano Pratama, Ermatita Ermatita, Pamela Kareen

doi:10.1109/icimcis53775.2021.9699332

Abstract

We introduce the Indonesian Music Dataset (IMD), a collection of audio features and lyric text features for a thousand Indonesian popular songs, developed for automatic music era classification and other classification tasks. The audio features comprise spectrograms, chroma features, and low-level audio features. The dataset also includes lyric features to support multimodal tasks. It is labeled with eras (by year of publication) from the 1970s to the current era, with mood labels derived from the valence-arousal model (Anger, Sadness, Happiness, and Relax), and with genre labels (Rock, Pop, and Jazz). We also present era, mood, and genre prediction experiments for each modality (audio features and lyric text features) as examples of how the dataset can be used; the benchmark models show positive results.

Full Text