Abstract

Since the chord progression determines the harmony of a piece of music, Automatic Chord Recognition (ACR) from audio signals is a crucial task in Music Information Retrieval (MIR). Various deep learning models have recently been proposed, but few studies have examined their input features. The notes that make up a chord are fundamental notes and their overtones, which sound simultaneously. To model such audio signals efficiently, feature transforms such as the Constant-Q Transform (CQT) are used. However, because the fundamentals and overtones of multiple instruments are superposed in polyphonic music, chords are considered difficult to model even with deep learning. We therefore focus on the structure in which fundamental notes are spaced logarithmically in frequency while their overtones are spaced linearly. In this paper, we propose a feature representation that captures the overtone structure of each fundamental note. Using this representation, we take a data-driven approach and learn chords with a CNN-LSTM model. We evaluated the method on 383 songs with publicly available annotations and achieved performance comparable to existing methods with approximately one-tenth the number of parameters.
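As an illustrative sketch only (not the paper's actual feature extractor), the log/linear structure described above can be expressed as a "harmonic stack": fundamentals are placed on a logarithmic (semitone) grid, and for each fundamental the spectrum is sampled at its linearly spaced integer multiples. All function names, parameters, and the 1/k harmonic weighting below are hypothetical choices for the sketch:

```python
import numpy as np

SR = 22050        # sample rate (Hz)
N_FFT = 8192      # FFT size
N_HARMONICS = 6   # overtones sampled per fundamental

def overtone_features(signal, n_bins=36, fmin=110.0, bins_per_octave=12):
    """Sample linearly spaced overtones for log-spaced fundamentals."""
    spec = np.abs(np.fft.rfft(signal, n=N_FFT))
    freqs = np.fft.rfftfreq(N_FFT, d=1.0 / SR)
    # Fundamentals are geometrically (log) spaced, one per semitone,
    # like CQT bin centers.
    f0s = fmin * 2.0 ** (np.arange(n_bins) / bins_per_octave)
    feat = np.zeros((n_bins, N_HARMONICS))
    for i, f0 in enumerate(f0s):
        for k in range(1, N_HARMONICS + 1):
            # Overtones sit at integer multiples (linear spacing) of f0;
            # down-weight higher harmonics by 1/k to reduce octave ambiguity.
            idx = np.argmin(np.abs(freqs - k * f0))
            feat[i, k - 1] = spec[idx] / k
    return feat, f0s

# A 220 Hz tone with two decaying overtones; the 220 Hz row of the
# feature should carry the strongest harmonic stack.
t = np.arange(SR) / SR
y = sum(np.sin(2 * np.pi * 220.0 * k * t) / k for k in (1, 2, 3))
feat, f0s = overtone_features(y)
best = int(np.argmax(feat.sum(axis=1)))
print(f0s[best])  # fundamental with the strongest overtone stack
```

Each row of `feat` is one fundamental together with its overtone series, so the superposed log/linear structure is made explicit for a downstream model rather than left implicit in a single spectrogram.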

