Automatic music generation lies at the intersection of artificial intelligence and art, and melody harmonization is one of its most significant and challenging tasks. Previous recurrent neural network (RNN)-based work, however, fails to maintain long-term dependencies and neglects the guidance of music theory. In this article, we first devise a universal chord representation with a fixed, small dimension that covers most existing chords and is easy to extend. We then propose RL-Chord, a novel melody harmonization system based on reinforcement learning (RL), to generate high-quality chord progressions. Specifically, a melody-conditional LSTM (CLSTM) model is put forward that learns chord transitions and durations well; on this basis, RL algorithms with three well-designed reward modules are combined to construct RL-Chord. We compare three widely used RL algorithms (i.e., policy gradient, Q-learning, and actor-critic) on the melody harmonization task for the first time and demonstrate the superiority of the deep Q-network (DQN). Furthermore, a style classifier is devised to fine-tune the pretrained DQN-Chord for zero-shot Chinese folk (CF) melody harmonization. Experimental results demonstrate that the proposed model can generate harmonious and fluent chord progressions for diverse melodies. Quantitatively, DQN-Chord outperforms the compared methods on multiple evaluation metrics, including chord histogram similarity (CHS), chord tonal distance (CTD), and melody-chord tonal distance (MCTD).
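The abstract does not specify the encoding details, but a minimal sketch of what a fixed-dimension, extensible chord representation could look like is given below. The root/pitch-class layout, dimension of 24, and quality table here are illustrative assumptions, not the paper's actual scheme: new chord qualities are added as rows in the table, so the vector size never changes.

```python
import numpy as np

# Hypothetical fixed-dimension chord encoding (an illustrative assumption,
# not the paper's actual scheme): 12-dim one-hot root + 12-dim multi-hot
# sounding pitch classes. Extending coverage to new chord qualities only
# requires adding entries to QUALITY_INTERVALS; the vector stays 24-dim.

QUALITY_INTERVALS = {          # semitone offsets from the root
    "maj":  (0, 4, 7),
    "min":  (0, 3, 7),
    "dom7": (0, 4, 7, 10),
    "dim":  (0, 3, 6),
}

def encode_chord(root: int, quality: str) -> np.ndarray:
    """Encode a chord as a fixed 24-dim vector: [root one-hot | pitch-class multi-hot]."""
    vec = np.zeros(24, dtype=np.float32)
    vec[root] = 1.0                                   # root one-hot (0 = C, ..., 11 = B)
    for interval in QUALITY_INTERVALS[quality]:
        vec[12 + (root + interval) % 12] = 1.0        # sounding pitch classes
    return vec

print(encode_chord(0, "maj"))   # C major triad
print(encode_chord(9, "min"))   # A minor triad
```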