Composing Guzheng tune progressions involves intricately harmonizing transitions between melodic motifs, and navigating the vast creative possibility space to uncover musically consistent elaborations is challenging. We develop a specialized large-scale long short-term memory (LSTM) model for generating musically consistent Guzheng tune transitions. First, we propose novel firefly algorithm (FA) enhancements, including adaptive diversity preservation and adaptive swim parameters, to improve exploration of the vast combinatorial space encountered when generating tune transitions. Then, we develop a stacked LSTM architecture with residual connections and conditioning embedding vectors that leverages long-range temporal dependencies in Guzheng music; concise Guzheng-specific melody embeddings are learned without supervision by a variational autoencoder, encapsulating unique harmonic signatures from performance descriptors to provide style guidance. Finally, we use LSTM networks to build generative adversarial models that enable realistic synthesis and evaluation of Guzheng tune transitions. We assemble an extensive corpus of more than 10 hours of solo Guzheng recordings spanning 230 musical pieces, 130 distinguished performing artists, and over 600 audio tracks, and conduct a thorough analysis of the data. Comparative assessments against strong baselines, using systematic musical metrics and professional listener judgments, validate significant improvements in generation fidelity. Our model achieves a 63% reduction in reconstruction error compared to standard FA optimization after 1000 iterations. It also outperforms baselines in capturing characteristic motifs, maintaining modal coherence with under 2% dissonant pitch errors, and preserving desired rhythmic cadences. User studies further confirm the superior naturalness, novelty, and stylistic faithfulness of the generated tune transitions, with ratings approaching those of real data.
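For context, the classical firefly update that the proposed FA enhancements build on moves each candidate toward brighter (fitter) neighbors with an attractiveness that decays with distance, plus a random perturbation. The sketch below is a minimal illustration only, not the paper's implementation: the decaying step size standing in for "adaptive swim parameters" and the re-seeding rule standing in for "adaptive diversity preservation" are assumptions, and all names (`alpha0`, `div_thresh`, etc.) are hypothetical.

```python
import numpy as np

def adaptive_firefly(fitness, dim, n_fireflies=25, iters=1000,
                     beta0=1.0, gamma=1.0, alpha0=0.5, div_thresh=1e-3,
                     seed=None):
    """Minimal firefly optimizer (minimization) with an adaptive
    randomization step and a simple diversity-preservation rule."""
    rng = np.random.default_rng(seed)
    pop = rng.uniform(-1.0, 1.0, size=(n_fireflies, dim))
    vals = np.array([fitness(x) for x in pop])
    for t in range(iters):
        alpha = alpha0 * (1.0 - t / iters)            # adaptive step ("swim") size
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if vals[j] < vals[i]:                  # firefly j is "brighter"
                    r2 = np.sum((pop[i] - pop[j]) ** 2)
                    beta = beta0 * np.exp(-gamma * r2) # attractiveness decays with distance
                    pop[i] += beta * (pop[j] - pop[i]) + alpha * rng.normal(size=dim)
                    vals[i] = fitness(pop[i])
        # diversity preservation: re-seed the worst half if the swarm collapses
        if pop.std(axis=0).mean() < div_thresh:
            worst = np.argsort(vals)[n_fireflies // 2:]
            pop[worst] = rng.uniform(-1.0, 1.0, size=(len(worst), dim))
            vals[worst] = [fitness(x) for x in pop[worst]]
    best = int(np.argmin(vals))
    return pop[best], vals[best]
```

As a toy usage check, `adaptive_firefly(lambda x: np.sum(x**2), dim=8)` should converge near the zero vector; the reconstruction-error objective used in the paper would replace this quadratic.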
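Similarly, one plausible realization of the "stacked LSTM with residual connections and conditioning embedding vectors" is sketched below in PyTorch. The VAE-derived style embedding is represented by a generic `style` argument, and all dimensions (`d_model`, `d_style`, `n_layers`) are illustrative assumptions rather than the paper's exact architecture.

```python
import torch
import torch.nn as nn

class ResidualConditionedLSTM(nn.Module):
    """Stacked LSTM sketch: residual connections between layers and a
    style-conditioning vector concatenated to the input at every step."""
    def __init__(self, n_pitches=128, d_model=256, d_style=32, n_layers=3):
        super().__init__()
        self.note_emb = nn.Embedding(n_pitches, d_model)
        self.layers = nn.ModuleList([
            nn.LSTM(d_model + d_style, d_model, batch_first=True)
            for _ in range(n_layers)
        ])
        self.out = nn.Linear(d_model, n_pitches)

    def forward(self, notes, style):
        # notes: (batch, time) pitch tokens; style: (batch, d_style) VAE embedding
        h = self.note_emb(notes)
        cond = style.unsqueeze(1).expand(-1, notes.size(1), -1)
        for lstm in self.layers:
            out, _ = lstm(torch.cat([h, cond], dim=-1))
            h = h + out                       # residual connection around each layer
        return self.out(h)                    # next-note logits per time step
```

Broadcasting the style vector across time is one common conditioning choice; the residual sums let gradients bypass each LSTM layer, which helps the stack exploit long-range temporal dependencies.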