Abstract

A new architecture for melody extraction from polyphonic music is explored in this paper. Specifically, chromagrams are first constructed through the harmonic pitch class profile (HPCP) to measure melody salience, and chroma-level notes are tracked by dynamic programming. Then, note detection is performed according to chroma-level note differences between adjacent frames. Next, note pitches are coarsely mapped by maximizing the salience of each note, followed by fine tuning to fit the dynamic pitch variation within each note. Finally, voicing detection determines the presence of melody according to the salience of the fine-tuned notes. Note-level pitch mapping and fine tuning avoid pitch shifts between different octaves or between notes within a single note's duration. Several experiments were conducted to evaluate the proposed method. The results show that it can track the dynamic pitch changes within each note and performs well at different signal-to-accompaniment ratios; however, its performance on deep vibratos and pitch glides still needs improvement.
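
The paper itself provides no code; the following is a minimal sketch of the first two stages described above (chromagram construction and chroma-level note tracking by dynamic programming), using librosa's chroma_cqt as a stand-in for HPCP. The function names, the switch penalty, and the boundary rule are illustrative assumptions, not the authors' implementation.

    import numpy as np
    import librosa

    def track_chroma_notes(chroma, switch_penalty=0.1):
        """Viterbi-style dynamic programming over the 12 chroma bins.

        Staying in the same bin is free; switching bins pays
        `switch_penalty` (an assumed value, not from the paper).
        """
        n_bins, n_frames = chroma.shape
        score = np.zeros((n_bins, n_frames))
        back = np.zeros((n_bins, n_frames), dtype=int)
        score[:, 0] = chroma[:, 0]
        same_bin = np.eye(n_bins, dtype=bool)
        for t in range(1, n_frames):
            # trans[i, j]: accumulated score of arriving at bin i from bin j
            trans = score[:, t - 1][None, :] - switch_penalty * (~same_bin)
            back[:, t] = trans.argmax(axis=1)
            score[:, t] = chroma[:, t] + trans.max(axis=1)
        # backtrack the highest-salience chroma path
        path = np.empty(n_frames, dtype=int)
        path[-1] = score[:, -1].argmax()
        for t in range(n_frames - 1, 0, -1):
            path[t - 1] = back[path[t], t]
        return path

    def detect_note_boundaries(path):
        # note detection: declare a boundary wherever the tracked
        # chroma bin differs between adjacent frames
        return np.flatnonzero(np.diff(path) != 0) + 1

    y, sr = librosa.load(librosa.ex('trumpet'))
    chroma = librosa.feature.chroma_cqt(y=y, sr=sr)  # stand-in for HPCP
    path = track_chroma_notes(chroma)
    boundaries = detect_note_boundaries(path)

The later stages (octave mapping by salience maximization, fine tuning, and voicing detection) would operate on the segments returned by detect_note_boundaries; they are omitted here because the paper's salience and threshold definitions are needed to implement them faithfully.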

Highlights

  • Melody, as the essence of music, plays an important role in understanding music semantics and distinguishing different music pieces

  • Experimental results show that the proposed architecture can extract melody from polyphonic music

  • The chroma-level notes are first tracked by dynamic programming

Summary

Introduction

Melody, as the essence of music, plays an important role in understanding music semantics and distinguishing different music pieces. Source separation-based methods employ spectrum decomposition schemes to separate the lead voice from the mixed recording, then estimate and track the pitch sequence of the extracted source. Data-driven classification-based methods formulate melody extraction as a classification problem, where pitches are quantized to specific levels (such as MIDI pitch numbers) [8,9]. These methods need few priors, but they often suffer from quantization errors or over-fitting when the training dataset is small.
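
For concreteness, the MIDI quantization mentioned above maps a frequency f in Hz to a pitch number via 69 + 12·log2(f/440), rounded to the nearest integer; a minimal sketch follows (the function name is illustrative):

    import numpy as np

    def hz_to_midi_number(f_hz):
        # A4 = 440 Hz maps to MIDI 69; rounding to the nearest integer
        # is the quantization step that can introduce errors of up to
        # half a semitone
        return int(round(69 + 12 * np.log2(f_hz / 440.0)))

    print(hz_to_midi_number(440.0))   # 69 (A4)
    print(hz_to_midi_number(261.63))  # 60 (C4, middle C)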
