Abstract

This paper presents a novel approach to detecting onsets in music audio files. We use a supervised learning algorithm to classify spectrogram frames extracted from digital audio as being onsets or non-onsets. Frames classified as onsets are then treated with a simple peak-picking algorithm based on a moving average. We present two versions of this approach. The first version uses a single neural network classifier. The second version combines the predictions of several networks trained using different hyperparameters. We describe the details of the algorithm and summarize the performance of both variants on several datasets. We also examine our choice of hyperparameters by describing results of cross-validation experiments done on a custom dataset. We conclude that a supervised learning approach to note onset detection performs well and warrants further investigation.
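To make the peak-picking step more concrete, the following is a minimal sketch of one way a moving-average threshold could be applied to per-frame onset scores. It is an illustration under stated assumptions, not the authors' implementation: the function name `pick_onsets`, the `window` and `offset` parameters, and the example scores are all hypothetical.

```python
def pick_onsets(activation, window=5, offset=0.1):
    """Return indices of frames that are local maxima and exceed a
    moving-average threshold.

    `window` (frames averaged around each position) and `offset`
    (margin added to the local mean) are illustrative assumptions.
    """
    n = len(activation)
    half = window // 2
    onsets = []
    for i in range(n):
        # Moving average over a window centred on frame i, clipped at the edges.
        lo = max(0, i - half)
        hi = min(n, i + half + 1)
        local_mean = sum(activation[lo:hi]) / (hi - lo)

        # A frame is a candidate only if it is a local maximum.
        is_local_max = (
            (i == 0 or activation[i] > activation[i - 1])
            and (i == n - 1 or activation[i] >= activation[i + 1])
        )
        if is_local_max and activation[i] > local_mean + offset:
            onsets.append(i)
    return onsets


# Hypothetical classifier outputs, one score per spectrogram frame.
scores = [0.0, 0.1, 0.9, 0.2, 0.0, 0.1, 0.0, 0.8, 0.7, 0.1]
print(pick_onsets(scores))  # [2, 7]
```

Comparing each frame against a local average rather than a fixed global threshold lets the picker adapt to passages where the scores are uniformly high or low.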

Highlights

  • This paper is concerned with finding the onset times of notes in music audio

  • We focused on the short-time Fourier transform (STFT) and the constant-Q transform [13]

  • We have presented an algorithm that adds a supervised learning step to the basic onset detection framework of signal transformation, feature enhancement, and peak picking
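As an illustration of the signal-transformation stage named in the highlights, the sketch below computes STFT magnitude frames from a signal. It is only a sketch under stated assumptions: the function name `stft_magnitudes`, the frame size, the hop size, and the Hann window are hypothetical choices, and a direct DFT is used for clarity where practical code would use an FFT library.

```python
import cmath
import math


def stft_magnitudes(signal, frame_size=8, hop=4):
    """Return a list of magnitude spectra, one per windowed frame.

    Frame size, hop, and the Hann window are illustrative assumptions.
    """
    # Periodic Hann window to taper each frame.
    window = [
        0.5 - 0.5 * math.cos(2 * math.pi * n / frame_size)
        for n in range(frame_size)
    ]
    frames = []
    for start in range(0, len(signal) - frame_size + 1, hop):
        chunk = [signal[start + n] * window[n] for n in range(frame_size)]
        mags = []
        # Keep only the non-negative frequency bins of a real signal.
        for k in range(frame_size // 2 + 1):
            coeff = sum(
                chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n in range(frame_size)
            )
            mags.append(abs(coeff))
        frames.append(mags)
    return frames


# A sinusoid with exactly two cycles per 8-sample frame.
tone = [math.sin(2 * math.pi * 2 * n / 8) for n in range(16)]
spectrogram = stft_magnitudes(tone)
```

Each inner list is one spectrogram frame of the kind a classifier could take as input; for the test tone, the energy concentrates around frequency bin 2.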


Summary

Introduction

This paper is concerned with finding the onset times of notes in music audio. Though conceptually simple, this task is deceptively difficult to perform automatically with a computer. For example, the naive approach of finding amplitude peaks in the raw waveform fails except for trivially easy cases such as monophonic percussive instruments. Onset detection plays a role in a number of important music information retrieval (MIR) tasks and warrants research. In the case of classification, onset locations could be used to significantly reduce the number of frame-level features retained. In the case of music fingerprinting, onset times could be used as the basis of a robust fingerprint vector.

