High-Quality and Reproducible Automatic Drum Transcription from Crowdsourced Data

Mickaël Zehren, Marco Alunno, Paolo Bientinesi

doi:10.3390/signals4040042

Open Access

https://doi.org/10.3390/signals4040042

Journal: Signals
Publication Date: Nov 10, 2023
Citations: 1
License type: CC BY 4.0

Affiliation: Umeå University, Universidad EAFIT

Abstract

Within the broad problem known as automatic music transcription, we considered the specific task of automatic drum transcription (ADT). This is a complex task that has recently shown significant advances thanks to deep learning (DL) techniques. Most notably, massive amounts of labeled data obtained from crowds of annotators have made it possible to implement large-scale supervised learning architectures for ADT. In this study, we explored the untapped potential of these new datasets by addressing three key points: First, we reviewed recent trends in DL architectures and focused on two techniques, self-attention mechanisms and tatum-synchronous convolutions. Then, to mitigate the noise and bias that are inherent in crowdsourced data, we extended the training data with additional annotations. Finally, to quantify the potential of the data, we compared many training scenarios by combining up to six different datasets, including zero-shot evaluations. Our findings revealed that crowdsourced datasets outperform previously utilized datasets, and regardless of the DL architecture employed, they are sufficient in size and quality to train accurate models. By fully exploiting this data source, our models produced high-quality drum transcriptions, achieving state-of-the-art results. Thanks to this accuracy, our work can be more successfully used by musicians (e.g., to learn new musical pieces by reading, or to convert their performances to MIDI) and researchers in music information retrieval (e.g., to retrieve information from the notes instead of audio, such as the rhythm or structure of a piece).
