Abstract
Automatic music transcription (AMT) aims to convert raw audio signals into symbolic music. It is a highly challenging task in signal processing and artificial intelligence, and it holds significant application value in music information retrieval (MIR). Existing methods based on convolutional neural networks (CNNs) often fall short in capturing the time-frequency characteristics of audio signals and tend to overlook the interdependencies between notes when processing polyphonic piano music with multiple simultaneous notes. To address these issues, we propose a dual attention feature extraction and multi-scale graph attention network (DAFE-MSGAT). Specifically, we design a dual attention feature extraction module (DAFE) to enhance the frequency-domain and time-domain features of the audio signal, and we use a long short-term memory network (LSTM) to capture its temporal dynamics. We further introduce a multi-scale graph attention network (MSGAT), which leverages various implicit relationships between notes to strengthen the interaction among them. Experimental results demonstrate that our model achieves high accuracy in detecting note onsets and offsets on public datasets. On both frame-level and note-level metrics, DAFE-MSGAT achieves performance comparable to state-of-the-art methods, demonstrating strong transcription capability.
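To make the described pipeline concrete, the sketch below shows one plausible arrangement of the three stages named in the abstract (dual attention feature extraction, LSTM temporal modelling, and attention over notes). It is not the authors' implementation: all module names, layer sizes, the 88-pitch output head, and the use of standard multi-head attention as a stand-in for the multi-scale graph attention are assumptions made purely for illustration.

```python
# Illustrative sketch only; NOT the DAFE-MSGAT reference implementation.
# Shapes, layer sizes, and module names are assumptions for demonstration.
import torch
import torch.nn as nn

class DualAttentionFeatureExtractor(nn.Module):
    """Hypothetical DAFE-like block: a frequency (channel) gate and a
    temporal gate applied to CNN features of an input spectrogram."""
    def __init__(self, channels: int):
        super().__init__()
        self.conv = nn.Conv2d(1, channels, kernel_size=3, padding=1)
        # frequency attention: squeeze-and-excitation style channel gate
        self.freq_gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(channels, channels), nn.Sigmoid())
        # temporal attention: one gate value per time-frequency position
        self.time_gate = nn.Sequential(
            nn.Conv2d(channels, 1, kernel_size=1), nn.Sigmoid())

    def forward(self, spec):                      # spec: (B, 1, F, T)
        x = torch.relu(self.conv(spec))           # (B, C, F, T)
        x = x * self.freq_gate(x)[..., None, None]
        return x * self.time_gate(x)

class NoteAttention(nn.Module):
    """Stand-in for MSGAT: multi-head attention over the 88 piano pitches,
    letting simultaneous notes exchange information at each frame."""
    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, notes):                     # notes: (B*T, 88, dim)
        out, _ = self.attn(notes, notes, notes)
        return out

class TranscriptionSketch(nn.Module):
    def __init__(self, n_bins=229, n_pitches=88, channels=16, hidden=128):
        super().__init__()
        self.dafe = DualAttentionFeatureExtractor(channels)
        self.lstm = nn.LSTM(channels * n_bins, hidden,
                            batch_first=True, bidirectional=True)
        self.to_notes = nn.Linear(2 * hidden, n_pitches * 8)
        self.note_attn = NoteAttention(dim=8)
        self.head = nn.Linear(8, 1)               # frame-wise activation

    def forward(self, spec):                      # spec: (B, 1, F, T)
        b, _, f, t = spec.shape
        x = self.dafe(spec)                       # enhanced features
        x = x.permute(0, 3, 1, 2).reshape(b, t, -1)
        x, _ = self.lstm(x)                       # temporal modelling
        notes = self.to_notes(x).reshape(b * t, 88, 8)
        notes = self.note_attn(notes)             # note interaction
        return torch.sigmoid(self.head(notes)).reshape(b, t, 88)

# Usage: a 229-bin spectrogram of 100 frames -> per-frame pitch activations.
probs = TranscriptionSketch()(torch.randn(2, 1, 229, 100))  # (2, 100, 88)
```

In the paper itself, the note-interaction stage is a graph attention network operating over multiple implicit relationships between notes; the plain multi-head attention above is used here only to keep the sketch self-contained.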