Abstract

As the foundation of many applications, the multipitch estimation problem has long been a focus of acoustic music processing; however, existing algorithms perform poorly due to its complexity. In this paper, we employ deep learning to address the piano multipitch estimation problem by proposing MPENet, which is based on a novel multimodal sparse incoherent non-negative matrix factorization (NMF) layer. This layer originates from a multimodal NMF problem with a Lorentzian-BlockFrobenius sparsity constraint and an incoherence regularization. Experiments show that MPENet achieves state-of-the-art performance (83.65% F-measure at polyphony level 6) on the RAND subset of the MAPS dataset. MPENet enables NMF to perform online learning and accomplishes multi-label classification using only monophonic samples as training data. In addition, our layer algorithms can easily be modified and redeveloped for a wide variety of problems.
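The abstract builds on non-negative matrix factorization. As background, a minimal single-modality sketch of plain NMF with Lee-Seung multiplicative updates (Frobenius loss only, not the paper's Lorentzian-BlockFrobenius sparsity or incoherence terms) could look like:

```python
import numpy as np

def nmf(V, rank, iters=500, eps=1e-9):
    """Plain NMF: approximate non-negative V (m x n) as W @ H with
    W, H >= 0, via Lee-Seung multiplicative updates on the Frobenius loss."""
    rng = np.random.default_rng(0)
    m, n = V.shape
    W = rng.random((m, rank)) + eps
    H = rng.random((rank, n)) + eps
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Toy spectrogram-like matrix built from two "note templates"
W_true = np.array([[1.0, 0.0], [0.5, 1.0], [0.0, 1.0]])
H_true = np.array([[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])
V = W_true @ H_true

W, H = nmf(V, rank=2)
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(rel_err < 0.1)  # reconstruction is close for this exactly-factorable toy
```

In the audio setting, columns of `V` are short-time magnitude spectra, columns of `W` are learned note templates, and `H` holds their time-varying activations; the paper's layer adds sparsity and incoherence constraints on top of this basic decomposition.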

Highlights

  • Multipitch estimation problem (MPE, cf. [1,2,3,4] and references therein) is the concurrent identification of multiple notes in an acoustic polyphonic music clip, e.g., {C4, D4}, {E0, G2, A5}, {F3, A3, C4, E4, G4, B4, D5}, or other combinations. It is a prerequisite for Automatic Music Transcription (AMT, [5]), Musical Information Retrieval (MIR, [6]), and many other acoustic music processing applications

  • Based on and inspired by the above discussion, we in this paper propose the Multipitch estimation network (MPENet), a deep learning (DL, [17,18,19,20,21]) network enhanced by a novel multimodal sparse incoherent non-negative matrix factorization (NMF) layer (MSI-NMF layer)


Summary

Introduction

Multipitch estimation problem (MPE, cf. [1,2,3,4] and references therein) is the concurrent identification of multiple notes in an acoustic polyphonic music clip. Multipitch estimation network (MPENet): given the decomposition capability of the proposed layer, we employ a "one vs. all" strategy and present a unified deep learning network consisting of a training subnet and a test subnet. Other methods employing a similar idea but different implementations are proposed in [2,4,26,27,28,29]. Such procedures use a fixed dictionary to obtain note activations at test time, so MPE results depend heavily on the learned note templates, i.e., the training samples. Note that the deep learning methods listed here all use music pieces as training data, which means polyphonic information can be accessed and the music language model and classifiers are learned simultaneously
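The fixed-dictionary test procedure described above can be sketched as follows: hold the learned note templates `W` fixed, estimate only the activations `H` for a new frame, and threshold them to obtain a multi-label pitch decision. This is a minimal illustration (plain Frobenius-loss multiplicative updates, a hypothetical two-note dictionary, and an arbitrary threshold of 0.3), not the authors' MSI-NMF layer:

```python
import numpy as np

def activations(V, W, iters=300, eps=1e-9):
    """Given a fixed note-template dictionary W, estimate activations H
    by multiplicative updates on H only (W is not modified)."""
    rng = np.random.default_rng(1)
    H = rng.random((W.shape[1], V.shape[1])) + eps
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
    return H

# Hypothetical spectral templates for two notes (columns of W)
W = np.array([[1.0, 0.1],
              [0.2, 1.0],
              [0.6, 0.3]])

# One test frame synthesized to contain both notes
v = W @ np.array([[1.0], [0.8]])

H = activations(v, W)
detected = H[:, 0] > 0.3  # threshold activations -> multi-label output
print(detected)           # both notes exceed the threshold here
```

Because the dictionary is frozen at test time, the quality of the detected pitches is bounded by how well the learned templates cover the test instrument, which is exactly the dependence on training samples noted above.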

Notation
Backward pass
Findings
Conclusions
Full Text

