Abstract

This paper presents an unsupervised learning algorithm for sparse nonnegative matrix factor time–frequency deconvolution with optimized fractional β-divergence. The β-divergence is a family of cost functions parametrized by a single parameter β. The Itakura–Saito divergence, Kullback–Leibler divergence and least-squares distance are special cases that correspond to β = 0, 1, 2, respectively. This paper presents a generalized algorithm that uses a flexible range of β that includes fractional values. It describes a maximization–minimization (MM) algorithm leading to the development of a fast-converging multiplicative update algorithm with guaranteed convergence. The proposed model operates in the time–frequency domain and decomposes an information-bearing matrix into a two-dimensional deconvolution of factor matrices that represent the spectral dictionary and temporal codes. The deconvolution process has been optimized to yield sparse temporal codes by maximizing the likelihood of the observations. The paper also presents a method to estimate the fractional β value. The method is demonstrated on separating audio mixtures recorded from a single channel. The paper shows that the extraction of the spectral dictionary and temporal codes is significantly more efficient with the proposed algorithm, which subsequently leads to better source separation performance. Experimental tests and comparisons with other factorization methods have been conducted to verify its efficacy.
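As an illustration of the cost family described above, the following minimal sketch evaluates the standard textbook form of the β-divergence for positive scalars and shows how the three named special cases arise. It is not the paper's algorithm: the function names `beta_divergence` and `total_divergence` and the sample values are assumptions made for this example, and the sparsity term and the deconvolution model are not included.

```python
import math


def beta_divergence(x, y, beta):
    """Beta-divergence d_beta(x | y) between two strictly positive scalars.

    Standard definition (an assumption here, not quoted from the paper):
      beta = 0 -> Itakura-Saito divergence
      beta = 1 -> (generalized) Kullback-Leibler divergence
      beta = 2 -> half the squared Euclidean distance
    Any other real beta, including fractional values, uses the general formula.
    """
    if x <= 0 or y <= 0:
        raise ValueError("x and y must be strictly positive")
    if beta == 0:
        ratio = x / y
        return ratio - math.log(ratio) - 1.0
    if beta == 1:
        return x * math.log(x / y) - x + y
    numerator = x ** beta + (beta - 1.0) * y ** beta - beta * x * y ** (beta - 1.0)
    return numerator / (beta * (beta - 1.0))


def total_divergence(X, Y, beta):
    """Sum of element-wise beta-divergences over two equal-length sequences."""
    return sum(beta_divergence(x, y, beta) for x, y in zip(X, Y))


if __name__ == "__main__":
    # Hypothetical data: an "observed" vector and its "model" approximation.
    observed = [1.0, 2.0, 4.0]
    model = [1.5, 2.0, 3.0]
    for beta in (0, 0.5, 1, 1.5, 2):
        print(beta, total_divergence(observed, model, beta))
```

The general formula is undefined at β = 0 and β = 1, which is why those two cases are handled by their limiting expressions; fractional values such as β = 0.5 fall through to the general branch.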

Highlights

  • Blind source separation (BSS) [1,2,3,4,5,6,7,8] is an ill-posed problem that cannot be totally solved without some prior information

  • The analysis is necessary as the issue of sparsity of the temporal codes would undermine the quality of matrix factorization when the β value is inappropriately chosen

  • The proposed algorithm based on matrix factor time–frequency deconvolution is compared to conventional nonnegative matrix factorization (NMF) models

Summary

Introduction

Blind source separation (BSS) [1,2,3,4,5,6,7,8] is an ill-posed problem that cannot be totally solved without some prior information. This means that a certain number of assumptions have to be imposed. The problem of speech quality enhancement is tackled using adaptive and non-adaptive filtering algorithms [9]. The paper [10] proposes rational polynomial functions to replace the original score functions used in standard independent component analysis (ICA).

Sensors 2018, 18, 1371; doi:10.3390/s18051371; www.mdpi.com/journal/sensors

