Abstract

Background

Non-negative matrix factorization (NMF) has been introduced as an important method for mining biological data. Although packages implemented in R and other programming languages currently exist, they either provide only a few optimization algorithms or focus on a specific application field. No complete NMF package exists for the bioinformatics community to perform the various data mining tasks that arise with biological data.

Results

We provide a convenient MATLAB toolbox containing both implementations of various NMF techniques and a variety of NMF-based data mining approaches for analyzing biological data. The data mining approaches implemented within the toolbox include data clustering and bi-clustering, feature extraction and selection, sample classification, missing value imputation, data visualization, and statistical comparison.

Conclusions

A series of analyses, such as molecular pattern discovery, biological process identification, dimension reduction, disease prediction, visualization, and statistical comparison, can be performed using this toolbox.

Highlights

  • Non-negative matrix factorization (NMF) has been introduced as an important method for mining biological data

  • To address the limited data mining functionality and generality of current NMF toolboxes, we propose a general NMF toolbox in MATLAB, implemented in two levels

  • $\ldots + \sum_{i=1}^{N} \left( \frac{\lambda_2}{2} \lVert y_i \rVert_2^2 + \ldots \right)$ subject to $A \ge 0$ if $t_1 = 1$ and $Y \ge 0$ if $t_2 = 1$, where $\alpha_1 \ge 0$ controls the sparsity of the basis vectors; $\alpha_2 \ge 0$ controls the smoothness and the scale of the basis vectors; $\lambda_1 \ge 0$ controls the sparsity of the coefficient vectors; $\lambda_2 \ge 0$ controls the smoothness of the coefficient vectors; and $t_1$ and $t_2$ are boolean variables (0: false, 1: true) that indicate whether non-negativity is enforced on $A$ or $Y$, respectively
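
The parameter roles listed in the last highlight can be made concrete with a small sketch. The objective in that highlight is truncated, so the form used here is an assumption: a squared Frobenius fit term plus l1 (sparsity) and squared l2 (smoothness) penalties on the columns of A and Y. The function names are hypothetical and are not part of the toolbox, which is written in MATLAB; plain Python is used only for illustration.

```python
def regularized_objective(X, A, Y, alpha1=0.0, alpha2=0.0,
                          lambda1=0.0, lambda2=0.0):
    """Evaluate an ASSUMED regularized factorization objective.

    Assumed form (not confirmed by the source text):
        0.5 * ||X - A Y||_F^2
        + sum over columns a_i of A: alpha2/2 * ||a_i||_2^2 + alpha1 * ||a_i||_1
        + sum over columns y_i of Y: lambda2/2 * ||y_i||_2^2 + lambda1 * ||y_i||_1
    Matrices are lists of rows: X is m x n, A is m x k, Y is k x n.
    """
    m, n, k = len(X), len(X[0]), len(Y)

    # Fit term: half the squared Frobenius norm of the residual X - A Y.
    fit = 0.0
    for r in range(m):
        for c in range(n):
            approx = sum(A[r][j] * Y[j][c] for j in range(k))
            fit += (X[r][c] - approx) ** 2
    fit *= 0.5

    # Summing a column-wise l1 or squared l2 norm over all columns touches
    # every entry exactly once, so an entry-wise sum gives the same value.
    penalty_A = sum(alpha2 / 2 * v * v + alpha1 * abs(v) for row in A for v in row)
    penalty_Y = sum(lambda2 / 2 * v * v + lambda1 * abs(v) for row in Y for v in row)
    return fit + penalty_A + penalty_Y


def is_feasible(A, Y, t1=True, t2=True):
    """Check the constraints: A >= 0 only if t1 is set, Y >= 0 only if t2 is set."""
    def nonneg(M):
        return all(v >= 0 for row in M for v in row)
    return (not t1 or nonneg(A)) and (not t2 or nonneg(Y))


if __name__ == "__main__":
    X = [[1.0, 2.0], [3.0, 4.0]]
    A = [[1.0, 0.0], [0.0, 1.0]]  # identity basis, so A Y reproduces X exactly
    Y = [[1.0, 2.0], [3.0, 4.0]]
    print(regularized_objective(X, A, Y, alpha1=1.0, alpha2=2.0, lambda1=0.5))
    print(is_feasible(A, Y))
```

Setting t1 or t2 to false in this sketch simply skips the corresponding sign check, which mirrors how the highlight describes non-negativity as optional for each factor.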
