Abstract
Missing values are prevalent in microarray data. They negatively affect downstream microarray analyses and should therefore be estimated from the known values. We propose a BPCA-iLLS method, which integrates two commonly used missing value estimation methods: Bayesian principal component analysis (BPCA) and local least squares (LLS). The inferior row-average procedure in LLS is replaced with BPCA, and the least squares method is put into an iterative framework. Comparative results show that the proposed method obtains the highest estimation accuracy across all missing rates on different types of testing datasets.
Highlights
Data generated from DNA microarrays is useful for various biological applications; the data is in the form of large matrices
We propose a BPCA-iterated Local Least Squares (iLLS) method, which is an integration of two commonly used missing value estimation methods—Bayesian principal component analysis (BPCA) and local least squares (LLS)
Most recently proposed local methods are based on LLS, including iterated local least squares, weighted local least squares, and iterative bicluster-based least squares
Summary
Data generated from DNA microarrays is useful for various biological applications; the data is in the form of large matrices. Among microarray missing value estimation methods, BPCA and local least squares (LLS) are two of the most widely used approaches. The former is a global approach and the latter a local one. According to one survey [8] of microarray missing value estimation methods, BPCA performs better than LLS on datasets with lower complexity, whereas according to another survey [9], LLS is superior to BPCA on data with dominant local similarity structures. This observation inspires us to integrate the two methods, with the hope of improving estimation accuracy and robustness.
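To make the iterative least squares idea concrete, the following is a minimal sketch in Python with NumPy, not the authors' implementation. All function names (row_mean_fill, iterated_lls) and parameters (k, n_iter, initial_fill) are hypothetical. The initial estimate is pluggable: the paper uses BPCA at this step, but because no BPCA routine is assumed here, the sketch defaults to the row-average fill that the paper replaces. Choosing neighbours by Euclidean distance and running a fixed number of iterations are also assumptions, as is the requirement that every row has at least one observed value.

```python
import numpy as np


def row_mean_fill(X):
    """Stand-in initial estimate (row average); the paper uses BPCA here."""
    filled = X.copy()
    row_means = np.nanmean(X, axis=1)
    rows, cols = np.where(np.isnan(X))
    filled[rows, cols] = row_means[rows]
    return filled


def iterated_lls(X, k=3, n_iter=5, initial_fill=row_mean_fill):
    """Re-estimate missing entries (NaN) by repeated local least squares.

    X            : genes x samples matrix with NaN for missing values
    k            : number of neighbour rows used per target row (assumed)
    n_iter       : fixed number of refinement passes (assumed)
    initial_fill : callable producing a complete starting matrix
    """
    mask = np.isnan(X)
    filled = initial_fill(X)
    for _ in range(n_iter):
        updated = filled.copy()
        for i in np.where(mask.any(axis=1))[0]:
            obs, mis = ~mask[i], mask[i]
            # Rank other rows by distance on the target's observed columns,
            # using the current complete estimate of the matrix.
            dist = np.linalg.norm(filled[:, obs] - filled[i, obs], axis=1)
            dist[i] = np.inf  # never pick the target row itself
            nbrs = np.argsort(dist)[:k]
            A = filled[nbrs][:, obs]  # k x |observed|
            B = filled[nbrs][:, mis]  # k x |missing|
            # Express the target's observed values as a combination of neighbours.
            coef, *_ = np.linalg.lstsq(A.T, filled[i, obs], rcond=None)
            # Apply the same combination to the neighbours' values in the missing columns.
            updated[i, mis] = B.T @ coef
        filled = updated
    return filled
```

A BPCA-based estimator could be passed as initial_fill in place of row_mean_fill, provided it accepts a matrix with NaN entries and returns a complete matrix of the same shape.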