Abstract

For classification problems based on microarray data, the data typically contains a large number of irrelevant and redundant features. In this paper, a new gene selection method is proposed to choose the best subset of features for microarray data with the irrelevant and redundant features removed. We formulate the selection problem as a L1-regularized optimization problem, based on a newly defined linear discriminant analysis criterion. Instead of calculating the mean of the samples, a kernel-based approach is used to estimate the class centroid to define both the between-class separability and the within-class compactness for the criterion. Theoretical analysis indicates that the global optimal solution of the L1-regularized criterion can be reached with a general condition, on which an efficient algorithm is derived to the feature selection problem in a linear time complexity with respect to the number of features and the number of samples. The experimental results on ten publicly available microarray datasets demonstrate that the proposed method performs effectively and competitively compared with state-of-the-art methods.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.