Abstract

Biomedical data such as microarray data are typified by high dimensionality and small sample size. Feature selection (FS) is a predominant technique for finding informative features that can serve as biomarkers in this large volume of data. However, many early studies report acquisition noise, which makes the assumptions used to select features unreliable. In addition, methods that rely on computing mutual information between features tend to generalize poorly. To the best of our knowledge, very few feature selection methods have been proposed to address these two problems together. This paper presents a novel FS method based on a Network of Canonical Correlation Analysis (NCCA), which is robust to acquisition noise and avoids mutual information computation. Two strategies distinguish NCCA from other methods and address the two problems above: 'training with noisy features' and 'maximum correlation and minimum redundancies'. As a result, NCCA converges to an informative feature subset. NCCA has been applied to several types of very high-dimensional biomedical datasets, including microarray, gene expression, and voice signal data. To obtain reliable results, NCCA is evaluated with two classifiers: a neural network (NN) and a support vector machine (SVM). NCCA performs robustly in terms of elapsed time, accuracy, and Mean Square Error (MSE) relative to mutual-information-based methods such as information gain (IG). The computational complexity of NCCA is shown, both theoretically and experimentally, to be lower than that of IG. NCCA is observed to be about 2–19 times faster than IG, depending on the size of the dataset. NCCA is further compared with other standard methods in the literature and is found to outperform them.
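The abstract names the two strategies but does not give the NCCA algorithm itself, so the following is only a rough illustration of the general ideas, not the authors' method. It assumes plain Pearson correlation in place of canonical correlation analysis, Gaussian perturbation as the "noisy features", and a greedy relevance-minus-redundancy score. The function names `abs_corr` and `select_features` and the parameters `noise_std` and `n_noisy` are hypothetical.

```python
import numpy as np

def abs_corr(a, b):
    """Absolute Pearson correlation of two 1-D arrays (0 if either is constant)."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a @ a) * (b @ b))
    return 0.0 if denom == 0 else abs(a @ b) / denom

def select_features(X, y, k, noise_std=0.1, n_noisy=5, seed=0):
    """Greedily pick k columns of X: high correlation with y, low with earlier picks.

    Assumption (not from the paper): relevance is averaged over copies of X
    perturbed with Gaussian noise, so features whose correlation with y
    survives the perturbation are preferred.
    """
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    scale = X.std(axis=0)  # noise is scaled per feature

    relevance = np.zeros(n_features)
    for _ in range(n_noisy):
        X_noisy = X + rng.normal(0.0, noise_std, X.shape) * scale
        relevance += np.array(
            [abs_corr(X_noisy[:, j], y) for j in range(n_features)]
        )
    relevance /= n_noisy

    selected = []
    while len(selected) < min(k, n_features):
        best_j, best_score = -1, -np.inf
        for j in range(n_features):
            if j in selected:
                continue
            # Redundancy: strongest correlation with any already selected feature.
            redundancy = max(
                (abs_corr(X[:, j], X[:, s]) for s in selected), default=0.0
            )
            score = relevance[j] - redundancy
            if score > best_score:
                best_j, best_score = j, score
        selected.append(best_j)
    return selected
```

Calling `select_features(X, y, k=2)` on a matrix that contains two identical informative columns would be expected to return only one of the pair, since the second copy is penalized as redundant once the first has been chosen.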
