Abstract

This paper presents Group-sparse Nonnegative supervised Canonical Correlation Analysis (GNCCA), a novel methodology for identifying discriminative features from multiple feature views. Existing correlation-based methods do not guarantee positive correlations among the selected features and often need a separate feature pre-selection step to reduce redundant features in each feature view. GNCCA attempts to overcome these issues by incorporating (1) a nonnegativity constraint that guarantees positive correlations in the reduced representation and (2) a group-sparsity constraint that allows for simultaneous between-view and within-view feature selection. In particular, GNCCA is designed to emphasize correlations between feature views and class labels so that the selected features provide better class separability. In this work, GNCCA was evaluated on three prostate cancer (CaP) prognosis tasks: (i) distinguishing CaP patients with and without 5-year biochemical recurrence following radical prostatectomy (40 patients) by fusing quantitative features extracted from digitized pathology and proteomics, (ii) predicting in vivo prostate cancer grade for 16 CaP patients by fusing T2w and DCE MRI, and (iii) localizing CaP and benign regions on MR spectroscopy and MRI for 36 patients. For the three tasks, GNCCA identified feature subsets comprising 2%, 1% and 22%, respectively, of the original extracted features. With the same Support Vector Machine (SVM) classifier, these selected features achieved results that were better than or comparable to those obtained using all features. In addition, GNCCA consistently outperformed five state-of-the-art feature selection methods across all three datasets.
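
To make the two constraints concrete, the following is a minimal sketch of how nonnegativity and group sparsity can act together on a weight vector. It is a generic illustration, not the GNCCA optimization itself, which the abstract does not specify. The function name `nonneg_group_shrink`, the threshold `lam`, and the example groups (one group per hypothetical feature view) are all assumptions made for this example.

```python
import numpy as np

def nonneg_group_shrink(w, groups, lam):
    """Clip weights to be nonnegative, then shrink each group by its L2 norm.

    Illustrative only: `groups` is a list of index lists (assumed here to be
    one group per feature view) and `lam` is a hypothetical threshold.
    """
    # Nonnegativity: negative weights are set to zero.
    w = np.maximum(np.asarray(w, dtype=float), 0.0)
    out = np.zeros_like(w)
    for idx in groups:
        idx = np.asarray(idx)
        norm = np.linalg.norm(w[idx])
        # Group sparsity: a group whose norm is at most lam is dropped whole;
        # otherwise every weight in the group is scaled down by the same factor.
        if norm > lam:
            out[idx] = (1.0 - lam / norm) * w[idx]
    return out

# Two hypothetical groups: a strong one and a weak one.
w = np.array([3.0, 4.0, -1.0, 0.1, 0.2])
groups = [[0, 1], [2, 3, 4]]
shrunk = nonneg_group_shrink(w, groups, lam=1.0)
print(shrunk)
```

In this toy input the first group survives with reduced weights while the second group is removed entirely, which is the kind of whole-group selection a group-sparsity penalty is meant to encourage.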
