Abstract

This paper is concerned with pattern recognition for 2-class problems in a High Dimension Low Sample Size (HDLSS) setting. The proposed method is based on canonical correlations between the predictors X and the responses Y. The paper proposes a modified version of the canonical correlation matrix $\Sigma_X^{-1/2}\Sigma_{XY}\Sigma_Y^{-1/2}$ that is suitable for discrimination with class labels Y in an HDLSS context. The modified canonical correlation matrix yields ranking vectors for variable selection, a discriminant direction, and a rule that is essentially equivalent to the naive Bayes rule. The paper examines the asymptotic behavior of the ranking vectors and the discriminant direction and gives precise conditions for HDLSS consistency in terms of the growth rates of the dimension and the sample size. Feature selection that uses the discriminant direction as the ranking vector is shown to work efficiently in simulations and in applications to real HDLSS data.
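
To make the link between a discriminant direction, a ranking vector, and the naive Bayes rule concrete, the following Python sketch may help. It is not the paper's estimator: the abstract does not spell out the modification of the canonical correlation matrix, so the sketch assumes the standard naive Bayes (diagonal) setting, in which only the per-variable pooled variances are used and the covariance matrix is never inverted. The function names `naive_bayes_direction`, `rank_variables`, and `classify`, the 0/1 label coding, and the toy data are all hypothetical choices made for illustration.

```python
import numpy as np


def naive_bayes_direction(X, y):
    """Return the diagonal (naive Bayes) discriminant direction and class midpoint.

    X is an (n, d) array of predictors; y is a length-n array of labels in {0, 1}.
    Assumption: every variable has nonzero pooled within-class variance.
    """
    X0, X1 = X[y == 0], X[y == 1]
    n0, n1 = len(X0), len(X1)
    mean0, mean1 = X0.mean(axis=0), X1.mean(axis=0)
    # Pooled within-class variance of each variable (diagonal of the covariance only).
    pooled_var = (((X0 - mean0) ** 2).sum(axis=0)
                  + ((X1 - mean1) ** 2).sum(axis=0)) / (n0 + n1 - 2)
    # Mean difference scaled by the inverse diagonal covariance.
    direction = (mean1 - mean0) / pooled_var
    midpoint = (mean0 + mean1) / 2
    return direction, midpoint


def rank_variables(direction):
    """Order variable indices by decreasing absolute entry of the direction."""
    return np.argsort(-np.abs(direction), kind="stable")


def classify(X_new, direction, midpoint, keep):
    """Assign class 1 when the projection onto the kept variables is positive."""
    scores = (X_new[:, keep] - midpoint[keep]) @ direction[keep]
    return (scores > 0).astype(int)


# Toy HDLSS-style data (hypothetical): d = 500 variables, 10 samples per class,
# with only the first 5 variables carrying a mean shift between the classes.
rng = np.random.default_rng(0)
d, n_per_class = 500, 10
y = np.repeat([0, 1], n_per_class)
X = rng.normal(size=(2 * n_per_class, d))
X[y == 1, :5] += 3.0

direction, midpoint = naive_bayes_direction(X, y)
keep = rank_variables(direction)[:5]        # indices of the 5 top-ranked variables
pred = classify(X, direction, midpoint, keep)
```

Because the direction is computed coordinate by coordinate, it remains well defined when the dimension exceeds the sample size, which is the situation the abstract targets. How many top-ranked variables to keep is left open here and would need its own selection criterion.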
