Abstract

Brain imaging genetics aims to uncover associations between genetic markers and neuroimaging quantitative traits. Sparse canonical correlation analysis (SCCA) can discover bi-multivariate associations and select relevant features, and is becoming popular in imaging genetics studies. The ℓ1-norm function is not only convex but also singular at the origin, which is a necessary condition for sparsity. Thus most SCCA methods impose the ℓ1-norm on individual features or on structured groups of features to pursue the corresponding sparsity. However, the ℓ1-norm penalty over-penalizes large coefficients and may incur estimation bias. A number of non-convex penalties have been proposed to reduce this estimation bias in regression tasks, but their use in SCCA remains largely unexplored. In this paper, we design a unified non-convex SCCA model, based on seven non-convex penalty functions, to achieve unbiased estimation and stable feature selection simultaneously. We also propose an efficient optimization algorithm. The proposed method obtains both higher correlation coefficients and better canonical loading patterns. In particular, the SCCA methods with non-convex penalties discover a strong association between the APOE e4 SNP rs429358 and the hippocampus region of the brain. Both are Alzheimer’s disease related biomarkers, indicating the potential and power of the non-convex methods in brain imaging genetics.

Highlights

  • The CCA technique was introduced several decades ago [24]

  • The L1-S2CCA and Smoothly Clipped Absolute Deviation (SCAD) methods identify a weak signal from the parahippocampal gyrus, which was previously reported as an early biomarker of AD [54]

  • We propose a unified non-convex sparse canonical correlation analysis (SCCA) model and an efficient optimization algorithm based on a family of non-convex penalty functions

Introduction

The CCA technique was introduced several decades ago [24]. However, CCA performs well only when the number of observations is larger than the combined number of features of the two views. To induce sparsity while reducing estimation bias, a number of non-convex penalties have been proposed. These include the γ-norm (0 < γ < 1) penalty [42], the Geman penalty [43], the Smoothly Clipped Absolute Deviation (SCAD) penalty [38], the Laplace penalty [44], the Minimax Concave Penalty (MCP) [45], the Exponential-Type Penalty (ETP) [46], and the Logarithm penalty [47]. These non-convex functions have attractive theoretical properties, as they are all singular at the origin and leave larger coefficients unpenalized. Although they have achieved great success in generalized linear models (GLMs), applying them to SCCA models to achieve sparsity and unbiased estimation simultaneously remains an unexplored topic.
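To make the "unpenalized large coefficients" property concrete, the following is a minimal NumPy sketch of two of the penalties named above, SCAD and MCP, using their standard textbook definitions (Fan & Li's SCAD with the conventional a = 3.7, and Zhang's MCP with a concavity parameter γ). The parameter defaults are illustrative choices, not values taken from this paper; the paper's own SCCA formulation is not reproduced here.

```python
import numpy as np

def scad_penalty(t, lam=1.0, a=3.7):
    """SCAD penalty, applied elementwise to |t|.

    Grows like lam*|t| near the origin (singular at 0, so it induces
    sparsity), then flattens: for |t| > a*lam the penalty is constant,
    so large coefficients are left unpenalized.
    """
    t = np.abs(np.asarray(t, dtype=float))
    small = t <= lam
    mid = (t > lam) & (t <= a * lam)
    return np.where(small, lam * t,
           np.where(mid,
                    (2 * a * lam * t - t**2 - lam**2) / (2 * (a - 1)),
                    lam**2 * (a + 1) / 2))

def mcp_penalty(t, lam=1.0, gamma=3.0):
    """Minimax Concave Penalty (MCP), applied elementwise to |t|.

    Quadratically tapers from lam*|t| down to a constant gamma*lam^2/2
    once |t| exceeds gamma*lam, again leaving large coefficients alone.
    """
    t = np.abs(np.asarray(t, dtype=float))
    return np.where(t <= gamma * lam,
                    lam * t - t**2 / (2 * gamma),
                    gamma * lam**2 / 2)
```

For a small coefficient (|t| = 0.5, λ = 1) SCAD matches the ℓ1 penalty exactly (0.5), while for a large one (|t| = 10) SCAD charges only the constant λ²(a+1)/2 = 2.35 versus 10 under ℓ1, which is the source of the reduced estimation bias discussed above.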
