Abstract

Canonical correlation analysis (CCA) has become a key tool for population neuroimaging, allowing investigation of associations between many imaging and non-imaging measurements. As age, sex and other variables are often a source of variability not of direct interest, previous work has used CCA on residuals from a model that removes these effects, then proceeded directly to permutation inference. We show that a simple permutation test, as typically used to identify significant modes of shared variation on such data adjusted for nuisance variables, produces inflated error rates. The reason is that residualisation introduces dependencies among the observations that violate the exchangeability assumption. Even in the absence of nuisance variables, however, a simple permutation test for CCA also leads to excess error rates for all canonical correlations other than the first. The reason is that a simple permutation scheme does not ignore the variability already explained by previous canonical variables. Here we propose solutions for both problems: in the case of nuisance variables, we show that transforming the residuals to a lower dimensional basis where exchangeability holds results in a valid permutation test; for more general cases, with or without nuisance variables, we propose estimating the canonical correlations in a stepwise manner, removing at each iteration the variance already explained, while accommodating different numbers of variables on the two sides. We also discuss how to address the multiplicity of tests, proposing an admissible test that is not conservative, and provide a complete algorithm for permutation inference for CCA.
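To make the residualisation problem concrete, the following is a minimal sketch of one way to build a lower dimensional basis in which residuals are exchangeable. The abstract does not spell out the construction; the one assumed here takes a semi-orthogonal matrix `Q` from the eigendecomposition of the residual-forming matrix `R`, so that `Q @ Q.T` equals `R` and `Q.T @ Q` is the identity. The names `residual_basis`, `Z`, `Y` and `Y_tilde` are illustrative, not taken from the paper.

```python
import numpy as np

def residual_basis(Z):
    """Return (R, Q) for nuisance matrix Z (N x R_z).

    R = I - Z Z^+ is the residual-forming matrix (N x N, rank N - rank(Z)).
    Q has N - rank(Z) orthonormal columns spanning the range of R, so that
    Q @ Q.T == R and Q.T @ Q == I (assumed construction, see lead-in).
    """
    N = Z.shape[0]
    R = np.eye(N) - Z @ np.linalg.pinv(Z)
    R = (R + R.T) / 2                      # enforce exact symmetry for eigh
    evals, evecs = np.linalg.eigh(R)       # eigenvalues are 0 or 1 (projection)
    Q = evecs[:, evals > 0.5]              # keep eigenvectors with eigenvalue 1
    return R, Q

rng = np.random.default_rng(0)
N = 20
Z = np.column_stack([np.ones(N), rng.normal(size=(N, 2))])  # intercept + 2 nuisance variables
Y = rng.normal(size=(N, 4))

R, Q = residual_basis(Z)
Y_resid = R @ Y        # ordinary residuals: N rows, but their covariance is R, not I
Y_tilde = Q.T @ Y      # same as Q.T @ R @ Y; N - rank(Z) rows with covariance Q.T R Q = I
```

If the rows of `Y` have independent, identically distributed errors, the rows of `R @ Y` have covariance proportional to `R`, which is not diagonal, whereas the rows of `Q.T @ Y` have covariance proportional to the identity. Permuting rows of `Y_tilde` rather than `Y_resid` is therefore the step that restores exchangeability in this sketch.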

Highlights

  • Canonical correlation analysis (CCA) (Jordan, 1875; Hotelling, 1936) is a multivariate method that aims at reducing the correlation structure between two sets of variables to the simplest possible form through linear transformations of the variables within each set

  • Here we show that simple implementations of permutation inference for CCA are inadequate on four different grounds

  • We propose that inference that considers multiple canonical correlations should use a closed testing procedure, which is more powerful than the usual correction in permutation tests based on the distribution of the maximum statistic; the procedure ensures a monotonic relationship between p-values and canonical correlations
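The stepwise estimation and the monotonic p-values mentioned above can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's complete algorithm: it has no nuisance variables beyond mean-centering, it uses the largest canonical correlation of the reduced problem as the test statistic, and it enforces monotonicity with a cumulative maximum over the p-values. The function names `cca` and `stepwise_perm` are hypothetical.

```python
import numpy as np

def cca(Y, X):
    """Canonical correlations and canonical variables via QR + SVD.

    Returns r (length min(P, Q)), U (N x P) and V (N x Q). The larger side
    is completed with the remaining singular vectors, so that both sides
    keep all of their columns.
    """
    Y = Y - Y.mean(axis=0)
    X = X - X.mean(axis=0)
    Qy, _ = np.linalg.qr(Y)
    Qx, _ = np.linalg.qr(X)
    L, r, Mt = np.linalg.svd(Qy.T @ Qx, full_matrices=True)
    return r, Qy @ L, Qx @ Mt.T

def stepwise_perm(Y, X, n_perm=200, seed=0):
    """Permutation p-values for each canonical correlation, tested stepwise."""
    rng = np.random.default_rng(seed)
    N = Y.shape[0]
    r, U, V = cca(Y, X)
    K = len(r)
    counts = np.ones(K)                    # the unpermuted data count as one permutation
    for _ in range(n_perm):
        idx = rng.permutation(N)
        for k in range(K):
            # Drop the first k canonical variables on both sides, so variance
            # already explained does not enter the test of the k-th correlation.
            r_perm, _, _ = cca(U[idx, k:], V[:, k:])
            counts[k] += r_perm[0] >= r[k]
    p = counts / (n_perm + 1)
    # A later correlation cannot be declared significant unless all earlier
    # ones are: the cumulative maximum makes p-values non-decreasing in k.
    return r, np.maximum.accumulate(p)

# Toy data with a single planted mode shared between the two sets.
rng = np.random.default_rng(1)
N = 100
z = rng.normal(size=N)
Y = rng.normal(size=(N, 3))
X = rng.normal(size=(N, 4))
Y[:, 0] = z + 0.1 * rng.normal(size=N)
X[:, 0] = z + 0.1 * rng.normal(size=N)

r, p = stepwise_perm(Y, X, n_perm=200)
```

In this toy example the two sets have different numbers of variables (3 and 4), and only the first canonical correlation reflects a planted association, so only its p-value is expected to be small.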


Summary

Introduction

Canonical correlation analysis (CCA) (Jordan, 1875; Hotelling, 1936) is a multivariate method that aims at reducing the correlation structure between two sets of variables to the simplest possible form (hence the name "canonical") through linear transformations of the variables within each set. After a peak in use from the late 1970s until the mid-1980s, the method has recently regained popularity, presumably thanks to its ability to uncover latent, common factors underlying associations between multiple measurements. This is relevant to recent brain imaging research that uses high-dimensional phenotyping and investigates between-subject variability across multiple domains. It contrasts with the initial studies that introduced CCA to the field (Friston et al., 1995, 1996; Worsley, 1997; Friman et al., 2001, 2002, 2003) for the investigation of signal variation in functional magnetic resonance imaging (fMRI) time series.
