Abstract
In recent years, technological advances have produced a wealth of very high-dimensional data, and combining high-dimensional information from multiple sources is becoming increasingly important in an expanding range of scientific disciplines. Partial Least Squares Correlation (PLSC) is a frequently used method for multivariate multimodal data integration. It is, however, computationally expensive in applications involving large numbers of variables, as required, for example, in genetic neuroimaging. To handle high-dimensional problems, dimension reduction can be implemented as a pre-processing step. We propose a new approach that incorporates Random Projection (RP) for dimensionality reduction into PLSC to efficiently solve high-dimensional multimodal problems such as genotype-phenotype associations. We name our new method PLSC-RP. Using simulated and experimental data sets containing whole-genome SNP measures as genotypes and whole-brain neuroimaging measures as phenotypes, we demonstrate that PLSC-RP is drastically faster than traditional PLSC while providing statistically equivalent results. We also provide evidence that dimensionality reduction using RP is independent of data type. PLSC-RP therefore opens up a wide range of possible applications: it can be used for any integrative analysis that combines information from multiple sources.
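The general idea of combining a random projection with PLSC can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes PLSC is computed as the singular value decomposition of the cross-product matrix of two column-normalized data blocks, and it assumes a dense Gaussian projection matrix. The function names `center_and_scale`, `plsc`, and `plsc_rp`, the target dimension `k`, and the toy data sizes are all hypothetical.

```python
import numpy as np


def center_and_scale(a):
    """Column-center a data block and scale each column to unit norm."""
    a = a - a.mean(axis=0)
    norms = np.linalg.norm(a, axis=0)
    norms[norms == 0] = 1.0  # leave constant columns at zero instead of dividing by 0
    return a / norms


def plsc(x, y):
    """Plain PLSC: SVD of the cross-product matrix Y^T X.

    x has shape (n, p), y has shape (n, q). Returns the saliences for y,
    the singular values, and the saliences for x.
    """
    r = y.T @ x  # shape (q, p)
    u, s, vt = np.linalg.svd(r, full_matrices=False)
    return u, s, vt.T


def plsc_rp(x, y, k, rng):
    """PLSC after projecting the wide block x down to k dimensions.

    Assumption: a Gaussian random matrix scaled by 1/sqrt(k), so that
    inner products between rows of x are approximately preserved.
    """
    p = x.shape[1]
    proj = rng.standard_normal((p, k)) / np.sqrt(k)  # shape (p, k)
    x_low = x @ proj  # shape (n, k); stays column-centered
    u, s, v_low = plsc(x_low, y)
    return u, s, v_low, proj


# Toy data standing in for a wide imaging block (x) and a few SNPs (y).
rng = np.random.default_rng(0)
n, p, q, k = 50, 2000, 5, 200
latent = rng.standard_normal(n)
x_raw = rng.standard_normal((n, p))
y_raw = rng.standard_normal((n, q))
x_raw[:, :20] += latent[:, None]  # shared signal in the first 20 columns of x
y_raw[:, 0] += latent             # and in the first column of y

x = center_and_scale(x_raw)
y = center_and_scale(y_raw)

u_full, s_full, v_full = plsc(x, y)
u_rp, s_rp, v_low, proj = plsc_rp(x, y, k, rng)

print("leading singular value, full PLSC:", s_full[0])
print("leading singular value, PLSC-RP  :", s_rp[0])
```

In this sketch the SVD runs on a `q × k` matrix instead of a `q × p` matrix, which is where a speed-up would come from when `p` is very large. The saliences `v_low` live in the reduced space; mapping them back to the original variables (for example via `proj @ v_low`) is one possible choice and is likewise an assumption here.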
Highlights
The majority of human neurological and psychiatric disorders are substantially heritable (Plomin et al, 1994; Meyer-Lindenberg and Weinberger, 2006; Bigos and Weinberger, 2010; Ge et al, 2013).
First, we compared the results of traditional Partial Least Squares Correlation (PLSC) and PLSC-Random Projection (PLSC-RP) on simulated brain imaging data of increasing dimensionality and candidate single-nucleotide polymorphisms (SNPs).
To verify our findings on simulated data, we compared the results of traditional PLSC and PLSC-RP on experimental brain imaging and genetics data.
Summary
The majority of human neurological and psychiatric disorders are substantially heritable (Plomin et al, 1994; Meyer-Lindenberg and Weinberger, 2006; Bigos and Weinberger, 2010; Ge et al, 2013). Since these illnesses represent a real public health problem, it is vitally important to understand the underlying genetic mechanisms. Substantial progress has been achieved in recent years with the emergence of genome-wide association (GWA) studies (Haines et al, 2005). These studies focus on single-nucleotide polymorphisms (SNPs), the most common type of human genetic variation (Wang et al, 1998; Crawford and Nickerson, 2005). Measures derived from in vivo anatomical or functional neuroimaging have increasingly been introduced as intermediate phenotypes for genetic association analyses.