Abstract

Recent advances in imaging genetics produce large amounts of data, including functional MRI (fMRI) images, single nucleotide polymorphisms (SNPs), and cognitive assessments. Understanding the complex interactions among these heterogeneous and complementary data has the potential to help with the diagnosis and prevention of mental disorders. However, only limited efforts have been made because of the high dimensionality, group structure, and mixed types of these data. In this paper we present a novel method to detect conditional associations among imaging genetics data. We use projected distance correlation to build a conditional dependency graph among high-dimensional mixed data, then use multiple testing to detect significant group-level associations (e.g., ROI-gene). In addition, we introduce a scalable algorithm based on the orthogonal greedy algorithm, yielding the greedy projected distance correlation (G-PDC). G-PDC reduces the computational cost, which is critical for analyzing large volumes of imaging genomics data. The results from our simulations demonstrate a higher degree of accuracy with G-PDC than with distance correlation, Pearson's correlation, and partial correlation, especially when the correlation is nonlinear. Finally, we apply our method to the Philadelphia Neurodevelopmental data cohort with 866 samples including fMRI images and SNP profiles. The results uncover several statistically significant and biologically interesting interactions, which are further supported by many existing studies. The Matlab code is available at https://sites.google.com/site/jianfang86/gPDC.
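
To illustrate why a distance-based measure can help when the dependence is nonlinear, the following is a minimal sketch of the standard sample distance correlation compared with Pearson's correlation on a toy quadratic relationship. It is not the projected or greedy variant (G-PDC) described above and is not taken from the authors' Matlab code; the function names `_centered_distances` and `distance_correlation`, the toy data, and the noise level are all assumptions made for illustration.

```python
import numpy as np


def _centered_distances(x):
    """Pairwise Euclidean distance matrix of x, double-centered."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]  # treat a 1-D input as n samples of one feature
    diff = x[:, None, :] - x[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=-1))
    # Subtract row and column means, add back the grand mean.
    return (d - d.mean(axis=0, keepdims=True)
              - d.mean(axis=1, keepdims=True) + d.mean())


def distance_correlation(x, y):
    """Sample distance correlation between x and y (same number of rows)."""
    a = _centered_distances(x)
    b = _centered_distances(y)
    dcov2 = (a * b).mean()                      # squared distance covariance
    dvar2 = np.sqrt((a * a).mean() * (b * b).mean())
    if dvar2 == 0.0:
        return 0.0                              # a constant input carries no dependence
    return float(np.sqrt(max(dcov2, 0.0) / dvar2))


# Toy data (hypothetical): y depends on x only through x**2.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=500)
y = x ** 2 + 0.05 * rng.normal(size=500)

pearson = float(np.corrcoef(x, y)[0, 1])
dcor = distance_correlation(x, y)
print(f"Pearson: {pearson:.3f}, distance correlation: {dcor:.3f}")
```

In this toy setting Pearson's correlation stays close to zero because the relationship is symmetric around zero, while the distance correlation is clearly positive. A conditional version, as in the method summarized above, would additionally remove the influence of the remaining variables before measuring dependence; that step is not shown here.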
