Abstract

Imaging genetics research based on Sparse Canonical Correlation Analysis (SCCA) helps uncover correlations between the pathological features reflected in neuroimaging and genotypic variation. The Multi-Task SCCA (MTSCCA) method identifies bi-multivariate associations between single nucleotide polymorphisms (SNPs) and multi-modal imaging quantitative traits (QTs). However, MTSCCA is unsupervised and cannot identify diagnosis-guided genotype-phenotype associations. To improve the performance and interpretability of MTSCCA, we propose an improved MTSCCA algorithm: a supervised sparse bivariate learning model fused with a linear regression model, in which the regression part guides the selection of imaging QTs. To jointly model the genotype-phenotype relationships across multiple tasks, the algorithm takes both gene expression data and SNPs as genetic data. Each task aims to determine the genotype-phenotype pattern guided by a diagnostic group, in order to discover associations between SNPs/genes and changes in brain regions. In addition, the Laplacian matrices of the three kinds of data are added to the penalty term as prior knowledge, so that the algorithm can account for correlations between different features. Compared with other SCCA methods, our algorithm shows improved noise resistance and stability, and it identified SNP/gene-ROI associations specific to the two diagnostic groups, MCI and AD. Significance: This method provides a way to further study associations in multi-modal biological data and to identify the complex association patterns of diseases.

Highlights

  • In recent years, imaging genetics has become an important research topic because of its ability to explore the effects of genes on the structure and function of the brain and reveal the pathogenesis of some brain diseases

  • Sparse Canonical Correlation Analysis (SCCA) is a robust and scalable multivariate association analysis algorithm, which has been widely used in the field of imaging genetics [2,3,4,5,6,7,8,9]

  • To address the limitations of existing methods, we propose an improved multi-task SCCA method


Summary

Introduction

In recent years, imaging genetics has become an important research topic because it can explore the effects of genes on the structure and function of the brain and reveal the pathogenesis of some brain diseases. Canonical Correlation Analysis (CCA) [1] is a standard multivariate method for integrating two or more data types. It finds linear combinations of the variables in each data set that are maximally correlated with each other, yielding the interrelated components of the two sets of data. Traditional CCA, however, can overfit severely. In response to this problem, the sparsity-constrained CCA (SCCA) method [23] was introduced, which can identify bivariate associations between multiple SNPs and multiple imaging QTs. Many scholars have since made improvements based on SCCA.
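To make the idea of sparsity-constrained CCA concrete, the following is a minimal sketch of a generic single-component SCCA using alternating soft-thresholding. It is not the improved MTSCCA proposed in this paper. The function names `soft_threshold` and `sparse_cca` and the penalty parameters `lam_u` and `lam_v` are illustrative assumptions, and the sketch assumes standardized columns and treats the within-set covariances as identity, a common simplification.

```python
import numpy as np


def soft_threshold(a, lam):
    """Shrink each entry toward zero by lam (the proximal step for an L1 penalty)."""
    return np.sign(a) * np.maximum(np.abs(a) - lam, 0.0)


def sparse_cca(X, Y, lam_u=0.1, lam_v=0.1, n_iter=100):
    """Estimate one pair of sparse canonical weight vectors (u, v).

    Assumption: columns of X (e.g. SNPs) and Y (e.g. imaging QTs) are
    standardized, and within-set covariances are treated as identity.
    """
    C = X.T @ Y / X.shape[0]  # cross-covariance matrix, shape (p, q)
    # Start from the leading right singular vector of C.
    v = np.linalg.svd(C, full_matrices=False)[2][0]
    u = np.zeros(C.shape[0])
    for _ in range(n_iter):
        u = soft_threshold(C @ v, lam_u)
        if not u.any():  # penalty too strong: every weight was zeroed
            break
        u /= np.linalg.norm(u)
        v = soft_threshold(C.T @ u, lam_v)
        if not v.any():
            break
        v /= np.linalg.norm(v)
    return u, v
```

Larger values of `lam_u` and `lam_v` zero out more weights, so only the features that contribute most to the cross-covariance keep nonzero entries. This is what counters the overfitting of ordinary CCA when features far outnumber samples.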
