Abstract

Background: In the last decade, genome-wide association studies (GWASs) have contributed to decoding the human genome by uncovering many genetic variations associated with various diseases. Many follow-up investigations involve joint analysis of multiple independently generated GWAS data sets. While most computational approaches developed for joint analysis are based on summary statistics, joint analysis based on individual-level data that also accounts for confounding factors remains a challenge.

Results: In this study, we propose a method, called the Coupled Mixed Model (CMM), that enables joint GWAS analysis of two independently collected GWAS data sets with different phenotypes. CMM does not require the data sets to have the same phenotypes, because it infers the unknown phenotypes using a set of multivariate sparse mixed models. Moreover, CMM addresses confounding due to population stratification, family structure, and cryptic relatedness, as well as confounding that arises during data collection, such as the batch effects that frequently appear in joint genetic studies. We evaluate the performance of CMM using simulation experiments. In a real data analysis, we illustrate the utility of CMM by applying it to identify common genetic associations for Alzheimer’s disease and substance use disorder, using data sets collected independently for the two complex human disorders. Comparison of the results with those from previous experiments and analyses supports the utility of our method and provides new insights into the diseases. The software is available at https://github.com/HaohanWang/CMM.

Highlights

  • MTAG (multi-trait analysis of GWAS) [21] is a method for joint analysis of GWAS data sets based on summary statistics; it accounts for potential confounders due to population stratification or cryptic relatedness

  • Our results show that rs224534, which the Coupled Mixed Model (CMM) identifies as associated with both Alzheimer’s disease (AD) and substance use disorder (SUD), resides in TRPV1, which encodes transient receptor potential cation channel subfamily V member 1

Introduction

Genome-wide association studies (GWASs) have helped reveal about 10,000 associations between genetic variants in the human genome and diseases [1]. With the success of GWASs that analyze single data sets, a natural follow-up is to investigate multiple data sets together [2], which we refer to as joint analysis. A joint genetic analysis using two independently collected data sets can be very challenging. Consider two data sets, 1 and 2, originally collected for independent studies of the "red" and "blue" phenotypes, respectively: a joint analysis aims to discover common genetic variants associated with both of these phenotypes.
