Abstract

Genome-wide, imputed, sequence, and structural data are now available for exceedingly large sample sizes. The needs for data management, handling population structure and related samples, and performing associations have largely been met. However, the infrastructure to support analyses involving complexity beyond genome-wide association studies is not standardized or centralized. We provide the PLatform for the Analysis, Translation, and Organization of large-scale data (PLATO), a software tool equipped to handle multi-omic data for hundreds of thousands of samples to explore complexity using genetic interactions, environment-wide association studies and gene–environment interactions, phenome-wide association studies, as well as copy number and rare variant analyses. Using the data from the Marshfield Personalized Medicine Research Project, a site in the electronic Medical Records and Genomics Network, we apply each feature of PLATO to type 2 diabetes and demonstrate how PLATO can be used to uncover the complex etiology of common traits.

Highlights

  • Genome-wide, imputed, sequence, and structural data are available for exceedingly large sample sizes

  • We offer the PLatform for the Analysis, Translation, and Organization of large-scale data (PLATO) software as a multifaceted, unified tool for investigating complexity, including genetic interactions, environment-wide association studies (EWAS)[7] and gene–environment interactions, phenome-wide association studies (PheWAS)[8], and copy number and rare variant analyses (Fig. 1)

  • Marshfield Personalized Medicine Research Project (PMRP) is unique in its collection of multiple data types for thousands of samples: genotype, sequence, exposure, and copy number variant (CNV) data, as well as multiple phenotypes derived from electronic health record (EHR) data



Introduction

Genome-wide, imputed, sequence, and structural data are available for exceedingly large sample sizes. We offer the PLatform for the Analysis, Translation, and Organization of large-scale data (PLATO), a software tool equipped to handle multi-omic data for hundreds of thousands of samples, as a multifaceted, unified tool for investigating complexity, including genetic interactions, environment-wide association studies (EWAS)[7] and gene–environment interactions, phenome-wide association studies (PheWAS)[8], and copy number and rare variant analyses (Fig. 1).

  • Association analysis: genome-wide association study; environment-wide association study; phenome-wide association study; differential CNV burden analysis; differential gene expression; gene set enrichment analysis; gene×gene interaction; gene×environment interaction; differential rare variation analysis

  • Types of statistical tests: logistic regression; linear regression; Firth regression; likelihood ratio test; auto-detect regression type; mixed linear model association; family-based association; estimation of variance explained; polygenic modeling; meta-analysis

  • Genetic encodings supported: additive; dominant; recessive; codominant; overdominant

  • Multiple test correction: Bonferroni; FDR; permutation

  • QC filtering: marker call rate; sample call rate; MAF; LD pruning; IBD; PCA
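The genetic encodings named above can be illustrated with a short sketch. It uses the conventional textbook definitions for a biallelic marker coded as the number of alternate alleles (0, 1, or 2). The function name `encode_genotype`, the model strings, and the use of `None` for a missing call are assumptions made for illustration. This section does not describe how PLATO itself implements or names these encodings.

```python
from typing import Optional, Tuple, Union

# An encoded genotype is one integer, or a pair of indicators for codominant.
Encoded = Union[int, Tuple[int, int]]


def encode_genotype(alt_count: Optional[int], model: str) -> Optional[Encoded]:
    """Recode an alternate-allele count (0, 1, 2) under a genetic model.

    Hypothetical helper for illustration only; not PLATO's API.
    A missing genotype (None) is passed through unchanged.
    """
    if alt_count is None:
        return None
    if alt_count not in (0, 1, 2):
        raise ValueError(f"expected 0, 1, or 2 alternate alleles, got {alt_count}")

    if model == "additive":
        # Effect scales with the number of alternate alleles.
        return alt_count
    if model == "dominant":
        # One copy of the alternate allele is enough.
        return 1 if alt_count >= 1 else 0
    if model == "recessive":
        # Two copies of the alternate allele are required.
        return 1 if alt_count == 2 else 0
    if model == "overdominant":
        # Heterozygotes are contrasted with both homozygous classes.
        return 1 if alt_count == 1 else 0
    if model == "codominant":
        # Two indicators: (heterozygous, homozygous alternate).
        return (1 if alt_count == 1 else 0, 1 if alt_count == 2 else 0)
    raise ValueError(f"unknown model: {model}")


if __name__ == "__main__":
    models = ("additive", "dominant", "recessive", "overdominant", "codominant")
    for model in models:
        print(model, [encode_genotype(g, model) for g in (0, 1, 2)])
```

The codominant model returns two indicator variables because it fits separate effects for heterozygotes and alternate homozygotes, so it costs one more degree of freedom than the single-column encodings.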

