Genetic dissection of complex traits using hierarchical biological knowledge

Hidenori Tanaka, Jason F Kreisberg, Trey Ideker

doi:10.1371/journal.pcbi.1009373

Open Access

https://doi.org/10.1371/journal.pcbi.1009373

Abstract

Despite the growing constellation of genetic loci linked to common traits, these loci have yet to account for most heritable variation, and most act through poorly understood mechanisms. Recent machine learning (ML) systems have used hierarchical biological knowledge to associate genetic mutations with phenotypic outcomes, yielding substantial predictive power and mechanistic insight. Here, we use an ontology-guided ML system to map single nucleotide variants (SNVs) focusing on 6 classic phenotypic traits in natural yeast populations. The 29 identified loci are largely novel and account for ~17% of the phenotypic variance, versus <3% for standard genetic analysis. Representative results show that sensitivity to hydroxyurea is linked to SNVs in two alternative purine biosynthesis pathways, and that sensitivity to copper arises through failure to detoxify reactive oxygen species in fatty acid metabolism. This work demonstrates a knowledge-based approach to amplifying and interpreting signals in population genetic studies.

Highlights

In recent decades, genome-wide association studies (GWAS) in humans have identified almost 19,000 associations between genetic loci and phenotypic traits [1].
We find that sensitivity to hydroxyurea is tied to genetic variants in two alternative purine biosynthesis pathways, and that sensitivity to copper arises through failure to detoxify reactive oxygen species in fatty acid metabolism.
We explored mapping of causal variants from GWAS using genotype-phenotype data previously gathered in approximately 1000 natural S. cerevisiae isolates [29].

Summary

Introduction

Genome-wide association studies (GWAS) in humans have identified almost 19,000 associations between genetic loci and phenotypic traits [1]. Yet these loci have so far failed to account for most heritable variation. Of the various explanations put forward for this shortfall, a frequently discussed possibility is that complex disease genetics are driven by large numbers of alleles, each with a small effect size, making them hard to detect through genome-wide association [3]. To address this challenge, more complex models such as polygenic risk scores (PRS) have been developed, which sum effects across many variants to predict phenotype [4,5,6]. As many of the variants identified by GWAS are located in noncoding regions, follow-up experiments typically entail reporter assays, validation of transcription factor binding sites, animal models and genome engineering [11,12,13]. Even these techniques do not begin to address functional effects of the variant beyond the gene, such as impacts on the states of proteins, protein complexes, metabolic processes and signaling pathways, and on the composition of cell types. The process of translating an associated locus to a causal single nucleotide variant (SNV), then to a causal gene and its underlying biological mechanism, is still far from routine.
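To make the idea of a polygenic risk score concrete, the following is a minimal sketch of the standard additive form: a weighted sum of allele counts across variants. It is not the method used in this paper, and the function name `polygenic_score`, the allele counts and the effect sizes are all hypothetical values chosen for illustration.

```python
def polygenic_score(allele_counts, effect_sizes):
    """Additive polygenic score for one individual.

    allele_counts: number of effect alleles carried at each variant (0, 1 or 2).
    effect_sizes:  per-variant weights, e.g. estimated by a GWAS (hypothetical here).
    """
    if len(allele_counts) != len(effect_sizes):
        raise ValueError("one effect size is required per variant")
    # Sum each variant's contribution: allele count times effect size.
    return sum(count * beta for count, beta in zip(allele_counts, effect_sizes))


# Toy example with three variants (made-up numbers, not data from the study).
example_counts = [0, 1, 2]
example_effects = [0.10, -0.05, 0.20]
example_score = polygenic_score(example_counts, example_effects)
print(round(example_score, 2))
```

Because every variant contributes independently, a score of this kind can aggregate many small effects, but it says nothing about which genes or pathways carry the signal.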

Journal: PLOS Computational Biology	Publication Date: Sep 17, 2021
Citations: 1	License type: CC BY 4.0
