Abstract

An enormous challenge in the post-genome era is to annotate and resolve the consequences of genetic variation on diverse phenotypes. The genome-wide association study (GWAS) is a well-known method for identifying potential genetic loci for complex traits among a vast number of genetic variants, following which it is crucial to identify expression quantitative trait loci (eQTL). However, conventional eQTL methods usually disregard the systematic role of single-nucleotide polymorphisms (SNPs) or genes, thereby overlooking many network-associated phenotypic determinants. This problem motivates us to recognize network-based quantitative trait loci (QTL), i.e., network QTL (nQTL), which detect the cascade association genotype → network → phenotype rather than the conventional genotype → expression → phenotype of eQTL. Specifically, we build the nQTL framework on the theory and methods of single-sample networks. The framework can identify not only network traits (e.g., the gene subnetwork associated with a genotype) for analyzing complex biological processes but also network signatures (e.g., the interactive gene biomarker candidates screened from network traits) for characterizing the targeted phenotype and its corresponding subtypes. Our results show that, compared with traditional eQTL methods, the nQTL framework can efficiently capture associations between SNPs and network traits (i.e., edge traits) in various simulated data scenarios. Furthermore, we have carried out nQTL analysis on diverse biological and biomedical datasets. Our analysis is effective in detecting network traits for various biological problems and can discover many network signatures for discriminating phenotypes, which can help interpret the influence of nQTL on disease subtyping, disease prognosis, drug response, and pathogen factor association.
Particularly, in contrast to the conventional approaches, the nQTL framework could also identify many network traits from human bulk expression data, validated by matched single-cell RNA-seq data in an independent or unsupervised manner. All these results strongly support that nQTL and its detection framework can simultaneously explore the global genotype–network–phenotype associations and the underlying network traits or network signatures with functional impact and importance.

Highlights

  • An enormous challenge in the post-genome era is to annotate and resolve the consequences of diverse genetic variations (Lynch and Hsiao, 2019; Strober et al, 2019; Young et al, 2019), within the context of human diseases (Gibbs et al, 2010; Liang et al, 2013; Zeggini et al, 2019)

  • As a crucial mechanism by which genetic variants affect gene expression (Kang et al, 2012; Peters et al, 2016; Strunz et al, 2018), expression quantitative trait loci (eQTL) are genomic loci that contribute to variation in gene expression levels. They reveal the connection between single-nucleotide polymorphisms (SNPs) and genes at the level of function rather than sequence (Li et al, 2016), supplying detailed functional explanations of genome-wide association study (GWAS) outcomes (Michaelson et al, 2009; Holloway et al, 2011; Peterson et al, 2016; Joehanes et al, 2017; Son et al, 2017; Guo et al, 2018)

  • As is well known in GWAS, a large deviation at a SNP site in the QQ-plot suggests that the deviation of the observed value at that site is caused by the genetic effect of the SNP mutation



Introduction

An enormous challenge in the post-genome era is to annotate and resolve the consequences of diverse genetic variations (Lynch and Hsiao, 2019; Strober et al, 2019; Young et al, 2019), within the context of human diseases (Gibbs et al, 2010; Liang et al, 2013; Zeggini et al, 2019). The efficiency of eQTL methods remains a focus of many methodological studies, such as how to illuminate the full structure of eQTL data (Huang et al, 2009); how to distinguish true causal polymorphisms or causal factors (Suthram et al, 2008; Lee et al, 2009; Chipman and Singh, 2011; Wang and Zhang, 2011); how to implement multiple-comparison adjustment (Chen et al, 2008) or confounding factor removal (Ju et al, 2017; Yuan et al, 2017); how to detect group-wise and individual associations between SNPs and expression traits (Cheng et al, 2015; 2016); and how to accelerate the computationally intensive part of the eQTL identification algorithm (Shabalin, 2012). Although many conventional eQTL methods use the network concept or model to interpret the biological or biomedical significance of their discoveries (Sun et al, 2007; Verbeke et al, 2013; Cheng et al, 2014; Ho et al, 2014; Zhang and Kim, 2014; De Maeyer et al, 2016), most of them derive associations between SNPs and gene groups rather than between SNPs and gene-pair/edge groups (networks); i.e., they usually disregard the systematic role of the discovered SNPs or genes, thereby overlooking many network-associated phenotypic determinants.
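The distinction between SNP-to-gene and SNP-to-edge associations can be illustrated with a small simulation. The sketch below is not the nQTL framework itself: the per-sample edge trait (the product of z-scored expression values of two genes, whose sample mean equals their Pearson correlation), the simulated effect size, and all function names are assumptions made for illustration only. It simulates a SNP that changes the co-expression of two genes without changing either gene's marginal expression, so an expression-level test would see nothing while an edge-level test would.

```python
import numpy as np

def zscore(x):
    """Standardize a vector to zero mean and unit (population) variance."""
    return (x - x.mean()) / x.std()

def edge_trait(expr_a, expr_b):
    """Hypothetical per-sample edge trait: product of z-scored expressions.

    Its mean over samples equals the Pearson correlation of the two genes.
    This is an illustrative choice, not the single-sample network statistic
    used by the nQTL authors.
    """
    return zscore(expr_a) * zscore(expr_b)

def association(genotype, trait):
    """Pearson correlation between allele dosage (0/1/2) and a trait."""
    return float(np.corrcoef(genotype, trait)[0, 1])

def simulate(n_samples=2000, seed=0):
    """Simulate a SNP that rewires co-expression but not expression levels."""
    rng = np.random.default_rng(seed)
    genotype = rng.integers(0, 3, size=n_samples)   # dosage 0, 1, or 2
    rho = 0.8 * (genotype - 1)                      # assumed: -0.8, 0, +0.8
    gene_a = rng.standard_normal(n_samples)
    noise = rng.standard_normal(n_samples)
    # gene_b has unit variance for every genotype; only its
    # correlation with gene_a depends on the genotype.
    gene_b = rho * gene_a + np.sqrt(1 - rho**2) * noise
    return genotype, gene_a, gene_b

genotype, gene_a, gene_b = simulate()
r_expr_a = association(genotype, gene_a)   # eQTL-style test on gene A
r_expr_b = association(genotype, gene_b)   # eQTL-style test on gene B
r_edge = association(genotype, edge_trait(gene_a, gene_b))  # edge-level test

print(f"SNP vs gene A expression: r = {r_expr_a:+.3f}")
print(f"SNP vs gene B expression: r = {r_expr_b:+.3f}")
print(f"SNP vs A-B edge trait:    r = {r_edge:+.3f}")
```

In this toy setting the two expression-level correlations stay near zero while the edge-level correlation is clearly non-zero, which is the kind of network-associated signal that a gene-level analysis would overlook.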

