Abstract

Genome-wide association studies (GWAS) have identified loci linked to hundreds of traits in many different species. Yet, because linkage disequilibrium implicates a broad region surrounding each identified locus, the causal genes often remain unknown. This problem is especially pronounced in nonhuman, nonmodel species, where functional annotations are sparse and there is frequently little information available for prioritizing candidate genes. We developed a computational approach, Camoco, that integrates loci identified by GWAS with functional information derived from gene coexpression networks. Using Camoco, we prioritized candidate genes from a large-scale GWAS examining the accumulation of 17 different elements in maize (Zea mays) seeds. Strikingly, the performance of our approach depended strongly on the type of coexpression network used: networks built from expression variation across genetically diverse individuals in a relevant tissue context (in our case, roots, which are the primary elemental uptake and delivery system) outperformed the alternative networks. Two candidate genes identified by our approach were validated using mutants. Our study demonstrates that coexpression networks provide a powerful basis for prioritizing candidate causal genes from GWAS loci, but it also suggests that the success of such strategies can depend strongly on the context of the gene expression data. Both the software and the lessons on integrating GWAS data with coexpression networks generalize to species beyond maize.

Highlights

  • Genome-wide association studies (GWAS) are a powerful tool for understanding the genetic basis of trait variation

  • The rationale for our approach is that genes that function together in a biological process and are identified by GWAS should show non-random structure in co-expression networks that capture the same biological function

  • Camoco takes, as input, a list of single-nucleotide polymorphisms (SNPs) associated with a trait of interest and a table of gene expression values, and produces, as output, a list of high-priority candidate genes that are near GWAS peaks and show evidence of strong co-expression with other genes associated with the trait of interest
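
The input/output contract in the last highlight can be illustrated with a minimal Python sketch. This is not Camoco's actual API or scoring method: the function names (`pearson`, `candidate_genes`, `rank_candidates`), the fixed base-pair window, the toy coordinates and expression values, and the mean-Pearson score are all assumptions chosen for illustration. The sketch maps each SNP to nearby genes, then ranks each candidate by its average co-expression with candidates found at other loci.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant profile carries no co-expression signal
    return cov / sqrt(vx * vy)


def candidate_genes(snps, genes, window):
    """Map each candidate gene to the first SNP lying within `window` bp of it.

    snps:  list of (chromosome, position)
    genes: dict of gene name -> (chromosome, start, end)
    """
    candidates = {}
    for chrom, pos in snps:
        for name, (g_chrom, start, end) in genes.items():
            if g_chrom == chrom and start - window <= pos <= end + window:
                candidates.setdefault(name, (chrom, pos))
    return candidates


def rank_candidates(snps, genes, expression, window=50_000):
    """Rank candidates by mean co-expression with candidates at other loci.

    Assumption of this sketch: a "locus" is a single SNP, and genes sharing
    a locus are not compared with each other.
    """
    cands = candidate_genes(snps, genes, window)
    scores = {}
    for gene, locus in cands.items():
        others = [g for g, other_locus in cands.items() if other_locus != locus]
        if not others:
            scores[gene] = 0.0
            continue
        total = sum(pearson(expression[gene], expression[g]) for g in others)
        scores[gene] = total / len(others)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy data (hypothetical): two trait-associated SNPs and four genes.
genes = {
    "geneA": ("1", 1000, 2000),
    "geneB": ("1", 90000, 91000),  # too far from any SNP to be a candidate
    "geneC": ("2", 5000, 6000),
    "geneD": ("2", 7000, 8000),
}
snps = [("1", 1500), ("2", 6500)]
expression = {
    "geneA": [1, 2, 3, 4],
    "geneB": [5, 5, 6, 5],
    "geneC": [2, 4, 6, 8],
    "geneD": [4, 1, 3, 2],
}

ranked = rank_candidates(snps, genes, expression, window=1000)
for gene, score in ranked:
    print(f"{gene}\t{score:.2f}")
```

In this toy example, geneC and geneD sit at the same locus, but only geneC tracks the expression of geneA at the other locus, so geneC is ranked first. A real analysis would also need to assess significance, for example by comparing scores against randomly drawn gene sets.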

Introduction

Genome-wide association studies (GWAS) are a powerful tool for understanding the genetic basis of trait variation. This approach has been successfully applied to hundreds of traits in different species, including important yield-relevant traits in crops. Several quantitative trait loci (QTLs) composed of non-coding sequences have been previously reported in maize (Clark et al., 2006; Castelletti et al., 2014; Louwers et al., 2009). Such factors mean that even when a marker is strongly associated with a trait, many candidate genes remain plausible until a causal polymorphism is identified.

