Prior knowledge guided eQTL mapping for identifying candidate genes.

Yunli Wang, Youlian Pan, Rene Richard

doi:10.1186/s12859-016-1387-9

Abstract

Background

Expression quantitative trait loci (eQTL) mapping is often used to identify genetic loci and candidate genes correlated with traits. Although complex traits are usually affected by a group of genes, most eQTL mapping methods treat genes as independent. Recently, some eQTL mapping methods have accounted for correlated genes and used biological prior knowledge, and these have been applied in model species such as yeast or mouse. However, biological prior knowledge might be very limited for most species.

Results

We proposed a data-driven, prior knowledge guided eQTL mapping method for identifying candidate genes. First, quantitative trait loci (QTL) analysis was used to identify single nucleotide polymorphism (SNP) markers that are associated with traits. Then co-expressed gene modules were generated, and the gene modules significantly associated with traits were selected. Prior knowledge from QTL mapping was used for eQTL mapping on the selected modules. We tested and compared prior knowledge guided eQTL mapping to eQTL mapping with no prior knowledge in a simulation study and two barley stem rust resistance case studies.

The results of the simulation study and the real barley case studies show that models using prior knowledge outperform models without prior knowledge. In the first case study, three gene modules were selected, and one of them was enriched with defense response Gene Ontology (GO) terms. In addition, one probe in that gene module maps to Rpg1, previously identified as a resistance gene to stem rust. In the second case study, four gene modules were identified, one of which is significantly enriched with defense response to fungus and bacterium.

Conclusions

Prior knowledge guided eQTL mapping is an effective method for identifying candidate genes. The case studies in stem rust show that this approach is robust and outperforms methods with no prior knowledge in identifying candidate genes.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-016-1387-9) contains supplementary material, which is available to authorized users.

Highlights

Expression quantitative trait loci mapping is often used to identify genetic loci and candidate genes correlated with traits
We compared the performance of models with and without prior knowledge using the root-mean-squared error (RMSE), the area under the precision-recall curve (AUC), and the degrees of freedom (DF)
In expression quantitative trait loci (eQTL) mapping, usually only a small number of genetic markers are associated with genes, so a lower DF means fewer genetic markers in the model
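The three evaluation measures named in the highlights can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' code: the function names are hypothetical, DF is assumed to be the number of markers with a nonzero coefficient (consistent with the statement that a lower DF means fewer markers in the model), and the area under the precision-recall curve is approximated by average precision, which is one common estimator.

```python
import math

def rmse(observed, predicted):
    """Root-mean-squared error between observed and predicted expression values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def degrees_of_freedom(coefs, tol=1e-8):
    """Assumed DF definition: number of markers whose coefficient is nonzero."""
    return sum(1 for c in coefs if abs(c) > tol)

def average_precision(is_true_marker, scores):
    """Average precision, used here as an estimate of the area under the
    precision-recall curve. Markers are ranked by decreasing score."""
    ranked = sorted(zip(scores, is_true_marker), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, is_true) in enumerate(ranked, start=1):
        if is_true:
            hits += 1
            total += hits / rank  # precision at each recovered true marker
    return total / hits if hits else 0.0

# Toy example (made-up numbers): five markers, two of them truly associated.
estimated_coefs = [0.0, 1.2, 0.0, -0.4, 0.0]
truly_associated = [False, True, False, True, False]
marker_scores = [abs(c) for c in estimated_coefs]

print(degrees_of_freedom(estimated_coefs))
print(average_precision(truly_associated, marker_scores))
print(rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
```

In a simulation study the truly associated markers are known, so precision and recall of marker selection can be computed directly; in real data only RMSE and DF are available without additional ground truth.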

Summary

Introduction

Expression quantitative trait loci (eQTL) mapping is often used to identify genetic loci and candidate genes correlated with traits. Some eQTL mapping methods have accounted for correlated genes and used biological prior knowledge, and these have been applied in model species such as yeast or mouse. The first step in discovering candidate genes is to identify chromosome regions associated with a particular quantitative trait through quantitative trait loci (QTL) mapping. eQTL mapping has then been applied to identify regulatory regions for genes from transcriptome and genotype data. Traditional linkage mapping methods such as Haley-Knott regression (HK) and composite interval mapping (CIM) have been widely used for QTL mapping and, more recently, for eQTL mapping [1]. Both HK and CIM assume that traits (in QTL mapping) or genes (in eQTL mapping) are not related. Lasso has been shown to outperform CIM and HK for eQTL mapping [2].
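To make the Lasso approach to eQTL mapping concrete, the following is a minimal sketch that regresses the expression of one gene on marker genotypes using coordinate descent. It is not the implementation used in [2] or in this paper. The function names, the -1/+1 genotype coding, and the toy data are assumptions, and the optional `penalty_weights` argument is only one hypothetical way to favor markers supported by a prior QTL analysis; the text above does not specify how prior knowledge enters the model.

```python
from itertools import product

def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0.0 inside [-threshold, threshold]."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def lasso_eqtl(genotypes, expression, lam, penalty_weights=None, n_iter=200):
    """Minimize (1/2n)*||y - X b||^2 + lam * sum_j w_j * |b_j| by coordinate descent.

    genotypes: list of rows (one per line/individual), one column per marker.
    expression: centered expression values of a single gene.
    penalty_weights: hypothetical per-marker weights w_j (default 1.0 each);
        a smaller weight penalizes that marker less.
    """
    n, p = len(genotypes), len(genotypes[0])
    weights = penalty_weights or [1.0] * p
    beta = [0.0] * p
    residual = list(expression)  # equals y - X b while b is all zeros
    col_norm = [sum(row[j] ** 2 for row in genotypes) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_norm[j] == 0.0:
                continue  # monomorphic marker carries no information
            # Correlation of marker j with the partial residual (marker j's own fit added back)
            rho = sum(genotypes[i][j] * residual[i] for i in range(n)) / n
            rho += col_norm[j] * beta[j]
            new_beta = soft_threshold(rho, lam * weights[j]) / col_norm[j]
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= genotypes[i][j] * delta
                beta[j] = new_beta
    return beta

# Toy data: 8 lines, 3 markers coded -1/+1; only marker 0 affects expression.
toy_genotypes = [list(row) for row in product([-1, 1], repeat=3)]
toy_expression = [2.0 * row[0] for row in toy_genotypes]

plain = lasso_eqtl(toy_genotypes, toy_expression, lam=0.1)
with_prior = lasso_eqtl(toy_genotypes, toy_expression, lam=0.1,
                        penalty_weights=[0.5, 1.0, 1.0])
print(plain, with_prior)
```

Markers that end up with a nonzero coefficient are the candidate eQTL for that gene. Unlike HK or CIM, which scan one position at a time, the Lasso fits all markers jointly and shrinks most coefficients to exactly zero, which is why the number of nonzero coefficients is a natural measure of model size.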

Objectives

Methods

Results

Discussion

Conclusion