Abstract
Background: Genetical genomics is a very powerful tool to elucidate the basis of complex traits and disease susceptibility. Despite its relevance, however, statistical modeling of expression quantitative trait loci (eQTL) has not received the attention it deserves. Based on two reasonable assertions: (i) a good model should consider all available variables as potential effects, and (ii) gene expressions are highly interconnected, we suggest that an eQTL model should consider the rest of expression levels as potential regressors, in addition to the markers.
Results: It is shown that power can be increased with this strategy. We also show, using classical statistical and support vector machines techniques in a reanalysis of public data, that the external transcripts, i.e., transcripts other than the one being analysed, explain on average much more variability than the markers themselves. The presence of eQTL hotspots is reassessed in the light of these results.
Conclusion: Model choice is a critical yet neglected issue in genetical genomics studies. Although we are far from having a general strategy for model choice in this area, we can at least propose that any transcript level is scanned not only for the markers genotyped but also for the rest of gene expression levels. Some sort of stepwise regression strategy can be used to select the final model.
Highlights
Genetical genomics is a very powerful tool to elucidate the basis of complex traits and disease susceptibility
The results are a collection of successive quantitative trait loci analyses, where each gene expression level is analysed independently
Based on two rather reasonable assertions: (i) a good modelling strategy should consider all available variables as potential effects in the model, and (ii) gene expressions are highly interconnected, we suggest that an expression quantitative trait loci (eQTL) model for a given gene should consider the rest of expression levels as potential regressors as well as the markers to identify
Summary
Genetical genomics is a very powerful tool to elucidate the basis of complex traits and disease susceptibility. Genetical genomics experiments have been analysed considering each expression level one at a time and using fairly simple statistical models, correcting only for, e.g., sex. Based on two reasonable assertions: (i) a good model should consider all available variables as potential effects, and (ii) gene expressions are highly interconnected, we suggest that an eQTL model should consider the rest of expression levels as potential regressors, in addition to the markers.
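The proposal of scanning a transcript against both the markers and the remaining expression levels, followed by some form of stepwise selection, can be illustrated with a minimal sketch. It is not the analysis used in the paper: the data are simulated, the names `forward_select`, `marker_*` and `transcript_*` are hypothetical, and BIC is assumed here as the stopping rule purely for simplicity.

```python
import numpy as np

def rss(X, y):
    """Residual sum of squares of an ordinary least-squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def bic(X, y):
    """Bayesian information criterion for a Gaussian linear model (up to a constant)."""
    n, k = X.shape
    return n * np.log(rss(X, y) / n) + k * np.log(n)

def forward_select(candidates, y, names):
    """Greedy forward selection: repeatedly add the candidate column that
    lowers BIC the most, and stop when no candidate lowers it further."""
    design = np.ones((len(y), 1))  # start from an intercept-only model
    best = bic(design, y)
    chosen = []
    remaining = list(range(candidates.shape[1]))
    while remaining:
        score, j = min(
            (bic(np.column_stack([design, candidates[:, j]]), y), j)
            for j in remaining
        )
        if score >= best:
            break
        best = score
        design = np.column_stack([design, candidates[:, j]])
        chosen.append(names[j])
        remaining.remove(j)
    return chosen

# Simulated example (assumption): 5 markers coded 0/1/2 and 5 external transcripts.
rng = np.random.default_rng(0)
n = 200
markers = rng.integers(0, 3, size=(n, 5)).astype(float)
others = rng.normal(size=(n, 5))

# The analysed transcript depends on one marker and one external transcript.
y = 0.5 * markers[:, 0] + 1.0 * others[:, 1] + rng.normal(scale=0.5, size=n)

# Candidate regressors: the markers AND the rest of the expression levels.
candidates = np.column_stack([markers, others])
names = [f"marker_{i}" for i in range(5)] + [f"transcript_{i}" for i in range(5)]

selected = forward_select(candidates, y, names)
print(selected)
```

Restricting `candidates` to the marker columns alone reproduces the conventional one-transcript-at-a-time scan, which makes it easy to compare the two strategies on the same simulated data.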