Abstract

Motivation: The analysis of expression quantitative trait locus (eQTL) data is a challenging scientific endeavor, involving the processing of very large, heterogeneous and complex data. Typical eQTL analyses involve three types of data: sequence-based data reflecting the genotypic variations, gene expression data and meta-data describing the phenotype. Based on these, certain genotypes can be connected with specific phenotypic outcomes to infer causal associations of genetic variation, expression and disease. To this end, statistical methods are used to find significant associations between single nucleotide polymorphisms (SNPs) or pairs of SNPs and gene expression. A major challenge lies in summarizing the large amount of data and statistical results and in generating informative, interactive visualizations.

Results: We present Reveal, our visual analytics approach to this challenge. We introduce a graph-based visualization of associations between SNPs and gene expression and a detailed genotype view relating summarized patient cohort genotypes with data from individual patients and statistical analyses.

Availability: Reveal is included in Mayday, our framework for visual exploration and analysis. It is available at http://it.inf.uni-tuebingen.de/software/reveal/.

Contact: guenter.jaeger@uni-tuebingen.de

Highlights

  • The risk of developing a complex disease such as cancer or diabetes can be influenced by genetic variations

  • Reveal is based on three views, one focusing on the network of associations defined by the influences of single nucleotide polymorphisms (SNPs) on gene expression, the second providing detailed information on patient cohort genotypes grouped by meta-data and the third showing a traditional heatmap of the gene expression values

  • Edges are added between nodes as follows: based on the P-values computed for the association between SNP pairs and gene expression, create a triple <g_i, g_j, g_k> of genes for each SNP pair with partners in g_i and g_j that is significantly associated with the gene expression of g_k
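
The triple construction described in the last highlight can be sketched in a few lines of Python. This is a minimal illustration, not Reveal's implementation: the function names, the input layout (a mapping from SNP pair and target gene to a P-value, plus a SNP-to-gene lookup), the gene and SNP identifiers and the significance threshold `alpha` are all assumptions made for the example. The excerpt also does not state how edges are derived from a triple, so the second function assumes one directed edge from each partner gene to the target gene.

```python
from collections import Counter
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (g_i, g_j, g_k)


def build_triples(
    pair_pvalues: Dict[Tuple[str, str, str], float],
    snp_to_gene: Dict[str, str],
    alpha: float = 0.05,  # hypothetical significance threshold
) -> List[Triple]:
    """Create one triple (g_i, g_j, g_k) per significant SNP pair.

    pair_pvalues maps (snp_a, snp_b, target_gene) to the P-value of the
    association between the SNP pair and the target gene's expression.
    """
    triples = []
    for (snp_a, snp_b, target), p_value in pair_pvalues.items():
        if p_value < alpha:
            triples.append((snp_to_gene[snp_a], snp_to_gene[snp_b], target))
    return triples


def triples_to_edges(triples: List[Triple]) -> Counter:
    """Assumed edge rule: connect each partner gene to the target gene.

    The counter value is the number of triples supporting an edge.
    """
    edges: Counter = Counter()
    for g_i, g_j, g_k in triples:
        for source in {g_i, g_j}:  # a set, so g_i == g_j counts once
            edges[(source, g_k)] += 1
    return edges


# Made-up toy input: three tested SNP pairs, two of them significant.
example_pvalues = {
    ("rs1", "rs2", "G3"): 0.001,
    ("rs1", "rs3", "G3"): 0.2,
    ("rs2", "rs4", "G1"): 0.01,
}
example_snp_to_gene = {"rs1": "G1", "rs2": "G2", "rs3": "G2", "rs4": "G4"}

example_triples = build_triples(example_pvalues, example_snp_to_gene)
example_edges = triples_to_edges(example_triples)
```

Keeping a support count per edge is one possible design choice: it would let a graph view scale edge width by the number of significant SNP pairs behind it, although whether Reveal does so is not stated here.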

Summary

INTRODUCTION

The risk of developing a complex disease such as cancer or diabetes can be influenced by genetic variations. Expression quantitative trait loci (eQTL) studies go one step further, involving three types of data: sequence-based data reflecting the genotypic variations, gene expression data and meta-data describing the phenotype, e.g. the severity of disease or speed of progression. This analysis is highly challenging, since it involves the scalable processing of very large, heterogeneous and complex data. The goal of eQTL data analysis is to generate a comprehensive picture that connects certain genotypes (individual genetic information) with specific phenotypic outcomes. Based on this aggregated view of the data, analysts (e.g. biologists and bioinformaticians) can infer causal associations of genetic variation, expression and disease, i.e. identify the genetic basis for phenotypic variations. We apply Reveal to the BioVis 2011 Contest dataset and discuss the results generated.

RELATED WORK
Association gene network
Genotype view
APPLICATION TO EQTL DATA
Findings
DISCUSSION
CONCLUSION
