Exploratory Gene Ontology Analysis with Interactive Visualization

Junjie Zhu,Eugene Katsevich,Chiara Sabatti,Qian Zhao

doi:10.1038/s41598-019-42178-x

Open Access

Journal: Scientific Reports	Publication Date: May 24, 2019
Citations: 10	License type: open-access

Affiliation: Stanford University

Abstract

The Gene Ontology (GO) is a central resource for functional-genomics research. Scientists rely on the functional annotations in the GO for hypothesis generation and couple it with high-throughput biological data to enhance interpretation of results. At the same time, the sheer number of concepts (>30,000) and relationships (>70,000) presents a challenge: it can be difficult to draw a comprehensive picture of how certain concepts of interest might relate with the rest of the ontology structure. Here we present new visualization strategies to facilitate the exploration and use of the information in the GO. We rely on novel graphical display and software architecture that allow significant interaction. To illustrate the potential of our strategies, we provide examples from high-throughput genomic analyses, including chromatin immunoprecipitation experiments and genome-wide association studies. The scientist can also use our visualizations to identify gene sets that likely experience coordinated changes in their expression and use them to simulate biologically-grounded single cell RNA sequencing data, or conduct power studies for differential gene expression studies using our built-in pipeline. Our software and documentation are available at http://aegis.stanford.edu.

Highlights

Data visualizations can ease the difficulty of navigating the many concepts and relationships in the GO by illustrating the number of terms, rendering the relations between them, and displaying term annotations
The buoyant layout relies on a novel algorithm we developed, and improves the interpretation of the hierarchical levels in the Gene Ontology (GO): terms that are assigned to the same level share a similar number of annotated genes (Methods, Supplementary Algorithms, and Supplementary Note 1)
Version-controlled visualizations rendered via AEGIS can be downloaded as vector graphics, so that attributes such as colors, annotations, and node positions can be customized with common editing tools

Summary

Introduction

Data visualizations, by illustrating the number of terms, rendering the relations between them, and displaying term annotations, can alleviate some of the difficulties posed by the size of the GO. In contrast to small graph displays, large-graph visualization tools are typically not flexible enough to highlight node- or link-specific details, because they must render numerous visual elements. Nonetheless, they can provide a global view of the graph, such as node clusters or overall hierarchies, which is especially helpful for understanding trends in term relationships or enrichment scores in the GO. We link representations of local and global structures within the GO DAG by adopting the focus-and-context framework, reminiscent of classical principles in visual information system design: overview first, zoom and filter, details-on-demand[27]. AEGIS allows scientists to extract biological information relevant for simulations and hypothesis generation, as well as power calculations for study design.
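
The focus-and-context idea can be illustrated with a minimal sketch on a toy DAG. This is not the AEGIS implementation: the function names, the toy term labels, and in particular the choice to define the "focus" as a selected term together with all of its ancestors and descendants are assumptions made here for illustration only. Everything else in the graph is treated as "context".

```python
from collections import defaultdict, deque

# Hypothetical toy DAG given as (child, parent) pairs; labels are not real GO terms.
TOY_EDGES = [
    ("A", "root"),
    ("B", "root"),
    ("C", "A"),
    ("C", "B"),   # a term may have several parents in a DAG
    ("D", "B"),
]


def build_index(edges):
    """Return (parents, children) adjacency maps for a list of (child, parent) pairs."""
    parents = defaultdict(set)
    children = defaultdict(set)
    for child, parent in edges:
        parents[child].add(parent)
        children[parent].add(child)
    return parents, children


def reachable(start, neighbors):
    """Breadth-first search: all nodes reachable from start, excluding start itself."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbors.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def focus_and_context(edges, focus_term):
    """Split the DAG's nodes into a focus set and a context set.

    Assumption for this sketch: the focus set is the selected term plus
    its ancestors and descendants; the context set is every other node.
    """
    parents, children = build_index(edges)
    ancestors = reachable(focus_term, parents)
    descendants = reachable(focus_term, children)
    focus = ancestors | descendants | {focus_term}
    all_nodes = set(parents) | set(children)
    return {
        "ancestors": ancestors,
        "descendants": descendants,
        "focus": focus,
        "context": all_nodes - focus,
    }
```

For example, `focus_and_context(TOY_EDGES, "A")` places `A`, `root`, and `C` in the focus set and leaves `B` and `D` as context. A renderer could then draw the focus set in detail and the context set in a compressed, overview form.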

Methods

Results

Conclusion