Abstract

Most tools developed to visualize hierarchically clustered heatmaps generate static images. Clustergrammer is a web-based visualization tool with interactive features such as zooming, panning, filtering, reordering, sharing, performing enrichment analysis, and providing dynamic gene annotations. Clustergrammer can be used to generate shareable interactive visualizations by uploading a data table to a website, or by embedding Clustergrammer in Jupyter Notebooks. The Clustergrammer core libraries can also be used as a toolkit by developers to generate visualizations within their own applications. Clustergrammer is demonstrated using gene expression data from the Cancer Cell Line Encyclopedia (CCLE), original post-translational modification data collected from lung cancer cell lines by a mass spectrometry approach, and original cytometry by time of flight (CyTOF) single-cell proteomics data from blood. Clustergrammer enables the production of interactive web-based visualizations for the analysis of diverse biological data.

Highlights

  • The diversity of high-content experimental methods in biomedical research is rapidly growing

  • Data visualization is a central tool for the initial analysis of biological data, and dimensionality reduction techniques, such as principal component analysis (PCA)[1] and t-distributed stochastic neighbor embedding (t-SNE)[2], are commonly employed to project high-dimensional data onto two or three dimensions so that it can be visualized

  • Jupyter Notebooks with embedded interactive heatmaps can be shared on the web using GitHub and the notebook rendering service NBviewer (Fig. 1b); Clustergrammer visualizations embedded within Jupyter Notebooks are portable and can be integrated into existing workflows



Introduction

The diversity of high-content experimental methods in biomedical research is rapidly growing. Data visualization is a central tool for the initial analysis of biological data, and dimensionality reduction techniques, such as principal component analysis (PCA)[1] and t-distributed stochastic neighbor embedding (t-SNE)[2], are commonly employed to project high-dimensional data onto two or three dimensions so that it can be visualized. A clustergram, or heatmap, on the other hand, is one of several techniques that directly visualize data without the need for dimensionality reduction[3]. Clustered heatmaps can also be used to visualize biological networks by displaying network connections in a symmetric adjacency matrix[4]. In such a display, the nodes of the network are the rows and columns, and the network links are represented as the cells within the matrix.
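To make the adjacency-matrix representation concrete, the following is a minimal sketch in plain Python. It is not part of Clustergrammer: the function name `adjacency_matrix` and the four-node toy network are hypothetical and chosen only for illustration. It assumes an undirected, unweighted network, so a link between two nodes fills two mirrored cells and the matrix is symmetric.

```python
def adjacency_matrix(nodes, edges):
    """Return a symmetric 0/1 adjacency matrix as a list of rows.

    nodes: ordered list of node names; the same order is used for
           both the rows and the columns of the matrix.
    edges: iterable of (node_a, node_b) pairs for an undirected network.
    """
    index = {name: i for i, name in enumerate(nodes)}
    n = len(nodes)
    matrix = [[0] * n for _ in range(n)]
    for a, b in edges:
        i, j = index[a], index[b]
        # An undirected link fills two mirrored cells,
        # which is what makes the matrix symmetric.
        matrix[i][j] = 1
        matrix[j][i] = 1
    return matrix


# Hypothetical toy network: a simple chain A - B - C - D.
nodes = ["A", "B", "C", "D"]
edges = [("A", "B"), ("B", "C"), ("C", "D")]
matrix = adjacency_matrix(nodes, edges)

# Print the matrix with row labels; columns follow the same node order.
for name, row in zip(nodes, matrix):
    print(name, row)
```

Each row and each column corresponds to one node, and a cell holds 1 where a link exists. A clustered heatmap of such a matrix would reorder the rows and columns together so that densely connected nodes appear as blocks; that reordering step is not shown here.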
