Abstract
Background
Cluster heatmaps are commonly used in biology and related fields to reveal hierarchical clusters in data matrices. This visualization technique has high data density and reveals clusters better than unordered heatmaps alone. However, cluster heatmaps have known issues that make them both time-consuming to use and prone to error. We hypothesize that visualization techniques without the rigid grid constraint of cluster heatmaps will perform better at clustering-related tasks.
Results
We developed an approach to “unbox” the heatmap values and embed them directly in the hierarchical clustering results, allowing us to use standard hierarchical visualization techniques as alternatives to cluster heatmaps. We then tested our hypothesis by conducting a survey of 45 practitioners to determine how cluster heatmaps are used, prototyping alternatives to cluster heatmaps using pair analytics with a computational biologist, and evaluating those alternatives with hour-long interviews of 5 practitioners and an Amazon Mechanical Turk user study with approximately 200 participants. We found statistically significant performance differences for most clustering-related tasks, and in the number of perceived visual clusters. Visit git.io/vw0t3 for our results.
Conclusions
The optimal technique varied by task. However, gapmaps were preferred by the interviewed practitioners and outperformed or performed as well as cluster heatmaps for clustering-related tasks. Gapmaps are similar to cluster heatmaps, but relax the heatmap grid constraints by introducing gaps between rows and/or columns that are not closely clustered. Based on these results, we recommend users adopt gapmaps as an alternative to cluster heatmaps.
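To make the gapmap idea concrete, the following is a minimal sketch of how gaps between adjacent rows could be derived from a hierarchical clustering. It is an illustration under stated assumptions, not the authors' implementation: the nested-tuple dendrogram format, the function names `leaves_and_gaps` and `row_offsets`, and the rule that a gap grows linearly with merge height (`gap_scale`) are all hypothetical choices made for this example.

```python
def leaves_and_gaps(node):
    """Return (leaves, gaps) for a dendrogram in left-to-right order.

    Assumed format: a leaf is a string label; an internal node is a
    tuple (merge_height, left_subtree, right_subtree).
    gaps[i] is the merge height at which leaves[i] and leaves[i + 1]
    first join the same cluster.
    """
    if isinstance(node, str):
        return [node], []
    height, left, right = node
    left_leaves, left_gaps = leaves_and_gaps(left)
    right_leaves, right_gaps = leaves_and_gaps(right)
    # The last leaf of the left subtree and the first leaf of the right
    # subtree are joined at this node, so its height separates them.
    return left_leaves + right_leaves, left_gaps + [height] + right_gaps


def row_offsets(gaps, row_height=1.0, gap_scale=0.5):
    """Top offset of each row; the linear gap rule is an assumption."""
    offsets = [0.0]
    for gap in gaps:
        offsets.append(offsets[-1] + row_height + gap_scale * gap)
    return offsets


# Hypothetical dendrogram: (a, b) merge at 1.0, (c, d) at 2.0, all at 4.0.
tree = (4.0, (1.0, "a", "b"), (2.0, "c", "d"))
leaves, gaps = leaves_and_gaps(tree)
offsets = row_offsets(gaps)
print(leaves)   # ['a', 'b', 'c', 'd']
print(gaps)     # [1.0, 4.0, 2.0]
print(offsets)  # [0.0, 1.5, 4.5, 6.5]
```

In this sketch the widest gap falls between `b` and `c`, the two adjacent rows that are least closely clustered, while rows within a tight cluster stay nearly contiguous. Setting `gap_scale=0` recovers the evenly spaced grid of an ordinary cluster heatmap.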
Highlights
Cluster heatmaps are commonly used in biology and related fields to reveal hierarchical clusters in data matrices
Practitioner survey: We surveyed 45 practitioners in biology or related fields to understand how they use cluster heatmaps and determine the scope of experiments that would be useful to these practitioners
We conducted several qualitative and quantitative studies to test whether hierarchical visualization techniques without the rigid grid constraint of cluster heatmaps perform better at clustering-related tasks
Summary
Cluster heatmaps are commonly used in biology and related fields to reveal hierarchical clusters in data matrices. This visualization technique has high data density and reveals clusters better than unordered heatmaps alone. Heatmaps visualize a data matrix by drawing a rectangular grid corresponding to rows and columns in the matrix, and coloring the cells by their values in the data matrix. In their most basic form, heatmaps have been used for over a century [1]. Cluster heatmaps have high data density, allowing them to compact large amounts of information into a small space [2].