Abstract
Background: Though cluster analysis has become a routine analytic task for bioinformatics research, it is still arduous for researchers to assess the quality of a clustering result. To select the best clustering method and its parameters for a dataset, researchers have to run multiple clustering algorithms and compare them. However, such a comparison task with multiple clustering results is cognitively demanding and laborious.
Results: In this paper, we present XCluSim, a visual analytics tool that enables users to interactively compare multiple clustering results based on the Visual Information Seeking Mantra. We build a taxonomy for categorizing existing techniques of clustering results visualization in terms of the Gestalt principles of grouping. Using the taxonomy, we choose the most appropriate interactive visualizations for presenting individual clustering results from different types of clustering algorithms. The efficacy of XCluSim is shown through case studies with a bioinformatician.
Conclusions: Compared to other relevant tools, XCluSim enables users to compare multiple clustering results in a more scalable manner. Moreover, XCluSim supports diverse clustering algorithms and dedicated visualizations and interactions for different types of clustering results, allowing more effective exploration of details on demand. Through case studies with a bioinformatics researcher, we received positive feedback on the functionalities of XCluSim, including its ability to help identify stably clustered items across multiple clustering results.
Highlights
Though cluster analysis has become a routine analytic task for bioinformatics research, it is still arduous for researchers to assess the quality of a clustering result
We present XCluSim, a visual analytics tool that enables users to interactively compare multiple clustering results and explore individual clustering results using dedicated visualizations
Much prior work on the visual comparison of multiple clustering results has employed these techniques [2,5,6,7,8,9,10,11], but we focus our discussion on the work most relevant to ours, namely approaches that use ribbon-like bands to represent concordance/discordance among multiple clustering results
Summary
Though cluster analysis has become a routine analytic task for bioinformatics research, it is still arduous for researchers to assess the quality of a clustering result. Since there is no generally accepted objective metric for selecting the best clustering method and its parameters for a given dataset, researchers often have to run multiple clustering algorithms and compare the different results while examining the concordance/discordance among them. Such a comparison task with multiple clustering results for a large dataset is cognitively demanding and laborious.
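To make the notion of concordance/discordance among clustering results concrete, the following is a minimal Python sketch, not XCluSim's own implementation. It assumes each clustering result is given as a list of cluster labels, one per item; the function names `rand_index` and `stable_groups` and the sample label lists are hypothetical and chosen only for illustration. It computes a pairwise agreement score (the Rand index) and lists groups of items that stay together in every result, which is one simple reading of "stably clustered items".

```python
from collections import defaultdict
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of item pairs on which two clusterings agree.

    A pair counts as an agreement when both clusterings put the two
    items in the same cluster, or both put them in different clusters.
    """
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:  # fewer than two items: nothing to disagree on
        return 1.0
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


def stable_groups(clusterings):
    """Groups of item indices that share a cluster in every result."""
    by_signature = defaultdict(list)
    # An item's signature is its tuple of labels across all results;
    # items with equal signatures are co-clustered everywhere.
    for item, signature in enumerate(zip(*clusterings)):
        by_signature[signature].append(item)
    return [items for items in by_signature.values() if len(items) > 1]


# Hypothetical label lists for six items from three clustering runs.
result_a = [0, 0, 0, 1, 1, 1]
result_b = [1, 1, 0, 0, 2, 2]
result_c = [0, 0, 1, 1, 1, 1]

print(rand_index(result_a, result_b))
print(stable_groups([result_a, result_b, result_c]))
```

Cluster labels are arbitrary per run, so the sketch compares only whether pairs of items are grouped together, never the label values themselves. Enumerating all pairs is quadratic in the number of items, which hints at why manual comparison across many results for a large dataset becomes laborious.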