Abstract

Medical doctors and researchers in bio-medicine are increasingly confronted with complex patient data, posing new and difficult analysis challenges. These data often comprise high-dimensional descriptions of patient conditions and measurements of the success of certain therapies. An important analysis task for such data is to compare and correlate patient conditions and therapy results along with combinations of dimensions. As the number of dimensions is often very large, one needs to map them to a smaller number of relevant dimensions that are more amenable to expert analysis. This is because irrelevant, redundant, and conflicting dimensions can negatively affect the effectiveness and efficiency of the analytic process (the so-called curse of dimensionality). However, the possible mappings from high- to low-dimensional spaces are ambiguous. For example, the similarity between patients may change when different combinations of relevant dimensions (subspaces) are considered. We demonstrate the potential of subspace analysis for the interpretation of high-dimensional medical data. Specifically, we present SubVIS, an interactive tool to visually explore subspace clusters from different perspectives, introduce a novel analysis workflow, and discuss future directions for high-dimensional (medical) data analysis and its visual exploration. We apply the presented workflow to a real-world dataset from the medical domain and show its usefulness with a domain expert evaluation.
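The claim that patient similarity depends on the chosen subspace can be illustrated with a minimal sketch. The patient names, the three measurements, and the helper functions below are hypothetical and are not taken from SubVIS or from the dataset used in the paper; the code only performs a brute-force nearest neighbor search over every axis-parallel subspace of a toy table.

```python
from itertools import combinations
from math import dist

# Hypothetical toy data: each patient has three made-up measurements.
patients = {
    "p0": (1.0, 1.0, 9.0),
    "p1": (1.1, 0.9, 1.0),
    "p2": (8.0, 7.5, 9.1),
}

def project(vec, dims):
    """Keep only the coordinates listed in dims (an axis-parallel subspace)."""
    return tuple(vec[d] for d in dims)

def nearest_neighbor(query, dims):
    """Return the patient closest to `query` by Euclidean distance in `dims`."""
    q = project(patients[query], dims)
    others = (name for name in patients if name != query)
    return min(others, key=lambda name: dist(q, project(patients[name], dims)))

def neighbors_by_subspace(query, n_dims=3):
    """Map every non-empty subset of dimensions to the query's nearest neighbor."""
    result = {}
    for size in range(1, n_dims + 1):
        for dims in combinations(range(n_dims), size):
            result[dims] = nearest_neighbor(query, dims)
    return result

neighbors = neighbors_by_subspace("p0")
for dims, name in neighbors.items():
    print(dims, "->", name)
```

In this toy table, p0 is closest to p1 when only the first two measurements are compared, but closest to p2 when only the third is compared. Enumerating all subspaces is exponential in the number of dimensions, which is one reason dedicated subspace analysis algorithms are used in practice.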

Highlights

  • Today, experts in medicine, biology, and the life sciences are confronted with increasingly large, complex, and high-dimensional data

  • We focus on the challenges stemming from the high dimensionality often encountered in biomedical datasets

  • Subspace analysis techniques, such as subspace clustering [5] or subspace nearest neighbor search [6], search for relevant patterns in different subspaces of the original data

Summary

Introduction

Experts in medicine, biology, and the life sciences are confronted with increasingly large, complex, and high-dimensional data. One of the grand future challenges of biomedical informatics research is to gain knowledge from complex high-dimensional datasets [2]. Within such data, relevant and interesting structural and/or temporal patterns ("knowledge") are often hidden and not accessible to domain experts. Subspace analysis techniques, such as subspace clustering [5] or subspace nearest neighbor search [6], search for relevant patterns in different subspaces of the original data. While these are useful tools, interpreting their results can be challenging for users, as the outcome may involve, e.g., large sets of subspace clusters, many of which contain redundant patterns or patterns that are not relevant for a specific analysis goal. We propose a Visual Analytics tool, SubVIS, to help explore biomedical patient data by combining Subspace analysis algorithms with interactive VISualization. It helps to answer questions such as: what does it mean if a dimension never occurs, or occurs very often, in different subspaces? Using SubVIS, we present a case study on a real-world immunization dataset, illustrating the benefits of

Background and related work
Cluster analysis
Dimension reduction and subspace analysis
Subspace clustering
Interactive and visual data exploration
Visualizing high-dimensional data
Visualization in bio-medicine and health and distinction of our approach
SubVIS—interactive tool to visually explore subspaces
General overview
Subspace filtering and recomputation
Heatmap
Aggregation table
Table lens
Considered dataset and analysis goals
Analysis perspectives
Data preprocessing
Full-space Experiment 1: clustering
Subspace analysis: initial experiments and results
Background
Subspace experiment 1: combined outcome
Subspace experiment 2: separate outcome
Subspace experiment 3: dimension refinement
Proposed subspace analysis workflow
Discussion
Conclusion and future outlook