VERONICA: Visual Analytics for Identifying Feature Groups in Disease Classification

Neda Rostamzadeh, Eric Mcarthur, Sheikh S Abdullah, Kamran Sedig, Amit X Garg

doi:10.3390/info12090344

Abstract

The use of data analysis techniques in electronic health records (EHRs) offers great promise in improving predictive risk modeling. Although useful, these analysis techniques often suffer from a lack of interpretability and transparency, especially when the data is high-dimensional. The emergence of a type of computational system known as visual analytics has the potential to address these issues by integrating data analysis techniques with interactive visualizations. This paper introduces a visual analytics system called VERONICA that utilizes the natural classification of features in EHRs to identify the group of features with the strongest predictive power. VERONICA incorporates a representative set of supervised machine learning techniques (classification and regression tree, C5.0, random forest, support vector machines, and naive Bayes) to support users in developing predictive models using EHRs. It then makes the analytics results accessible through an interactive visual interface. By integrating different sampling strategies, analytics algorithms, visualization techniques, and human-data interaction, VERONICA assists users in comparing prediction models in a systematic way. To demonstrate the usefulness of our proposed system, we use the clinical dataset stored at ICES to identify the best representative feature groups in detecting patients who are at high risk of developing acute kidney injury.

Highlights

A key component of precision medicine is to determine a person’s individualized estimates of different health outcomes, which guides therapy to increase the chance of long-term good health
The Analytics module utilizes the group structure of features stored in electronic health records (EHRs) to identify the subset of feature groups that best represent the data in the prediction of Acute Kidney Injury (AKI)
The Interactive Visualization module turns the results of the Analytics module into interactive visual representations to assist users in exploring the results

Summary

Introduction

A key component of precision medicine is to determine a person’s individualized estimates of different health outcomes, which guides therapy to increase the chance of long-term good health. Most existing studies use unsupervised learning techniques such as principal component analysis [6], K-means [7,8], and hierarchical clustering [9] to find the best representative group of features in high-dimensional EHRs [10–18]. Although these unsupervised techniques have shown promise in managing high-dimensional data, to the best of our knowledge this problem has not been studied thoroughly using supervised techniques [19,20]. To identify the subset with the most substantial predictive power, VERONICA considers every possible subset of groups (i.e., groups of features) and applies several supervised learning techniques to each subset.
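The exhaustive search over group subsets described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not VERONICA's implementation: the group names, feature names, and the two stand-in scoring functions are hypothetical, and in practice each scorer would train one of the supervised techniques on the selected columns and return a validation metric.

```python
from itertools import combinations
from typing import Callable, Dict, Iterator, List, Sequence, Tuple

Scorer = Callable[[List[str]], float]


def enumerate_group_subsets(groups: Sequence[str]) -> Iterator[Tuple[str, ...]]:
    """Yield every non-empty subset of feature groups, smallest subsets first."""
    for size in range(1, len(groups) + 1):
        yield from combinations(groups, size)


def rank_group_subsets(
    feature_groups: Dict[str, List[str]],
    models: Dict[str, Scorer],
) -> List[Tuple[Tuple[str, ...], str, float]]:
    """Score each (subset, model) pair and return them sorted best-first."""
    results = []
    for subset in enumerate_group_subsets(list(feature_groups)):
        # Flatten the chosen groups into the list of feature columns to use.
        columns = [col for group in subset for col in feature_groups[group]]
        for model_name, score_fn in models.items():
            results.append((subset, model_name, score_fn(columns)))
    results.sort(key=lambda row: row[2], reverse=True)
    return results


# Hypothetical groups and features, chosen only for illustration.
feature_groups = {
    "demographics": ["age", "sex"],
    "comorbidities": ["diabetes", "hypertension"],
    "medications": ["nsaid"],
}

# Stand-in scorers (assumption): real ones would fit a classifier on
# `columns` and return a validation metric for that model.
models = {
    "count_scorer": lambda columns: float(len(columns)),
    "half_scorer": lambda columns: len(columns) / 2,
}

ranked = rank_group_subsets(feature_groups, models)
best_subset, best_model, best_score = ranked[0]
print(best_subset, best_model, best_score)
```

With n feature groups there are 2^n - 1 non-empty subsets, so the number of models to fit grows exponentially with the number of groups. An exhaustive search of this kind is only practical when the number of groups is small.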

Background

Visual Analytics

Analytics Module

Interactive Visualization Module

Decision Tree

Support Vector Machines

Naive Bayes

Class Imbalance Problem

Related Work

Design Process and Participants

Data Sources

Cohort Entry Criteria

Response Variable

Input Features

Implementation Details

Workflow

The Design of VERONICA

Analytics

Limitations

Conclusion and Future Work

Discharge Abstract

Findings

Journal: Information
Publication Date: Aug 26, 2021
Citations: 7
License type: CC BY 4.0
