Abstract

The paucity of physiological time-series data collected in low-resource clinical settings limits the performance that modern machine learning algorithms can achieve. Performance is further hindered by class imbalance, where one diagnosis is much more common than the others in a dataset. Data augmentation can address both issues at low cost while preserving privacy. In the time domain, however, the traditional method of time-warping can alter the underlying data distribution with detrimental consequences, particularly for physiological conditions that influence the frequency components of the data. In this paper, we propose PlethAugment: three different conditional generative adversarial networks (CGANs) with an adapted diversity term for generating pathological photoplethysmogram (PPG) signals in order to boost medical classification performance. To evaluate and compare the GANs, we introduce a novel metric-agnostic method, the synthetic generalization curve. We validate this approach on two proprietary and two public datasets representing a diverse set of medical conditions. Compared to training on non-augmented class-balanced datasets, training on augmented datasets improves the AUROC by up to 29% under cross-validation. This illustrates the potential of the proposed CGANs to significantly improve classification performance.
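The concern about time-warping and frequency content can be illustrated with a toy example. The sketch below is not taken from the paper: it assumes a uniform stretch as the warp, a pure sinusoid standing in for a PPG pulse, and a hypothetical 100 Hz sampling rate. It only shows that stretching a signal in time moves its dominant spectral peak.

```python
import numpy as np

def dominant_frequency(signal, fs):
    """Return the frequency (Hz) of the largest spectral peak after mean removal."""
    spectrum = np.abs(np.fft.rfft(signal - signal.mean()))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    return freqs[np.argmax(spectrum)]

def time_warp(signal, factor):
    """Uniformly stretch (factor > 1) or compress (factor < 1) a signal,
    resampling it onto the original number of samples."""
    n = len(signal)
    original_positions = np.arange(n)
    # Positions in the original signal that the warped signal reads from.
    warped_positions = np.linspace(0, (n - 1) / factor, n)
    return np.interp(warped_positions, original_positions, signal)

fs = 100.0                               # hypothetical sampling rate (Hz)
t = np.arange(0, 10, 1.0 / fs)           # 10 s of samples
ppg_like = np.sin(2 * np.pi * 1.2 * t)   # toy 1.2 Hz "pulse", not real PPG

warped = time_warp(ppg_like, factor=1.5)
f_orig = dominant_frequency(ppg_like, fs)
f_warp = dominant_frequency(warped, fs)
print(f"original peak: {f_orig:.2f} Hz, warped peak: {f_warp:.2f} Hz")
```

If a diagnosis is characterized by a shift in pulse rate or other spectral features, an augmentation that moves the peak in this way could produce samples that no longer match their class label.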

Highlights

  • Paucity of data and class imbalance drastically hinder the performance of modern machine learning algorithms [1], [2]

  • We propose the use of cMMD values to discern interclass differences. This granular approach helps identify potential causal relationships between changes to the network or hyperparameters and the representativeness of the synthetic data, which can guide researchers working with conditional generative adversarial networks (GANs)

  • Insufficient and class-imbalanced medical time-series data can limit the potential of clinical decision support algorithms
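The highlights do not define cMMD. The sketch below assumes it denotes a maximum mean discrepancy (MMD) computed separately for each class between real and synthetic samples, using an RBF kernel and the biased estimator. The function names, the kernel choice, the bandwidth, and the toy Gaussian data are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def rbf_kernel(a, b, bandwidth):
    """Pairwise RBF kernel matrix between the rows of a and the rows of b."""
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    return np.exp(-sq_dists / (2.0 * bandwidth**2))

def mmd_squared(x, y, bandwidth=1.0):
    """Biased estimate of the squared MMD between sample sets x and y."""
    return (
        rbf_kernel(x, x, bandwidth).mean()
        + rbf_kernel(y, y, bandwidth).mean()
        - 2.0 * rbf_kernel(x, y, bandwidth).mean()
    )

def class_conditional_mmd(real, real_labels, synth, synth_labels, bandwidth=1.0):
    """Squared MMD per class label (assumed meaning of cMMD).
    Every class in real_labels is assumed to appear in synth_labels too."""
    return {
        int(c): mmd_squared(real[real_labels == c], synth[synth_labels == c], bandwidth)
        for c in np.unique(real_labels)
    }

# Toy data: class 0 is well matched by the generator, class 1 is not.
rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 200)
real = np.vstack([rng.normal(0, 1, (200, 8)), rng.normal(3, 1, (200, 8))])
synth = np.vstack([rng.normal(0, 1, (200, 8)), rng.normal(0, 1, (200, 8))])

scores = class_conditional_mmd(real, labels, synth, labels, bandwidth=4.0)
print(scores)
```

A single MMD over all classes would blend the two discrepancies together. Reporting one value per class shows which class the generator represents poorly, which is the kind of interclass difference the highlight refers to.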

Introduction

Paucity of data and class imbalance drastically hinder the performance of modern machine learning algorithms [1], [2]. The relatively low number of patients enrolled in experimental trials, among other reasons, limits the amount of data collected. This is even more pronounced in low-resource clinical settings, where severe financial and infrastructural constraints exist. To overcome this obstacle, wearable sensors capable of continuously monitoring physiological signals such as the photoplethysmogram (PPG) have seen increasing use [3]. Generating class-specific medical time-series data may help alleviate some of these obstacles.
