Abstract

A critical challenge to using longitudinal wearable sensor biosignal data for healthcare applications and digital biomarker development is the exacerbation of the healthcare “data deluge,” leading to new data storage and organization challenges and costs. Data aggregation, sampling rate minimization, and effective data compression are all methods for consolidating wearable sensor data to reduce data volumes. There has been limited research on appropriate, effective, and efficient data compression methods for biosignal data. Here, we examine the application of different data compression pipelines built using combinations of algorithmic- and encoding-based methods to biosignal data from wearable sensors and explore how these implementations affect data recoverability and storage footprint. Algorithmic methods tested include singular value decomposition, the discrete cosine transform, and the biorthogonal discrete wavelet transform. Encoding methods tested include run-length encoding and Huffman encoding. We apply these methods to common wearable sensor data, including electrocardiogram (ECG), photoplethysmography (PPG), accelerometry, electrodermal activity (EDA), and skin temperature measurements. Of the methods examined in this study, and in line with the characteristics of the different data types, we recommend the following to maximize data recoverability after compression: direct data compression with Huffman encoding for ECG and PPG; singular value decomposition with Huffman encoding for EDA and accelerometry; and the biorthogonal discrete wavelet transform with Huffman encoding for skin temperature. We also report the best methods for maximizing the compression ratio. Finally, we develop and document open-source code and data for each compression method tested here, which can be accessed through the Digital Biomarker Discovery Pipeline as the “Biosignal Data Compression Toolbox,” an open-source, accessible software platform for compressing biosignal data.
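To make the two encoding stages named above concrete, the following is a minimal sketch of run-length encoding and Huffman encoding applied to integer-quantized samples. It is not the Biosignal Data Compression Toolbox code: the function names, the toy signal, and the assumption of 16-bit raw samples are all illustrative, and the size estimate ignores the cost of storing the Huffman code table.

```python
import heapq
from collections import Counter
from itertools import groupby


def run_length_encode(samples):
    """Collapse runs of equal values into (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(samples)]


def run_length_decode(pairs):
    """Invert run_length_encode."""
    return [value for value, count in pairs for _ in range(count)]


def huffman_code(symbols):
    """Build a prefix code (symbol -> bit string) from symbol frequencies."""
    freq = Counter(symbols)
    if len(freq) == 1:  # degenerate case: only one distinct symbol
        return {next(iter(freq)): "0"}
    # Heap entries are (weight, tie_breaker, {symbol: partial_code}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


def huffman_encode(symbols, code):
    """Concatenate the code words of all symbols into one bit string."""
    return "".join(code[s] for s in symbols)


def huffman_decode(bits, code):
    """Decode a bit string produced by huffman_encode with the same code."""
    inverse = {c: s for s, c in code.items()}
    out, buffer = [], ""
    for bit in bits:
        buffer += bit
        if buffer in inverse:
            out.append(inverse[buffer])
            buffer = ""
    return out


# Hypothetical slowly varying signal (e.g. temperature in tenths of a degree).
signal = [330] * 40 + [331] * 25 + [330] * 30 + [332] * 5

runs = run_length_encode(signal)
code = huffman_code(signal)
bits = huffman_encode(signal, code)

# Assumption: each raw sample occupies 16 bits; code-table overhead is ignored.
ratio = (16 * len(signal)) / len(bits)
print(f"{len(runs)} runs, {len(bits)} Huffman bits, ratio ~ {ratio:.1f}")
```

Both encoders are lossless, so on their own they leave the signal fully recoverable. In the pipelines described above, any loss would come from the algorithmic stage that precedes them.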

Highlights

  • Wearable sensors have the potential to transform health management and healthcare delivery

  • We evaluated the compression ratio and percentage root-mean-square difference of each pipeline applied to common wearable sensor biosignal data, including ECG, PPG, accelerometry (ACC), electrodermal activity (EDA), and skin temperature (TEMP) (Figure 1)

  • We explored the tradeoff between the percentage root-mean-square difference (PRD) and compression ratio (CR) across the five data compression pipelines applied to five different data types with the goal of minimizing the PRD while maximizing the CR (Figure 2)
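The tradeoff between PRD and CR can be illustrated with a small sketch. It assumes one common definition of PRD, 100 × sqrt(Σ(x − x̂)² / Σx²), which may differ from the paper's exact normalization. It also uses a deliberately naive CR, the number of samples divided by the number of retained coefficients, which ignores the cost of storing coefficient positions. The discrete cosine transform is implemented directly, and the test signal and function names are hypothetical.

```python
import math


def dct(x):
    """Orthonormal DCT-II of a sequence (O(N^2), fine for short windows)."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def idct(c):
    """Inverse of dct() above (orthonormal DCT-III)."""
    n = len(c)
    out = []
    for i in range(n):
        s = c[0] * math.sqrt(1.0 / n)
        s += sum(
            c[k] * math.sqrt(2.0 / n) * math.cos(math.pi * (i + 0.5) * k / n)
            for k in range(1, n)
        )
        out.append(s)
    return out


def keep_largest(coeffs, m):
    """Zero all but the m largest-magnitude coefficients."""
    order = sorted(range(len(coeffs)), key=lambda k: abs(coeffs[k]), reverse=True)
    keep = set(order[:m])
    return [c if k in keep else 0.0 for k, c in enumerate(coeffs)]


def prd(original, reconstructed):
    """Percentage root-mean-square difference (assumed definition)."""
    num = sum((a - b) ** 2 for a, b in zip(original, reconstructed))
    den = sum(a ** 2 for a in original)
    return 100.0 * math.sqrt(num / den)


def compression_ratio(original_size, compressed_size):
    """Ratio of original to compressed size, in the same units."""
    return original_size / compressed_size


# Hypothetical test signal: two sinusoids sampled at 128 points.
n = 128
signal = [
    math.sin(2 * math.pi * 3 * i / n) + 0.3 * math.sin(2 * math.pi * 7 * i / n)
    for i in range(n)
]
coeffs = dct(signal)

for m in (4, 16, 64):
    approx = idct(keep_largest(coeffs, m))
    print(f"kept={m:3d}  CR={compression_ratio(n, m):5.1f}  PRD={prd(signal, approx):.3f}%")
```

Retaining fewer coefficients raises the CR and, because the transform is orthonormal, can only raise or hold the PRD. That opposing movement is the tradeoff the highlight describes.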


Summary

Introduction

Wearable sensors have the potential to transform health management and healthcare delivery. The digital footprint of these biosignal data is growing at an unprecedented rate; the number of connected wearable devices worldwide is expected to reach. Immense data storage capacity is necessary to retain information collected from wearable sensors that continuously monitor multiple biosignals. This “data deluge,” combined with the high costs of data storage and challenges associated with efficient data organization [2,3,4], reveals a critical need to determine how to reduce biosignal data volumes appropriately to retain important information while removing unnecessary or repetitive information [5]. Digital biomarkers are digitally collected data (e.g., a heart rate biosignal from a wrist-worn wearable) that are transformed into indicators of health outcomes (e.g., risk of cardiovascular disease). Digital biomarkers have applications in a number of disease states, including movement-related disorders [6], breast cancer [7], and Alzheimer’s disease [8].

