Abstract

Cuff-less blood pressure (BP) estimation from photoplethysmogram (PPG) or electrocardiogram (ECG) signals using deep learning models has been actively studied in recent years. However, we found that most previous studies suffer from data leakage: segments or records measured from the same subject appear in both the training and test datasets. Furthermore, many previous studies appear to have misinterpreted a record in the public dataset used for their evaluations as a subject. To investigate data leakage in BP estimation methods, this paper first organizes previous studies in terms of data leakage. We then quantitatively evaluate the effect of data leakage caused by segment-level and record-level train-test splits using the public Cuff-Less Blood Pressure Estimation Data Set. Our experimental results show that, relative to the (quasi-)subject-level split, the segment-level and record-level splits erroneously improved the Pearson's correlation coefficient for mean blood pressure estimation by 0.56 and 0.40 when using PPG, and by 0.82 and 0.69 when using ECG, respectively. These results confirm that the train-test split used in many previous studies, including one that described its evaluation as free of data leakage, causes a high level of data leakage. They also confirm that a record in the Cuff-Less Blood Pressure Estimation Data Set, often misinterpreted as a subject in previous studies, is not a subject, and that treating a record as a subject causes a high level of data leakage.
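The three split granularities discussed in the abstract can be illustrated with a minimal sketch. This is not the paper's code: the toy data, the field names `subject`, `record`, and `segment`, and the helper functions are hypothetical, and the sketch assumes true subject identifiers are available, which may not hold for the public dataset (hence the paper's "(quasi-)subject-level" wording). It only shows why splitting by segment or by record can place the same subject on both sides of the split, while splitting by subject cannot.

```python
import random
from collections import defaultdict


def segment_level_split(segments, test_ratio=0.2, seed=0):
    """Shuffle individual segments, ignoring which subject they came from."""
    rng = random.Random(seed)
    shuffled = list(segments)
    rng.shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]


def group_level_split(segments, key, test_ratio=0.2, seed=0):
    """Assign whole groups (all segments sharing `key`) to one side only."""
    groups = defaultdict(list)
    for seg in segments:
        groups[seg[key]].append(seg)
    ids = sorted(groups)
    random.Random(seed).shuffle(ids)
    n_test = max(1, round(len(ids) * test_ratio))
    train = [s for g in ids[n_test:] for s in groups[g]]
    test = [s for g in ids[:n_test] for s in groups[g]]
    return train, test


def shared_ids(train, test, key):
    """Return the values of `key` that occur in both train and test."""
    return {s[key] for s in train} & {s[key] for s in test}


# Hypothetical toy data: 4 subjects, 3 records each, 10 segments per record.
segments = [
    {"subject": s, "record": f"{s}-{r}", "segment": i}
    for s in range(4) for r in range(3) for i in range(10)
]

seg_train, seg_test = segment_level_split(segments)
rec_train, rec_test = group_level_split(segments, key="record")
sub_train, sub_test = group_level_split(segments, key="subject")

# Subjects appearing on both sides indicate subject-level leakage.
print("segment-level leak:", bool(shared_ids(seg_train, seg_test, "subject")))
print("record-level leak: ", bool(shared_ids(rec_train, rec_test, "subject")))
print("subject-level leak:", bool(shared_ids(sub_train, sub_test, "subject")))
```

In this toy setup the record-level split keeps records disjoint, yet the held-out records still belong to subjects whose other records are in the training set. That is the situation the abstract describes when a record is treated as if it were a subject.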
