Abstract

Background: There is increasing interest in reusing person-generated wearable device data for research purposes, which raises concerns about data quality. However, the amount of literature on data quality challenges, specifically those for person-generated wearable device data, is sparse.

Objective: This study aims to systematically review the literature on factors affecting the quality of person-generated wearable device data and their associated intrinsic data quality challenges for research.

Methods: The literature was searched in the PubMed, Association for Computing Machinery, Institute of Electrical and Electronics Engineers, and Google Scholar databases by using search terms related to wearable devices and data quality. By using PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines, studies were reviewed to identify factors affecting the quality of wearable device data. Studies were eligible if they included content on the data quality of wearable devices, such as fitness trackers and sleep monitors. Both research-grade and consumer-grade wearable devices were included in the review. Relevant content was annotated and iteratively categorized into semantically similar factors until a consensus was reached. If any data quality challenges were mentioned in the study, that content was extracted and categorized as well.

Results: A total of 19 papers were included in this review. We identified three high-level factors that affect data quality: device- and technical-related factors, user-related factors, and data governance-related factors. Device- and technical-related factors include problems with hardware, software, and the connectivity of the device; user-related factors include device nonwear and user error; and data governance-related factors include a lack of standardization. The identified factors can potentially lead to intrinsic data quality challenges, such as incomplete, incorrect, and heterogeneous data. Although missing and incorrect data are widely known data quality challenges for wearable devices, the heterogeneity of data is another aspect of data quality that should be considered for wearable devices. Heterogeneity in wearable device data exists at three levels: heterogeneity in data generated by a single person using a single device (within-person heterogeneity); heterogeneity in data generated by multiple people who use the same brand, model, and version of a device (between-person heterogeneity); and heterogeneity in data generated from multiple people using different devices (between-person heterogeneity), which would apply especially to data collected under a bring-your-own-device policy.

Conclusions: Our study identifies potential intrinsic data quality challenges that could occur when analyzing wearable device data for research and three major contributing factors for these challenges. As poor data quality can compromise the reliability and accuracy of research results, further investigation is needed on how to address the data quality challenges of wearable devices.

Highlights

  • With the recent movement toward people (patient)-centered care and the widespread routine use of devices/technologies, person-generated health data (PGHD) have emerged as a promising data source for biomedical research [1].

  • One barrier to using wearable device data for research purposes is the lack of studies investigating the data quality challenges of such data.

Summary

Introduction

With the recent movement toward people (patient)-centered care and the widespread routine use of devices/technologies, person-generated health data (PGHD) have emerged as a promising data source for biomedical research [1]. Among the different PGHD, data generated through wearable devices are unique in that they are passively, continuously, and objectively collected in free-living conditions; such data differ from those generated through other technologies that require the manual input of data (eg, dietary tracking mobile apps) [4,5,6,7]. Person-generated wearable device data are becoming a valuable resource for biomedical researchers, offering a more comprehensive picture of the health of individuals and populations. There is increasing interest in reusing person-generated wearable device data for research purposes, which raises concerns about data quality. However, the literature on data quality challenges, specifically those for person-generated wearable device data, is sparse.
