Data loss in wearable sensors is an inevitable problem that leads to misrepresentation during diabetes health monitoring. We systematically investigated missing wearable sensor data to gain causal insight into the mechanisms leading to missing data. We used two-week-long recordings from a continuous glucose monitor (CGM) and a Fitbit activity tracker recording heart rate (HR) and step count in free-living patients with type 2 diabetes mellitus. The gap size distribution was fitted with a Planck distribution to test for missing not at random (MNAR), and a difference between distributions was tested with a chi-squared test. Dispersion of missing data over time was tested for significance with the Kruskal-Wallis test and Dunn post hoc analysis. Data from 77 subjects resulted in 73 cleaned glucose, 70 HR and 68 step count recordings. The glucose gap sizes followed a Planck distribution. HR and step count gap frequency differed significantly (p < 0.001), and the missing data were therefore MNAR. In the glucose data, more data were missing at night (23:00-01:00), and in the step count data, more were missing on measurement days 6 and 7 (p < 0.001). In both cases, the missing data were caused by insufficiently frequent data synchronization. Our novel approach of investigating missing data statistics revealed the mechanisms for missing data in Fitbit and CGM data.
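The gap-size analysis described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the 300-second sampling interval in the usage note, the shift of the Planck distribution so that it starts at a gap size of one sample, and the pooling of large gaps into a final bin are all assumptions made for illustration. The sketch returns only the Pearson chi-squared statistic; converting it to a p-value would require a chi-squared distribution function from a statistics library.

```python
import math
from collections import Counter


def gap_sizes(timestamps, interval):
    """Return the number of missing samples in each gap.

    timestamps: sorted sample times (seconds); interval: nominal
    sampling period (seconds). Both are hypothetical inputs.
    """
    gaps = []
    for prev, cur in zip(timestamps, timestamps[1:]):
        missing = round((cur - prev) / interval) - 1
        if missing > 0:
            gaps.append(missing)
    return gaps


def planck_pmf(k, lam):
    """Planck (discrete exponential) pmf, shifted to start at k = 1 (assumption)."""
    return (1.0 - math.exp(-lam)) * math.exp(-lam * (k - 1))


def fit_planck(gaps):
    """Maximum-likelihood estimate of lambda for the shifted Planck pmf."""
    mean_excess = sum(g - 1 for g in gaps) / len(gaps)
    if mean_excess == 0:
        raise ValueError("all gaps have size 1; lambda is unbounded")
    return math.log(1.0 + 1.0 / mean_excess)


def chi_squared_statistic(gaps, lam, max_bin):
    """Pearson chi-squared statistic of observed gap counts against the fit.

    Gap sizes >= max_bin are pooled into one final bin.
    """
    n = len(gaps)
    counts = Counter(min(g, max_bin) for g in gaps)
    stat = 0.0
    for k in range(1, max_bin + 1):
        if k < max_bin:
            expected = n * planck_pmf(k, lam)
        else:
            # Tail probability P(K >= max_bin) of the shifted Planck pmf.
            expected = n * math.exp(-lam * (max_bin - 1))
        stat += (counts.get(k, 0) - expected) ** 2 / expected
    return stat
```

As a usage example, `gap_sizes([0, 300, 600, 1500, 1800, 2700], 300)` finds two gaps of two missing samples each, and `fit_planck` followed by `chi_squared_statistic` gives a goodness-of-fit statistic for a recording. A large statistic relative to the chi-squared reference distribution would indicate that the gap sizes deviate from the Planck form.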