Abstract

In this paper, we propose FilterK, a new algorithm that improves the purity of physical activity clusters derived with k-means by reducing the influence of outliers. We applied it to physical activity data recorded with body-worn accelerometers and clustered using k-means. Using the ground truth (class labels), we compared its performance with three existing outlier detection methods, Local Outlier Factor, Isolation Forests and KNN, in terms of average cluster and event purity (ACEP). FilterK provided comparable gains in ACEP (0.581→0.596, compared with 0.580-0.617) whilst removing fewer outliers than the other methods (4% of the total dataset size, versus 10% to achieve this ACEP). The main aim of our outlier detection method is to improve the cluster purity of physical activity accelerometer data, but we suggest it could also be applied to other types of dataset clustered with k-means. We demonstrate our method using a k-means model trained on two independent accelerometer datasets (training n=90) and re-applied to an independent dataset (test n=41). Labelled physical activities include lying down, sitting, standing, household chores, walking (laboratory and non-laboratory based), stairs and running. This type of clustering approach could be used to assist with identifying optimal physical activity patterns for health.
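To make the evaluation measure concrete, the following minimal sketch computes a purity score from cluster assignments and ground-truth labels. It rests on assumptions, because the abstract does not define ACEP formally. Cluster purity is taken as the share of each cluster held by its most common activity label, event purity as the share of each activity label falling in its most common cluster, and ACEP as the unweighted mean of the two averages. The function names and toy data are hypothetical, and the example does not implement FilterK itself; it only shows how removing one point changes the score.

```python
from collections import Counter, defaultdict

def average_purity(groups, members):
    """Mean, over groups, of the share held by each group's most common member."""
    counts = defaultdict(Counter)
    for g, m in zip(groups, members):
        counts[g][m] += 1
    shares = [max(c.values()) / sum(c.values()) for c in counts.values()]
    return sum(shares) / len(shares)

def acep(clusters, labels):
    """Assumed ACEP: unweighted mean of average cluster purity and average event purity."""
    cluster_purity = average_purity(clusters, labels)  # dominant label within each cluster
    event_purity = average_purity(labels, clusters)    # dominant cluster for each label
    return (cluster_purity + event_purity) / 2

# Toy data (hypothetical): cluster ids and activity labels for eight samples.
clusters = [0, 0, 0, 1, 1, 1, 1, 2]
labels = ["sit", "sit", "stand", "walk", "walk", "walk", "run", "run"]

before = acep(clusters, labels)

# Drop the sample at index 2, treated here as an outlier purely for illustration.
keep = [i for i in range(len(labels)) if i != 2]
after = acep([clusters[i] for i in keep], [labels[i] for i in keep])

print(f"ACEP before: {before:.3f}, after: {after:.3f}")
```

In this toy case the score rises after the removal, which mirrors the kind of comparison reported above: the change in purity is weighed against the fraction of the dataset that had to be discarded.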
